CDH Overview

CDH is the most complete, tested, and popular distribution of Apache Hadoop and related projects. CDH delivers the core elements of Hadoop – scalable storage and distributed computing – along with a web-based user interface and vital enterprise capabilities. CDH is open source under the Apache license and is the only Hadoop solution to offer unified batch processing, interactive SQL, interactive search, and role-based access controls.

CDH provides:
  • Flexibility—Store any type of data and manipulate it with a variety of different computation frameworks including batch processing, interactive SQL, free text search, machine learning and statistical computation.
  • Integration—Get up and running quickly on a complete Hadoop platform that works with a broad range of hardware and software solutions.
  • Security—Process and control sensitive data.
  • Scalability—Enable a broad range of applications and scale and extend them to suit your requirements.
  • High availability—Perform mission-critical business tasks with confidence.
  • Compatibility—Leverage your existing IT infrastructure and investment.

Detailed information about the individual CDH components is outside the scope of Cloudera documentation. To learn more about them, see the links in External Documentation.