What is Private Cloud Data Services

Deployment and high-level architecture

A Cloudera Data Services on premises deployment includes an Environment, a Data Lake, the Management Console, and Data Services (Data Warehouse, Machine Learning, Data Engineering). Other tools and utilities include Replication Manager, Data Recovery Service, CDP CLI, and monitoring using Grafana.

To deploy Cloudera Data Services on premises, you need a Cloudera Base on premises cluster, along with container-based clusters that run the Data Services. You can either use a dedicated Red Hat OpenShift container cluster or deploy a Cloudera Embedded Container Service container cluster.
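
The prerequisites described above can be summarized as a small data model. The following Python sketch is purely illustrative: the names `ContainerPlatform`, `DeploymentPlan`, and `missing_prerequisites` are hypothetical and do not correspond to any Cloudera API or CLI command.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ContainerPlatform(Enum):
    """The two container cluster options named in the text."""
    OPENSHIFT = "Red Hat OpenShift (dedicated)"
    ECS = "Cloudera Embedded Container Service"


@dataclass
class DeploymentPlan:
    """Hypothetical description of a planned deployment (not a Cloudera API)."""
    has_base_cluster: bool
    container_platform: Optional[ContainerPlatform] = None

    def missing_prerequisites(self) -> List[str]:
        """Return the unmet prerequisites; an empty list means none are missing."""
        missing = []
        if not self.has_base_cluster:
            missing.append("Cloudera Base on premises cluster")
        if self.container_platform is None:
            missing.append("container cluster (OpenShift or ECS)")
        return missing


# Example: a plan with a Base cluster and an ECS container cluster.
plan = DeploymentPlan(has_base_cluster=True,
                      container_platform=ContainerPlatform.ECS)
print(plan.missing_prerequisites())  # prints []
```

A check like this only captures the two requirements stated here; an actual installation has additional hardware, network, and software requirements that this sketch does not model.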

The on premises deployment process involves configuring Cloudera Management Console, registering an environment by providing details of the Data Lake configured on the Base cluster, and then creating the workloads.
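
The ordering of these steps can be sketched as a simple tracker that refuses to run a step before its predecessor. This is a minimal illustration under stated assumptions: `Step` and `DeploymentTracker` are hypothetical names, and the code does not call the Management Console or the CDP CLI.

```python
from enum import IntEnum
from typing import List


class Step(IntEnum):
    """Deployment steps in the order the text describes them."""
    CONFIGURE_MANAGEMENT_CONSOLE = 1
    REGISTER_ENVIRONMENT = 2  # requires details of the Data Lake on the Base cluster
    CREATE_WORKLOADS = 3


class DeploymentTracker:
    """Hypothetical helper that only lets steps complete in order."""

    def __init__(self) -> None:
        self.completed: List[Step] = []

    def is_done(self) -> bool:
        """True once every step has been completed."""
        return len(self.completed) == len(Step)

    def complete(self, step: Step) -> None:
        """Mark a step as completed, rejecting out-of-order steps."""
        if self.is_done():
            raise ValueError("all steps are already completed")
        expected = Step(len(self.completed) + 1)
        if step != expected:
            raise ValueError(f"expected {expected.name}, got {step.name}")
        self.completed.append(step)


# Example: walk through the steps in the documented order.
tracker = DeploymentTracker()
for step in Step:
    tracker.complete(step)
print(tracker.is_done())  # prints True
```

Attempting `complete(Step.CREATE_WORKLOADS)` on a fresh tracker raises `ValueError`, which mirrors the dependency in the text: workloads are created only after an environment has been registered.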

Platform Managers and Administrators can rapidly provision and deploy the data services through the Cloudera Management Console, and easily scale them up or down as required.

[Image: components of CDP Private Cloud Base and Data Services]

You can install Cloudera Base on premises on virtual machines or bare-metal hardware. Cloudera Base on premises provides the following components and services that are used by Cloudera Data Services on premises:
  • SDX Data Lake cluster for security, metadata, and governance
  • HDFS and Ozone for storage
  • Open-source Cloudera Runtime services such as Ranger, Atlas, and Hive Metastore (HMS)
  • Networking infrastructure that supports network traffic between storage and compute environments