CDP Private Cloud Data Services Components
A CDP Private Cloud Data Services deployment comprises an environment, a Data Lake, the Management Console, and Data Services including Data Warehouse, Machine Learning, Data Engineering, and Replication Manager.
- Environment
- A logical entity that represents the association of your Private Cloud user account with the compute resources that you use to provision and manage workloads such as Data Warehouse and Machine Learning.
- Data Lake
- Creates a protective ring of security and governance around your data. For a CDP Private Cloud deployment, the Data Lake services are hosted on the CDP Private Cloud Base cluster. In addition, the Data Lake services are shared between multiple workloads.
- Management Console
- A service for administering CDP. As a CDP administrator, you can use Management Console for managing environments, data lakes, environment resources, and users across all Private Cloud Data Services.
- Data Warehouse
- This data service enables an enterprise to provision a new data warehouse and share a subset of the data with a specific team or department. A Data Warehouse cluster can be created from the Management Console and accessed by end users (data analysts).
- Machine Learning
- This data service enables teams of data scientists to develop, test, train, and ultimately deploy machine learning models for building predictive applications, all on the data under management within the enterprise data cloud. A Machine Learning workspace can be created from the Management Console and accessed by end users (data scientists).
- Data Engineering
- This data service allows you to create, manage, and schedule Apache Spark jobs without the overhead of creating and maintaining Spark clusters. You can define virtual clusters with a range of CPU and memory resources, and the cluster scales up and down as needed to execute your Spark workloads, helping to control your cloud costs.
- Replication Manager
- You can use the Replication Manager service to copy and migrate data (HDFS and Hive external tables) between CDP Private Cloud Base and CDP Private Cloud Data Services clusters using the Embedded Container Service (ECS).
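
The relationships described above can be sketched as a small data model: an environment ties compute resources to workloads, and one Data Lake hosted on the Base cluster is shared by every workload provisioned in it. This is an illustrative sketch only. The class names, field names, and the provision method are hypothetical and are not part of any CDP API or CLI.

```python
from dataclasses import dataclass, field
from typing import List

# All names below are hypothetical, chosen only to mirror the component list.


@dataclass(frozen=True)
class DataLake:
    """Security and governance services hosted on the Base cluster."""
    base_cluster: str


@dataclass
class Environment:
    """Associates compute resources with the workloads provisioned on them."""
    name: str
    data_lake: DataLake
    workloads: List[str] = field(default_factory=list)

    def provision(self, data_service: str, workload_name: str) -> str:
        # Record the workload; every workload shares the same Data Lake.
        label = f"{data_service}:{workload_name}"
        self.workloads.append(label)
        return label


lake = DataLake(base_cluster="base-cluster-1")
env = Environment(name="env-a", data_lake=lake)
env.provision("Data Warehouse", "finance-dw")
env.provision("Machine Learning", "ds-workspace")
```

In this sketch both workloads reach the same DataLake instance through the environment, which mirrors the statement that the Data Lake services are shared between multiple workloads.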