Cloudera Data Services on premises glossary

The Cloudera Data Services on premises documentation uses terms related to enterprise data cloud and cloud computing.

CDP CLI:
A command-line interface to access and manage Cloudera services and resources.
Cloudera Base on premises:
An on-premises version of Cloudera. It combines the best of Cloudera Enterprise Data Hub (CDH) and Hortonworks Data Platform Enterprise (HDP) along with new features and enhancements across the stack.
Cloudera Manager for Data Services:
A tool to install, manage, monitor, and configure your clusters and services using the Cloudera Manager Admin Console web application or the Cloudera Manager API.
Cloudera Runtime:
The core open-source software distribution within Cloudera Base on premises. It includes approximately 50 open-source projects that provide data management tools within Cloudera.
Cloudera Control Plane service:
A Cloudera service that includes Cloudera Management Console, Replication Manager, Data Recovery Service, Service Discovery Service, and other services. These services interact with your environment on Cloudera Embedded Container Service or OCP to provision and manage compute infrastructure that you can use to manage the lifecycle of data stored on HDFS or Ozone.
Cloudera Data Catalog:
This data service allows Data Stewards, Business Analysts, and Data Administrators to curate datasets by adding multiple assets to them.
Users can scan and profile a data lake to gather metadata about its schema and to identify sensitive information. The profilers gather the metadata and store it in Cloudera Apache Atlas, where it is retrieved during search and discovery operations. Users can define custom rules to tag, identify, and classify assets according to their business requirements. They can view the metadata for a given asset on the Asset details page, and they can tag and group assets to create a dataset. Users can also search and manage datasets based on various attributes.
Cloudera Data Engineering:
This data service allows you to create, manage, and schedule Apache Spark jobs without the overhead of creating and maintaining Spark clusters. You can define virtual clusters with a range of CPU and memory resources, and the cluster scales up and down as needed to execute your Spark workloads, helping to control your cloud costs.
Data Lake:
Creates a protective ring of security and governance around your data. For a Cloudera on premises deployment, the Data Lake services are hosted on the Cloudera Base on premises cluster. In addition, the Data Lake services are shared between multiple workloads.
Data Service:
A defined subset of Cloudera functionality that enables a Cloudera user to solve a specific problem related to their data lake (process, analyze, predict, and so on). Example services: Cloudera Data Engineering, Cloudera Data Warehouse, Cloudera AI.
Cloudera Data Warehouse:
This data service enables an enterprise to provision a new Cloudera Data Warehouse and share a subset of the data with a specific team or department. A Cloudera Data Warehouse cluster can be created from the Cloudera Management Console and accessed by end users (data analysts).
Environment:
A logical entity that represents the association of your on premises user account with the compute resources that you use to provision and manage workloads such as Cloudera Data Warehouse and Cloudera AI.
Cloudera AI:
This data service enables teams of data scientists to develop, test, train, and ultimately deploy machine learning models for building predictive applications, all on the data under management within the enterprise data cloud. A Cloudera AI Workbench can be created from the Cloudera Management Console and accessed by end users (data scientists).
Cloudera Management Console:
The user interface for administering Cloudera Data Services on premises. As a Cloudera administrator, you can use Cloudera Management Console to manage environments, data lakes, environment resources, and users across all Cloudera Data Services on premises.
Replication Manager:
This data service allows you to copy and migrate HDFS data, Hive external tables, and Ozone data between Cloudera Base on premises 7.1.8 or higher clusters using Cloudera Manager version 7.7.3 or higher.
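
The version minimums in the Replication Manager entry can be checked with a short script. The following is a minimal sketch, not a Cloudera tool: the function names `parse_version` and `meets_replication_minimums` are hypothetical, and it assumes version strings are purely numeric and dot-separated (for example, `7.1.9`), with no suffixes such as service pack labels.

```python
def parse_version(text):
    """Turn a dotted version such as '7.1.9' into (7, 1, 9) for comparison.

    Assumption: every component is a plain integer.
    """
    return tuple(int(part) for part in text.split("."))


# Minimums taken from the Replication Manager glossary entry.
MIN_BASE_VERSION = parse_version("7.1.8")  # Cloudera Base on premises
MIN_CM_VERSION = parse_version("7.7.3")    # Cloudera Manager


def meets_replication_minimums(base_version, cm_version):
    """Return True if both versions are at or above the stated minimums."""
    return (
        parse_version(base_version) >= MIN_BASE_VERSION
        and parse_version(cm_version) >= MIN_CM_VERSION
    )


print(meets_replication_minimums("7.1.9", "7.11.3"))  # True
print(meets_replication_minimums("7.1.7", "7.7.3"))   # False
```

The versions are compared as tuples of integers rather than as strings, because a plain string comparison would rank `7.11.3` below `7.7.3`.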