How to create a Compute Cluster and Data Context.
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
This feature is not available when using Cloudera Manager to manage Data Hub clusters.
To create a Compute cluster, you must have a Regular cluster that will
be designated as the Base cluster. This cluster hosts the data services
to be used by a Compute cluster and can also host services for other
workloads that do not require access to data services defined in the
Data Context.
To create a Compute cluster:
- On the Cloudera Manager home page, click
The Add Cluster Welcome page displays.
- Click Continue.
The Cluster Basics page displays.
- Select Compute cluster.
- If you already have a Data Context defined, select it from the
drop-down list.
- To create a new Data Context:
- Select Create Data Context from the drop-down list.
The Create Data Context dialog box displays.
- Enter a unique name for the Data Context.
- Select the Base cluster from the drop-down list.
- Select the Data Services, Metadata Services and Security
Services you want to expose in the Data Context. You can choose
the following:
- HDFS (required)
- Hive Metadata Service
- Atlas
- Ranger
- Click Create.
The Cluster Basics page displays your selections.
- Click Continue.
- Continue with the next steps in the Add Cluster Wizard to
specify hosts and credentials, and install the Agent and CDH software.
The Select Repository screen examines the CDH version of the Base cluster and recommends
a supported version. Cloudera recommends that your Base and Compute clusters each run the
same version of CDH. The Add Cluster Wizard offers the option to choose other versions,
but these combinations have not been tested and are not supported for production
use.
- On the Select Services screen, choose any of the
pre-configured combinations of services listed on this page, or you
can select Custom Services and choose the
services you want to install.
Service combinations for Compute Clusters:
- Data Engineering
- Process, develop, and serve predictive models.
- Services included: Spark, Oozie, Hive on Tez, Data Analytics
Studio, HDFS, YARN, and YARN Queue Manager
- Spark
- Spark for Compute
- Services included: Core Configuration, Spark, Oozie, YARN, and
YARN Queue Manager
- Streams Messaging (Simple)
- Simple Kafka cluster for streams messaging
- Services included: Kafka, Schema Registry, and ZooKeeper
- Streams Messaging (Full)
- Advanced Kafka cluster with monitoring and replication services
for streams messaging
- Services included: Kafka, Schema Registry, Streams Messaging
Manager, Streams Replication Manager, Cruise Control, and
ZooKeeper
- Custom Services
- Choose your own services. Services required by chosen services
will automatically be included.
The following services can be installed on a Compute cluster:
- Hive Execution Service (This service supplies the
HiveServer2 role only.)
- Hue
- Kafka
- Spark
- Oozie (only when Hue is available; Oozie is a requirement
for Hue)
- YARN
- HDFS
- If you have enabled Kerberos authentication on the Base
cluster, you must also enable Kerberos on the Compute
cluster.
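
The constraints stated in the procedure above can be checked before starting the wizard. The following Python sketch is a planning aid only and does not call Cloudera Manager or its API. The names `ComputeClusterPlan` and `check_plan` are hypothetical, and the version strings in the usage example are illustrative values. It encodes four rules from this page: HDFS is required in the Data Context, Hue requires Oozie, Kerberos on the Base cluster requires Kerberos on the Compute cluster, and the Base and Compute clusters should run the same CDH version.

```python
from dataclasses import dataclass

# Services this page lists as selectable when creating a Data Context.
DATA_CONTEXT_SERVICES = {"HDFS", "Hive Metadata Service", "Atlas", "Ranger"}
# HDFS is marked "(required)" in the list above.
REQUIRED_DATA_CONTEXT_SERVICES = {"HDFS"}


@dataclass
class ComputeClusterPlan:
    """Hypothetical planning record; not a Cloudera Manager API object."""
    data_context_name: str
    data_context_services: set
    compute_services: set
    base_cdh_version: str
    compute_cdh_version: str
    base_kerberos: bool
    compute_kerberos: bool


def check_plan(plan):
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []

    if not plan.data_context_name.strip():
        problems.append("Data Context name is empty")

    unknown = plan.data_context_services - DATA_CONTEXT_SERVICES
    if unknown:
        problems.append(f"Not a Data Context service: {sorted(unknown)}")

    missing = REQUIRED_DATA_CONTEXT_SERVICES - plan.data_context_services
    if missing:
        problems.append(f"Data Context is missing: {sorted(missing)}")

    # Oozie is described above as a requirement for Hue.
    if "Hue" in plan.compute_services and "Oozie" not in plan.compute_services:
        problems.append("Hue requires Oozie on the Compute cluster")

    # Kerberos on the Base cluster must be matched on the Compute cluster.
    if plan.base_kerberos and not plan.compute_kerberos:
        problems.append("Base cluster uses Kerberos; enable it on the Compute cluster")

    # Mixed versions are described above as untested and unsupported for production.
    if plan.base_cdh_version != plan.compute_cdh_version:
        problems.append("Base and Compute clusters run different CDH versions")

    return problems


if __name__ == "__main__":
    plan = ComputeClusterPlan(
        data_context_name="analytics-context",      # illustrative name
        data_context_services={"HDFS", "Ranger"},
        compute_services={"Spark", "YARN"},
        base_cdh_version="7.1.7",                   # illustrative version
        compute_cdh_version="7.1.7",
        base_kerberos=True,
        compute_kerberos=True,
    )
    for problem in check_plan(plan) or ["No problems found"]:
        print(problem)
```

Returning a list of problems rather than raising on the first one lets you review every mismatch at once before you open the Add Cluster Wizard.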