Adding a Compute Cluster and Data Context
How to create a Compute Cluster and Data Context
To create a Compute cluster, you must have a Regular cluster that will be designated as the Base cluster. This cluster hosts the data services to be used by a Compute cluster and can also host services for other workloads that do not require access to data services defined in the Data Context.
To create a Compute cluster:
- On the Cloudera Manager home page, click
The Add Cluster Welcome page displays.
- Click Continue. The Cluster Basics page displays.
- Select Compute cluster.
- If you already have a Data Context defined, select it from the drop-down list.
- To create a new Data Context:
- Select Create Data Context from the drop-down list. The Create Data Context dialog box displays.
- Enter a unique name for the Data Context.
- Select the Base cluster from the drop-down list.
- Select the Data Services, Metadata Services, and Security Services you want to expose in the Data Context. You can choose the following:
- HDFS (required)
- Hive Metadata Service
- Click Create. The Cluster Basics page displays your selections.
- Click Continue.
- Continue with the next steps in the Add Cluster Wizard to specify hosts and credentials, and install the Agent and CDH software.
The Select Repository screen examines the CDH version of the Base cluster and recommends a supported version. Cloudera recommends that your Base and Compute clusters run the same version of CDH. The Wizard lets you choose other versions, but these combinations have not been tested and are not supported for production use.
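The version recommendation can be expressed as a simple pre-flight check. The following is a minimal sketch, not part of Cloudera Manager: the function names `parse_version` and `same_cdh_version` and the version strings are hypothetical, and the sketch assumes versions are dotted numeric strings.

```python
from __future__ import annotations


def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted numeric version string such as '7.1.7' into integers."""
    return tuple(int(part) for part in version.strip().split("."))


def same_cdh_version(base_version: str, compute_version: str) -> bool:
    """Return True when the Base and Compute clusters run the same CDH version."""
    return parse_version(base_version) == parse_version(compute_version)


if __name__ == "__main__":
    # Hypothetical version strings, for illustration only.
    for base, compute in [("7.1.7", "7.1.7"), ("7.1.7", "7.1.8")]:
        if same_cdh_version(base, compute):
            status = "recommended"
        else:
            status = "untested, not supported for production use"
        print(f"Base {base} / Compute {compute}: {status}")
```

A check like this could run before the Wizard is started, so that a version mismatch is noticed before any software is installed.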
- On the Select Services screen, choose one of the pre-configured combinations of services listed on this page, or select Custom Services and choose the services you want to install. The following service combinations can be installed on a Compute cluster:
- Data Engineering
- Process, develop, and serve predictive models.
- Services included: Spark, Oozie, Hive on Tez, Data Analytics Studio, HDFS, YARN, and YARN Queue Manager
- Spark for Compute
- Services included: Core Configuration, Spark, Oozie, YARN, and YARN Queue Manager
- Streams Messaging (Simple)
- Simple Kafka cluster for streams messaging
- Services included: Kafka, Schema Registry, and ZooKeeper
- Streams Messaging (Full)
- Advanced Kafka cluster with monitoring and replication services for streams messaging
- Services included: Kafka, Schema Registry, Streams Messaging Manager, Streams Replication Manager, Cruise Control, and ZooKeeper
- Custom Services
- Choose your own services. Services required by the services you choose are included automatically.
- Hive Execution Service (This service supplies the HiveServer2 role only.)
- Spark 2
- Oozie (only when Hue is available; Oozie is a requirement for Hue)
- HDFS (required)
- If you have enabled Kerberos authentication on the Base cluster, you must also enable Kerberos on the Compute cluster.
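The constraints stated in this procedure (a unique Data Context name, HDFS as a required service, and matching Kerberos settings) can be collected into a small checklist. The sketch below is illustrative only and does not call any Cloudera Manager API: the `DataContext` class, the `check_plan` function, and all sample values are hypothetical names introduced for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataContext:
    """Hypothetical model of the values entered in the Create Data Context dialog box."""
    name: str
    base_cluster: str
    services: list[str] = field(default_factory=list)


def check_plan(
    context: DataContext,
    existing_names: set[str],
    base_kerberos: bool,
    compute_kerberos: bool,
) -> list[str]:
    """Return a list of problems; an empty list means the plan passes these checks."""
    problems = []
    if not context.name.strip():
        problems.append("The Data Context name is empty.")
    elif context.name in existing_names:
        problems.append(f"The Data Context name '{context.name}' is already in use.")
    if "HDFS" not in context.services:
        problems.append("HDFS is required in the Data Context.")
    if base_kerberos and not compute_kerberos:
        problems.append("Kerberos is enabled on the Base cluster but not on the Compute cluster.")
    return problems


if __name__ == "__main__":
    # Sample values are made up for illustration.
    plan = DataContext("analytics-dc", "base-1", ["HDFS", "Hive Metadata Service"])
    for problem in check_plan(plan, {"finance-dc"}, base_kerberos=True, compute_kerberos=False):
        print(problem)
```

Returning a list of problems instead of raising on the first failure lets every issue be reported in one pass before the Add Cluster Wizard is started.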