Cluster Hosts and Role Assignments
This topic describes suggested role assignments for a CDH cluster managed by Cloudera Manager. The actual assignments you choose for your deployment can vary depending on the types and volume of workloads, the services deployed in your cluster, hardware resources, configuration, and other factors.
When you install CDH using the Cloudera Manager installation wizard, Cloudera Manager attempts to spread the roles among cluster hosts (except for roles assigned to Edge hosts) based on the resources available in the hosts. You can change these assignments on the Customize Role Assignments page that appears in the wizard. You can also change and add roles at a later time using Cloudera Manager. See Role Instances.
If your cluster uses data-at-rest encryption, see Allocating Hosts for Key Trustee Server and Key Trustee KMS.
CDH Cluster Hosts and Role Assignments
Cluster hosts can be broadly described as the following types:
- Master hosts run Hadoop master processes such as the HDFS NameNode and YARN ResourceManager.
- Utility hosts run other cluster processes that are not master processes, such as Cloudera Manager and the Hive Metastore.
- Edge hosts are client access points for launching jobs in the cluster. The number of Edge hosts required varies depending on the type and size of the workloads.
- Worker hosts primarily run DataNodes and other distributed processes such as the Impala daemon (impalad).
Cluster Size | Master Hosts | Utility Hosts | Edge Hosts | Worker Hosts |
---|---|---|---|---|
Very Small, without High Availability | Master Host 1 | One host for all Utility and Edge roles | Shared with the Utility host | 3 - 10 Worker Hosts |
Small, with High Availability | Master Host 1, Master Host 2 | One host for all Utility and Edge roles | Shared with the Utility host | 3 - 20 Worker Hosts |
Medium, with High Availability | Master Host 1, Master Host 2, Master Host 3 | Utility Host 1, Utility Host 2 | One or more Edge Hosts | 50 - 200 Worker Hosts |
Large, with High Availability | Master Host 1, Master Host 2, Master Host 3, Master Host 4, Master Host 5 | Utility Host 1, Utility Host 2 | One or more Edge Hosts | 200 - 500 Worker Hosts |
Extra Large, with High Availability | Master Host 1, Master Host 2, Master Host 3, Master Host 4, Master Host 5 | Utility Host 1, Utility Host 2, Utility Host 3, Utility Host 4, Utility Host 5 | One or more Edge Hosts | 500 - 1000 Worker Hosts |
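The host counts in the sizing table can be captured as data for use in planning scripts. The following is a minimal sketch, not part of Cloudera Manager: the `SizingTier` class, the `TIERS` tuple, and the `suggest_tiers` function are hypothetical names introduced for illustration, and the numbers are transcribed from the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SizingTier:
    """One row of the suggested sizing table (hypothetical helper type)."""
    name: str
    high_availability: bool
    master_hosts: int
    utility_hosts: int
    shared_utility_edge: bool  # True when one host carries all Utility and Edge roles
    min_workers: int
    max_workers: int


# Values transcribed from the sizing table; Edge host counts are omitted
# because the table only says "one or more" for the larger tiers.
TIERS = (
    SizingTier("Very Small", False, 1, 1, True, 3, 10),
    SizingTier("Small", True, 2, 1, True, 3, 20),
    SizingTier("Medium", True, 3, 2, False, 50, 200),
    SizingTier("Large", True, 5, 2, False, 200, 500),
    SizingTier("Extra Large", True, 5, 5, False, 500, 1000),
)


def suggest_tiers(worker_hosts: int, need_ha: bool = True) -> List[SizingTier]:
    """Return every tier whose worker range contains worker_hosts.

    When need_ha is True, tiers without high availability are skipped.
    """
    return [
        tier
        for tier in TIERS
        if tier.min_workers <= worker_hosts <= tier.max_workers
        and (tier.high_availability or not need_ha)
    ]


if __name__ == "__main__":
    for tier in suggest_tiers(200):
        print(tier.name, tier.master_hosts, tier.utility_hosts)
```

The function returns a list rather than a single tier because the ranges in the table overlap at 200 and 500 worker hosts and do not cover 21 to 49 worker hosts; a count in that gap yields an empty list, which signals that the choice needs a judgment call.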
Allocating Hosts for Key Trustee Server and Key Trustee KMS
If you are enabling data-at-rest encryption for a CDH cluster, Cloudera recommends that you isolate the Key Trustee Server from other enterprise data hub (EDH) services by deploying the Key Trustee Server on dedicated hosts in a separate cluster managed by Cloudera Manager. Cloudera also recommends deploying Key Trustee KMS on dedicated hosts in the same cluster as the EDH services that require access to Key Trustee Server. This architecture helps users avoid having to restart the Key Trustee Server when restarting a cluster.
See Deployment Planning for Data at Rest Encryption.
For production environments in general, or if you have enabled high availability for HDFS and are using data-at-rest encryption, Cloudera recommends that you enable high availability for Key Trustee Server and Key Trustee KMS.
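The placement and high availability recommendations in this section can be expressed as simple checks against a planned host layout. This is a sketch under stated assumptions: the `Host` class, the role labels `KEY_TRUSTEE_SERVER` and `KEY_TRUSTEE_KMS`, and both functions are hypothetical and do not correspond to Cloudera Manager identifiers or APIs.

```python
from dataclasses import dataclass
from typing import List, Set

# Hypothetical role labels used only in this sketch.
KTS = "KEY_TRUSTEE_SERVER"
KMS = "KEY_TRUSTEE_KMS"


@dataclass
class Host:
    name: str
    cluster: str
    roles: Set[str]


def check_key_trustee_layout(hosts: List[Host], edh_cluster: str) -> List[str]:
    """Return a list of deviations from the recommended layout."""
    problems = []
    for host in hosts:
        if KTS in host.roles:
            # Key Trustee Server: dedicated hosts in a separate cluster.
            if host.roles != {KTS}:
                problems.append(f"{host.name}: Key Trustee Server host is not dedicated")
            if host.cluster == edh_cluster:
                problems.append(f"{host.name}: Key Trustee Server shares the EDH cluster")
        if KMS in host.roles:
            # Key Trustee KMS: dedicated hosts in the same cluster as the EDH services.
            if host.roles != {KMS}:
                problems.append(f"{host.name}: Key Trustee KMS host is not dedicated")
            if host.cluster != edh_cluster:
                problems.append(f"{host.name}: Key Trustee KMS is outside the EDH cluster")
    return problems


def should_enable_key_trustee_ha(production: bool, hdfs_ha: bool,
                                 encryption_at_rest: bool) -> bool:
    """Production in general, or HDFS HA combined with data-at-rest encryption."""
    return bool(production or (hdfs_ha and encryption_at_rest))
```

For example, a layout with one Key Trustee Server host in its own cluster and one dedicated Key Trustee KMS host in the EDH cluster produces no problems, while a host that runs both Key Trustee Server and a NameNode in the EDH cluster is reported twice.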