Default cluster topology

Data Hub uses a specific cluster topology including the following host groups: master, worker, and compute.

The following host groups are defined in the cluster templates and cluster definitions used by Data Hub:
Host group | Description | Number of nodes
Master | The master host group runs the components that manage cluster resources (including Cloudera Manager), store intermediate data (for example, HDFS), and process tasks, as well as other master components. | 1
Worker | The worker host group runs the components that execute processing tasks (such as NodeManager) and store data in HDFS (such as DataNode). | 1+
Compute | The compute host group can optionally be used to run data processing tasks (such as NodeManager). | 0+
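
The node counts in the table can be read as simple bounds: exactly one master node, at least one worker node, and any number of compute nodes, including none. The following is a minimal sketch of how a planned topology could be checked against those bounds before a cluster is requested. The names MIN_NODES, MAX_NODES, and validate_topology are hypothetical and are not part of any Data Hub API or CLI; only the bounds themselves come from the table.

```python
# Bounds taken from the "Number of nodes" column of the host group table.
# These names are illustrative only; they are not a Data Hub API.
MIN_NODES = {"master": 1, "worker": 1, "compute": 0}
# The table lists exactly 1 node for master, so master also has an upper bound.
MAX_NODES = {"master": 1}


def validate_topology(node_counts):
    """Return a list of problems for a {host_group: node_count} mapping.

    A host group missing from the mapping is treated as having 0 nodes.
    An empty list means the counts satisfy the bounds above.
    """
    problems = []
    for group, minimum in MIN_NODES.items():
        count = node_counts.get(group, 0)
        if count < minimum:
            problems.append(f"{group}: expected at least {minimum}, got {count}")
        maximum = MAX_NODES.get(group)
        if maximum is not None and count > maximum:
            problems.append(f"{group}: expected at most {maximum}, got {count}")
    return problems


if __name__ == "__main__":
    # One master and three workers, no compute nodes: within the bounds.
    print(validate_topology({"master": 1, "worker": 3}))
    # Two masters and no workers: both host groups are out of bounds.
    print(validate_topology({"master": 2, "worker": 0, "compute": 4}))
```

The check treats the compute host group as optional, so a mapping that omits it passes as long as the master and worker counts are valid.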