Runtime Cluster Hosts and Role Assignments

Cluster hosts can be broadly described as master hosts, utility hosts, gateway hosts, or worker hosts.

  • Master hosts run Hadoop master processes such as the HDFS NameNode and YARN ResourceManager.
  • Utility hosts run other cluster processes that are not master processes, such as Cloudera Manager and one or more Hive Metastores.
  • Gateway hosts are client access points for launching jobs in the cluster. The number of gateway hosts required varies depending on the type and size of the workloads.
  • Worker hosts primarily run DataNodes and other distributed processes such as Impala.

The following tables describe the recommended role allocations for different cluster sizes. These configurations take into account service dependencies that might not be obvious. For example, running Atlas or Ranger also requires running HBase, Kafka, Solr, and ZooKeeper. For details, see Service Dependencies in Cloudera Manager.
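
The dependency example above can be sketched as a small lookup. This is only an illustration: the `DEPENDENCIES` map and the `services_to_deploy` helper are hypothetical names, and the map encodes nothing beyond the one example given in the text. The authoritative list is the service dependency documentation for Cloudera Manager.

```python
# Hypothetical dependency map, limited to the example stated in the text:
# Atlas and Ranger each require HBase, Kafka, Solr, and ZooKeeper.
DEPENDENCIES = {
    "Atlas": {"HBase", "Kafka", "Solr", "ZooKeeper"},
    "Ranger": {"HBase", "Kafka", "Solr", "ZooKeeper"},
}


def services_to_deploy(requested):
    """Return the requested services plus everything they depend on."""
    result = set()
    pending = list(requested)
    while pending:
        service = pending.pop()
        if service in result:
            continue
        result.add(service)
        # Services missing from the map are assumed to have no dependencies.
        pending.extend(DEPENDENCIES.get(service, ()))
    return result


print(sorted(services_to_deploy(["Atlas"])))
# ['Atlas', 'HBase', 'Kafka', 'Solr', 'ZooKeeper']
```

Expanding the requested services this way before assigning roles makes it easier to see why the tables below place HBase, Kafka, Solr, and ZooKeeper roles even on clusters that were planned only for governance or security services.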

3 - 10 Worker Hosts without High Availability

Master Hosts Utility Hosts Gateway Hosts Worker Hosts
Master Host 1:
  • NameNode
  • YARN ResourceManager
  • JobHistory Server
  • ZooKeeper
  • Kudu master
  • Spark History Server
  • HBase master
  • Schema Registry
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
One host for all Utility and Gateway roles:
  • Secondary NameNode
  • Cloudera Manager
  • Cloudera Manager Management Service
  • Cruise Control
  • Hive Metastore
  • HiveServer2
  • Impala Catalog Server
  • Impala StateStore
  • Hue
  • Oozie
  • Gateway configuration
  • HBase backup master
  • Ranger Admin, Tagsync, Usersync servers
  • Atlas server
  • Solr server (CDP-INFRA-SOLR instance to support Atlas)
  • Streams Messaging Manager
  • Streams Replication Manager Service
  • ZooKeeper
  • Knox: One KnoxGateway service on utility or gateway hosts.
  • S3 Gateway
3 - 10 Worker Hosts:
  • DataNode
  • NodeManager
  • Impalad
  • Kudu tablet server
  • Kafka Broker
  • Kafka Connect
  • HBase RegionServer
  • Solr server (For Cloudera Search)
  • Streams Replication Manager Driver
  • ZooKeeper (Recommend 3 servers total)
  • Ozone DataNode

3 - 20 Worker Hosts with High Availability

Master Hosts Utility Hosts Gateway Hosts Worker Hosts
Master Host 1:
  • NameNode
  • JournalNode
  • FailoverController
  • YARN ResourceManager
  • ZooKeeper
  • JobHistory Server
  • Kudu master
  • HBase master
  • Schema Registry
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 2:
  • NameNode
  • JournalNode
  • FailoverController
  • YARN ResourceManager
  • ZooKeeper
  • Kudu master
  • HBase master
  • Schema Registry
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 3:
  • Kudu master (Kudu requires an odd number of masters for HA.)
  • Spark History Server
  • JournalNode (requires dedicated disk)
  • ZooKeeper
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Utility Host 1:
  • Cloudera Manager
  • Cloudera Manager Management Service
  • Cruise Control
  • Hive Metastore
  • Impala Catalog Server
  • Impala StateStore
  • Oozie
  • Ranger Admin, Tagsync, Usersync servers
  • Atlas server
  • Solr server (CDP-INFRA-SOLR instance to support Atlas)
  • Streams Messaging Manager
  • Streams Replication Manager Service
  • Knox: One KnoxGateway service for HA. You can place it on a gateway host instead of a utility host.
Utility Host 2:
  • Hive Metastore
  • Ranger Admin server
  • Atlas server
  • Solr server (CDP-INFRA-SOLR instance to support Atlas)
  • Knox: One KnoxGateway service for HA. You can place it on a gateway host instead of a utility host.
One or more Gateway Hosts:
  • Hue
  • HiveServer2
  • Gateway configuration
  • S3 Gateway
3 - 20 Worker Hosts:
  • DataNode
  • NodeManager
  • Impalad
  • Kudu tablet server
  • Kafka Broker (Recommend 3 brokers minimum)
  • Kafka Connect
  • HBase RegionServer
  • Solr server (For Cloudera Search, recommend 3 servers minimum)
  • Streams Replication Manager Driver
  • Ozone DataNode
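
The note that Kudu requires an odd number of masters for HA follows from majority-quorum arithmetic, which is also why this layout places ZooKeeper and JournalNode roles on three master hosts. The sketch below assumes the usual majority-quorum model; the `tolerated_failures` helper is a hypothetical name used for illustration.

```python
def tolerated_failures(members: int) -> int:
    """Failures a majority-quorum group of this size can survive."""
    majority = members // 2 + 1
    return members - majority


for n in (1, 2, 3, 4, 5):
    print(n, "members ->", tolerated_failures(n), "tolerated failures")
# 1 members -> 0 tolerated failures
# 2 members -> 0 tolerated failures
# 3 members -> 1 tolerated failures
# 4 members -> 1 tolerated failures
# 5 members -> 2 tolerated failures
```

Under this model a fourth member tolerates no more failures than three do, so an even-sized group adds hardware without adding resilience.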

20 - 80 Worker Hosts with High Availability

Master Hosts Utility Hosts Gateway Hosts Worker Hosts
Master Host 1:
  • NameNode
  • JournalNode
  • FailoverController
  • YARN ResourceManager
  • ZooKeeper
  • Kudu master
  • HBase master
  • Schema Registry
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 2:
  • NameNode
  • JournalNode
  • FailoverController
  • YARN ResourceManager
  • ZooKeeper
  • Kudu master
  • HBase master
  • Schema Registry
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 3:
  • ZooKeeper
  • JournalNode
  • JobHistory Server
  • Spark History Server
  • Kudu master
  • HBase master
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Utility Host 1:
  • Cloudera Manager
  • Cruise Control
  • Hive Metastore
  • Ranger Admin server
  • Atlas server
  • Solr server (CDP-INFRA-SOLR instance to support Atlas)
  • Streams Messaging Manager
  • Streams Replication Manager Service
Utility Host 2:
  • Cloudera Manager Management Service
  • Hive Metastore
  • Impala Catalog Server
  • Impala StateStore
  • Oozie
  • Ranger Admin, Tagsync, Usersync servers
  • Atlas server
  • Solr server (CDP-INFRA-SOLR instance to support Atlas)
One or more Gateway Hosts:
  • Hue
  • HiveServer2
  • Gateway configuration
  • S3 Gateway
Two or more Gateway Hosts:
  • Knox: Two KnoxGateway services, one on each of the first two Gateway Hosts for HA.
  • S3 Gateway
20 - 80 Worker Hosts:
  • DataNode
  • NodeManager
  • Impalad
  • Kudu tablet server
  • Kafka Broker (Recommend 3 brokers minimum)
  • Kafka Connect
  • HBase RegionServer
  • Solr server (For Cloudera Search, recommend 3 servers minimum)
  • Streams Replication Manager Driver
  • Ozone DataNode
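
As a rough cross-check of this layout, the master host assignments can be written down as data and the instances of each role counted. The role names below are copied from the master host lists above; the `MASTER_HOSTS` structure and the `role_counts` helper are hypothetical and exist only for this illustration.

```python
from collections import Counter

# Roles shared by Master Host 1 and Master Host 2 in the layout above.
ACTIVE_STANDBY_ROLES = [
    "NameNode", "JournalNode", "FailoverController", "YARN ResourceManager",
    "ZooKeeper", "Kudu master", "HBase master", "Schema Registry",
    "Ozone Manager", "Storage Container Manager",
]

MASTER_HOSTS = {
    "Master Host 1": list(ACTIVE_STANDBY_ROLES),
    "Master Host 2": list(ACTIVE_STANDBY_ROLES),
    "Master Host 3": [
        "ZooKeeper", "JournalNode", "JobHistory Server", "Spark History Server",
        "Kudu master", "HBase master", "Ozone Manager", "Storage Container Manager",
    ],
}


def role_counts(hosts):
    """Count how many hosts carry each role."""
    return Counter(role for roles in hosts.values() for role in roles)


counts = role_counts(MASTER_HOSTS)
for role in ("NameNode", "JournalNode", "ZooKeeper", "Kudu master"):
    print(f"{role}: {counts[role]}")
# NameNode: 2
# JournalNode: 3
# ZooKeeper: 3
# Kudu master: 3
```

The counts show the pattern behind the table: two instances for the active/standby pairs and three for the quorum-based roles.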

80 - 200 Worker Hosts with High Availability

Master Hosts Utility Hosts Gateway Hosts Worker Hosts
Master Host 1:
  • NameNode
  • JournalNode
  • FailoverController
  • YARN ResourceManager
  • ZooKeeper
  • Kudu master
  • HBase master
  • Schema Registry
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 2:
  • NameNode
  • JournalNode
  • FailoverController
  • YARN ResourceManager
  • ZooKeeper
  • Kudu master
  • HBase master
  • Schema Registry
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 3:
  • ZooKeeper
  • JournalNode
  • JobHistory Server
  • Spark History Server
  • Kudu master
  • HBase master
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Utility Host 1:
  • Cloudera Manager
  • Cruise Control
  • Streams Messaging Manager
  • Streams Replication Manager Service
Utility Host 2:
  • Hive Metastore
  • Impala Catalog Server
  • Impala StateStore
  • Oozie
Utility Host 3:
  • Host Monitor
Utility Host 4:
  • Ranger Admin, Tagsync, Usersync servers
  • Atlas server
  • Solr server (CDP-INFRA-SOLR instance to support Atlas)
Utility Host 5:
  • Hive Metastore
  • Ranger Admin server
  • Atlas server
  • Solr server (CDP-INFRA-SOLR instance to support Atlas)
Utility Host 6:
  • Reports Manager
Utility Host 7:
  • Service Monitor
One or more Gateway Hosts:
  • Hue
  • HiveServer2
  • Gateway configuration
  • S3 Gateway
Two or more Gateway Hosts:
  • Knox: Two KnoxGateway services, one on each of the first two Gateway Hosts for HA.
  • S3 Gateway
80 - 200 Worker Hosts:
  • DataNode
  • NodeManager
  • Impalad
  • Kudu tablet server (Recommend 100 tablet servers maximum)
  • Kafka Broker (Recommend 3 brokers minimum)
  • Kafka Connect
  • HBase RegionServer
  • Solr server (For Cloudera Search, recommend 3 servers minimum)
  • Streams Replication Manager Driver
  • Ozone DataNode
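
The parenthetical recommendations in the worker host list can be turned into a simple planning check. The sketch below is an assumption-laden illustration: `check_worker_roles` is a hypothetical helper, the thresholds are taken from the list above, and a count of zero is treated as "service not deployed" rather than as a violation.

```python
def check_worker_roles(counts):
    """Return warnings for a dict that maps a role name to its instance count."""
    warnings = []
    # Upper bound: recommend 100 Kudu tablet servers maximum.
    if counts.get("Kudu tablet server", 0) > 100:
        warnings.append("Kudu tablet server: more than the recommended 100")
    # Lower bounds: recommend 3 Kafka brokers and 3 Solr servers minimum.
    for role in ("Kafka Broker", "Solr server"):
        n = counts.get(role, 0)
        if 0 < n < 3:
            warnings.append(f"{role}: fewer than the recommended 3")
    return warnings


plan = {"Kudu tablet server": 150, "Kafka Broker": 2, "Solr server": 3}
for message in check_worker_roles(plan):
    print(message)
# Kudu tablet server: more than the recommended 100
# Kafka Broker: fewer than the recommended 3
```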

200 - 500 Worker Hosts with High Availability

Master Hosts Utility Hosts Gateway Hosts Worker Hosts
Master Host 1:
  • NameNode
  • JournalNode
  • FailoverController
  • ZooKeeper
  • Kudu master
  • HBase master
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 2:
  • NameNode
  • JournalNode
  • FailoverController
  • ZooKeeper
  • Kudu master
  • HBase master
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 3:
  • YARN ResourceManager
  • ZooKeeper
  • JournalNode
  • Kudu master
  • HBase master
  • Schema Registry
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 4:
  • YARN ResourceManager
  • ZooKeeper
  • JournalNode
  • Schema Registry
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 5:
  • JobHistory Server
  • Spark History Server
  • ZooKeeper
  • JournalNode
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager

Cloudera recommends no more than three masters for Kudu and HBase.

Utility Host 1:
  • Cloudera Manager
  • Cruise Control
  • Streams Messaging Manager
  • Streams Replication Manager Service
Utility Host 2:
  • Hive Metastore
  • Impala Catalog Server
  • Impala StateStore
  • Oozie
Utility Host 3:
  • Host Monitor
Utility Host 4:
  • Ranger Admin, Tagsync, Usersync servers
  • Atlas server
  • Solr server (CDP-INFRA-SOLR instance to support Atlas)
Utility Host 5:
  • Hive Metastore
  • Ranger Admin server
  • Atlas server
  • Solr server (CDP-INFRA-SOLR instance to support Atlas)
Utility Host 6:
  • Reports Manager
Utility Host 7:
  • Service Monitor
One or more Gateway Hosts:
  • Hue
  • HiveServer2
  • Gateway configuration
  • S3 Gateway
Two or more Gateway Hosts:
  • Knox: Two KnoxGateway services, one on each of the first two Gateway Hosts for HA.
  • S3 Gateway
200 - 500 Worker Hosts:
  • DataNode
  • NodeManager
  • Impalad
  • Kudu tablet server (Recommend 100 tablet servers maximum)
  • Kafka Broker (Recommend 3 brokers minimum)
  • Kafka Connect
  • HBase RegionServer
  • Solr server (For Cloudera Search, recommend 3 servers minimum)
  • Streams Replication Manager Driver
  • Ozone DataNode

500 - 1000 Worker Hosts with High Availability

Master Hosts Utility Hosts Gateway Hosts Worker Hosts
Master Host 1:
  • NameNode
  • JournalNode
  • FailoverController
  • ZooKeeper
  • Kudu master
  • HBase master
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 2:
  • NameNode
  • JournalNode
  • FailoverController
  • ZooKeeper
  • Kudu master
  • HBase master
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 3:
  • YARN ResourceManager
  • ZooKeeper
  • JournalNode
  • Kudu master
  • HBase master
  • Schema Registry
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 4:
  • YARN ResourceManager
  • ZooKeeper
  • JournalNode
  • Schema Registry
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager
Master Host 5:
  • JobHistory Server
  • Spark History Server
  • ZooKeeper
  • JournalNode
  • Ozone Manager (A maximum of 3 Ozone Managers are supported)
  • Storage Container Manager

Cloudera recommends no more than three masters for Kudu and HBase.

Utility Host 1:
  • Cloudera Manager
  • Cruise Control
  • Streams Messaging Manager
  • Streams Replication Manager Service
Utility Host 2:
  • Hive Metastore
  • Impala Catalog Server
  • Impala StateStore
  • Oozie
Utility Host 3:
  • Host Monitor
Utility Host 4:
  • Ranger Admin, Tagsync, Usersync servers
  • Atlas server
  • Solr server (CDP-INFRA-SOLR instance to support Atlas)
Utility Host 5:
  • Hive Metastore
  • Ranger Admin server
  • Atlas server
  • Solr server (CDP-INFRA-SOLR instance to support Atlas)
Utility Host 6:
  • Reports Manager
Utility Host 7:
  • Service Monitor
One or more Gateway Hosts:
  • Hue
  • HiveServer2
  • Gateway configuration
  • S3 Gateway
Two or more Gateway Hosts:
  • Knox: Two KnoxGateway services, one on each of the first two Gateway Hosts for HA.
  • S3 Gateway
500 - 1000 Worker Hosts:
  • DataNode
  • NodeManager
  • Impalad
  • Kudu tablet server (Recommend 100 tablet servers maximum)
  • Kafka Broker (Recommend 3 brokers minimum)
  • Kafka Connect
  • HBase RegionServer
  • Solr server (For Cloudera Search, recommend 3 servers minimum)
  • Streams Replication Manager Driver
  • Ozone DataNode
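
Choosing among the layouts above comes down to the worker host count and whether high availability is required. The following is a minimal sketch of that lookup. The `layout_for` function and the `HA_TIERS` table are hypothetical names, and because adjacent ranges share a boundary value (for example, 20 appears in both "3 - 20" and "20 - 80"), the sketch assumes the smaller layout is chosen at a shared boundary. The source does not state which one applies.

```python
# Worker host ranges for the high-availability layouts, in ascending order.
HA_TIERS = [(3, 20), (20, 80), (80, 200), (200, 500), (500, 1000)]


def layout_for(workers, high_availability=True):
    """Return the heading of the layout that covers the given worker count."""
    if not high_availability:
        if 3 <= workers <= 10:
            return "3 - 10 Worker Hosts without High Availability"
        raise ValueError("no non-HA layout is described for this worker count")
    for low, high in HA_TIERS:
        # The first match wins, so a shared boundary maps to the smaller layout
        # (an assumption of this sketch).
        if low <= workers <= high:
            return f"{low} - {high} Worker Hosts with High Availability"
    raise ValueError("no HA layout is described for this worker count")


print(layout_for(8, high_availability=False))
# 3 - 10 Worker Hosts without High Availability
print(layout_for(150))
# 80 - 200 Worker Hosts with High Availability
```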