Managing Roles

When Cloudera Manager configures a service, it configures hosts in your cluster with one or more functions (called roles in Cloudera Manager) that are required for that service. The role determines which Hadoop daemons run on a given host. For example, when Cloudera Manager configures an HDFS service instance it configures one host to run the NameNode role, another host to run as the Secondary NameNode role, another host to run the Balancer role, and some or all of the remaining hosts to run DataNode roles.

Configuration settings are organized in role groups. A role group includes a set of configuration properties for a specific group, as well as a list of role instances associated with that role group. Cloudera Manager automatically creates default role groups.

For role types that allow multiple instances on multiple hosts, such as DataNodes, TaskTrackers, RegionServers (and many others), you can create multiple role groups to allow one set of role instances to use different configuration settings than another set of instances of the same role type. In fact, upon initial cluster setup, if you are installing on identical hosts with limited memory, Cloudera Manager will (typically) automatically create two role groups for each worker role — one group for the role instances on hosts with only other worker roles, and a separate group for the instance running on the host that is also hosting master roles.

The HDFS service is an example of this: Cloudera Manager typically creates one role group (DataNode Default Group) for the DataNode role instances running on the worker hosts, and another group (HDFS-1-DATANODE-1) for the DataNode instance running on the host that is also running the master roles such as the NameNode, JobTracker, HBase Master and so on. Typically the configurations for those two classes of hosts will differ in terms of settings such as memory for JVMs.

The Cloudera Manager configuration screens offer two layout options: classic and new. The classic layout is the default; however, on each configuration page you can easily switch between the classic and new layouts using the Switch to XXX layout link at the top right of the page. All the configuration procedures described in the Cloudera Manager documentation assume the classic layout.
  • classic - pages are organized by role group and categories within the role group. For example, to display the JournalNode maximum log size property (JournalNode Max Log Size), select JournalNode Default Group > Logs. When a configuration property has been set to a value different from the default, a Reset to the default value link displays.
  • new - pages contain controls that allow you filter configuration properties based on configuration status, category, and group.For example, to display the JournalNode maximum log size property (JournalNode Max Log Size), click the SCOPE > JournalNode and CATEGORY > Logs filters. When a configuration property has been set to a value different from the default, a reset to default value icon displays.

Gateway Roles

A gateway is a special type of role whose sole purpose is to designate a host that should receive a client configuration for a specific service, when the host does not have any roles running on it. Gateway roles enable Cloudera Manager to install and manage client configurations on that host. There is no process associated with a gateway role, and its status will always be Stopped. You can configure gateway roles for HBase, HDFS, Hive, MapReduce, Solr, Spark, Sqoop 1 Client, and YARN.

This chapter covers the following topics related to roles and role groups: