This is the documentation for Cloudera Manager 5.0.x. Documentation for other versions is available at Cloudera Documentation.

The YARN Service

CDH supports two versions of the MapReduce computation framework: MRv1 and MRv2, which are implemented by the MapReduce (MRv1) and YARN (MRv2) services.

In a CDH 4 cluster, the MapReduce service is the default MapReduce computation framework.
  Important: You can create a YARN service in a CDH 4 cluster, but it is not considered production ready.
In a CDH 5 cluster, the YARN service is the default MapReduce computation framework.
  Important: In CDH 5 the MapReduce service has been deprecated. However, the MapReduce service is fully supported for backward compatibility through the CDH 5 life cycle.

Cloudera Manager provides a wizard to easily migrate MapReduce configurations to YARN. For further information on migrating from MapReduce to YARN, see Importing MapReduce Configurations to YARN and Migrating from MapReduce v1 (MRv1) to MapReduce v2 (MRv2, YARN).

Configuring Alternatives Priority

The alternatives priority property determines which service—MapReduce or YARN—is used by clients to run MapReduce jobs; the service with a higher value of the property is used. In CDH 4, the MapReduce service alternatives priority is set to 92 and the YARN service is set to 91. In CDH 5, the values are reversed; the MapReduce service alternatives priority is set to 91 and the YARN service is set to 92.
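
The selection rule described above can be sketched in a few lines of Python. This is only an illustration of "the higher value wins" using the default priorities quoted in this section; the dictionary layout and the function name are assumptions made for the example and are not part of Cloudera Manager.

```python
# Default alternatives priorities as stated in this section.
# The data layout and function name are illustrative only.
DEFAULT_PRIORITIES = {
    "CDH 4": {"MapReduce": 92, "YARN": 91},
    "CDH 5": {"MapReduce": 91, "YARN": 92},
}


def service_used_by_clients(priorities):
    """Return the service whose alternatives priority is highest."""
    return max(priorities, key=priorities.get)


for release, priorities in DEFAULT_PRIORITIES.items():
    print(release, "->", service_used_by_clients(priorities))
```

Raising the MapReduce priority above the YARN priority in a CDH 5 cluster would, by the same rule, make clients use MapReduce once the client configuration is redeployed.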

You configure the alternatives priority as follows:
  1. Go to the MapReduce or YARN service.
  2. Select Configuration > View and Edit.
  3. Expand the Gateway Default Group node.
  4. In the Alternatives Priority property, set the priority value.
  5. Click Save Changes.
  6. Redeploy the client configuration.

Adding the YARN Service

  1. On the Home page, click the drop-down arrow to the right of the cluster name and select Add a Service. A list of service types displays. You can add one type of service at a time.
  2. Click the YARN (MR2 Included) radio button and click Continue.
  3. Select the radio button next to the services on which the new service should depend and click Continue.
  4. Customize the assignment of role instances to hosts. The wizard evaluates the hardware configurations of the hosts to determine the best hosts for each role. The wizard assigns all worker roles to the same set of hosts to which the HDFS DataNode role is assigned. These assignments are typically acceptable, but you can reassign role instances to hosts of your choosing, if desired.

    Click a field below a role to display a dialog containing a pageable list of hosts. If you click a field containing multiple hosts, you can also select All Hosts to assign the role to all hosts or Custom to display the pageable hosts dialog.

    The following shortcuts for specifying host names are supported:
    • Range of hostnames (without the domain portion)
      Range Definition          Matching Hosts
      10.1.1.[1-4]              10.1.1.1, 10.1.1.2, 10.1.1.3, 10.1.1.4
      host[1-3].company.com     host1.company.com, host2.company.com, host3.company.com
      host[07-10].company.com   host07.company.com, host08.company.com, host09.company.com, host10.company.com
    • IP addresses
    • Rack name

    Click the View By Host button for an overview of the role assignment by host ranges.
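
    To make the range shortcut concrete, the following Python sketch expands a range definition into the matching host names. It is not the parser used by Cloudera Manager; the function name and the handling of zero padding are assumptions chosen so that the output agrees with the three examples listed above.

```python
import re

# Matches one bracketed numeric range such as [1-4] or [07-10].
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_host_range(pattern):
    """Expand the first bracketed range in pattern into a list of names.

    Zero padding in the lower bound (for example 07) is preserved.
    A pattern without a range is returned unchanged as a one-item list.
    """
    match = _RANGE.search(pattern)
    if match is None:
        return [pattern]
    low, high = match.group(1), match.group(2)
    # Pad to the width of the lower bound only when it has a leading zero.
    width = len(low) if low.startswith("0") else 0
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [
        prefix + str(number).zfill(width) + suffix
        for number in range(int(low), int(high) + 1)
    ]


for name in expand_host_range("host[07-10].company.com"):
    print(name)
```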

Configuring Directories

Creating the Job History Directory

When adding the YARN service, the Add Service wizard automatically creates a job history directory. If you quit the Add Service wizard or it does not finish, you can create the directory outside the wizard by doing these steps:
  1. Go to the YARN service.
  2. Select Actions > Create Job History Dir.
  3. Click Create Job History Dir again to confirm.

Creating the NodeManager Remote Application Log Directory

When adding the YARN service, the Add Service wizard automatically creates a remote application log directory. If you quit the Add Service wizard or it does not finish, you can create the directory outside the wizard by doing these steps:
  1. Go to the YARN service.
  2. Select Actions > Create NodeManager Remote Application Log Directory.
  3. Click Create NodeManager Remote Application Log Directory again to confirm.

Importing MapReduce Configurations to YARN

When you upgrade from CDH 4 to CDH 5, you can import MapReduce configurations to YARN as part of the upgrade wizard. If you did not do so at that time, you can manually import the configurations as follows:
  1. Go to the YARN service page.
  2. Stop the YARN service if it is running.
  3. Select Actions > Import MapReduce Configuration. The import wizard presents a warning letting you know that it will import your configuration, restart the YARN service and its dependent services, and update the client configuration.
  4. Click Continue to proceed.
  5. The next page lists additional configuration settings required by YARN. Verify or modify these settings and click Continue.
  6. The Switch Cluster to MR2 step proceeds. When all steps have been completed, click Continue.
  Warning: In addition to importing configuration settings, the import process:
  • Configures services to use YARN as the MapReduce computation framework instead of MapReduce.
  • Overwrites existing YARN configuration and role assignments.

Dynamic Resource Management

In addition to the static resource management available to all services, the YARN service also supports dynamic management of its static allocation. See Dynamic Resource Pools.

Configuring YARN High Availability

You can use Cloudera Manager to configure CDH 5 or later for ResourceManager High Availability (HA). A ResourceManager HA cluster is configured with an active and a standby ResourceManager. Only one ResourceManager can be active at any point in time.
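
The active/standby rule can be expressed as a small check over the HA state reported by each ResourceManager. The Python sketch below is an illustration only: the ResourceManager IDs rm1 and rm2 are hypothetical, and the state strings are assumed to be "active" or "standby", as might be collected with a command such as yarn rmadmin -getServiceState for each ID.

```python
def active_resource_managers(states):
    """Return the sorted IDs whose reported HA state is 'active'."""
    return sorted(
        rm_id for rm_id, state in states.items()
        if state.strip().lower() == "active"
    )


def is_healthy_ha_pair(states):
    """True when exactly one ResourceManager is active and the rest are standby."""
    normalized = [state.strip().lower() for state in states.values()]
    return normalized.count("active") == 1 and all(
        state in ("active", "standby") for state in normalized
    )


# Hypothetical IDs and states, for illustration only.
reported = {"rm1": "active", "rm2": "standby"}
print(active_resource_managers(reported), is_healthy_ha_pair(reported))
```

A result of False, for example when both ResourceManagers report standby, indicates a state that the automatic failover described below is meant to resolve.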

Cloudera Manager supports automatic failover of the ResourceManager. It does not provide a mechanism to manually force a failover through the Cloudera Manager user interface.

ResourceManager HA requires ZooKeeper and HDFS services to be running.

For more information, see Configuring High Availability for ResourceManager in the CDH High Availability Guide.

  Important: Enabling or disabling HA will cause the previous monitoring history to become unavailable.

Enabling High Availability

  1. Go to the YARN service.
  2. Select Actions > Enable High Availability. A screen showing the hosts that are eligible to run a standby ResourceManager displays. The host where the current ResourceManager is running is not available as a choice.
  3. Select the host where you want the standby ResourceManager to be installed, and click Continue. Cloudera Manager proceeds to execute the set of commands that stop the YARN service, add a standby ResourceManager, initialize the ResourceManager High Availability state in ZooKeeper, restart YARN, and redeploy the relevant client configurations.
  Note: ResourceManager HA doesn't affect the JobHistory Server (JHS). JHS doesn't maintain any state, so if its host fails you can simply assign the JobHistory Server role to a new host. You can also enable process auto-restart for the JobHistory Server by doing the following:
  1. Go to the YARN service.
  2. Select Configuration > View and Edit.
  3. Expand the JobHistory Server Default Group.
  4. Select the Advanced subcategory.
  5. Check the Automatically Restart Process checkbox.
  6. Restart the JobHistory Server role.

Disabling High Availability

  1. Go to the YARN service.
  2. Select Actions > Disable High Availability. A screen showing the hosts running the ResourceManagers displays.
  3. Select which ResourceManager (host) you want to remain as the single ResourceManager, and click Continue. Cloudera Manager executes a set of commands that stop the YARN service, remove the standby ResourceManager and the Failover Controller, restart the YARN service, and redeploy client configurations.
Page generated September 3, 2015.