Configure YARN ResourceManager high availability

You can use Cloudera Manager to configure YARN ResourceManager high availability (HA).

Cloudera Manager supports automatic failover of the ResourceManager. It does not provide a mechanism to manually force a failover through the Cloudera Manager user interface.
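Because failover is automatic, it can help to check which ResourceManager is currently active. The sketch below is one way to do that, assuming each ResourceManager answers the YARN REST endpoint `/ws/v1/cluster/info` itself and reports an `haState` field there. The host names and port 8088 in the example are hypothetical placeholders, not values taken from this document.

```python
import json
from urllib.request import urlopen


def ha_state(cluster_info: dict) -> str:
    """Return the haState field from a /ws/v1/cluster/info response body."""
    return cluster_info["clusterInfo"]["haState"]


def fetch_ha_state(rm_address: str, timeout: float = 5.0) -> str:
    """Query one ResourceManager (host:port) and return its reported HA state."""
    url = f"http://{rm_address}/ws/v1/cluster/info"
    with urlopen(url, timeout=timeout) as resp:
        return ha_state(json.load(resp))


def find_active(states: dict) -> list:
    """Given a {host: haState} mapping, list the hosts that report ACTIVE."""
    return [host for host, state in states.items() if state == "ACTIVE"]


# Offline example with hypothetical hosts; no network access is needed here.
sample_states = {
    "rm1.example.com:8088": "ACTIVE",
    "rm2.example.com:8088": "STANDBY",
}
print(find_active(sample_states))
```

On a live cluster you would build the mapping by calling `fetch_ha_state` for each ResourceManager address. A secured cluster (TLS or Kerberos) needs additional client setup that this sketch does not cover.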

Enable high availability

You can enable YARN ResourceManager high availability using Cloudera Manager. When you enable ResourceManager HA in Cloudera Manager, work-preserving recovery is also enabled for the ResourceManager by default.
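After the procedure below completes, one way to confirm both settings is to inspect a deployed `yarn-site.xml`. The following sketch assumes the file uses the upstream Hadoop property names `yarn.resourcemanager.ha.enabled` and `yarn.resourcemanager.work-preserving-recovery.enabled`. Whether a given deployed configuration lists them explicitly is an assumption, and the sample XML is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Upstream Hadoop property names; assumed to appear in the deployed file.
HA_ENABLED = "yarn.resourcemanager.ha.enabled"
WORK_PRESERVING = "yarn.resourcemanager.work-preserving-recovery.enabled"


def read_properties(xml_text: str) -> dict:
    """Parse Hadoop-style <property><name/><value/></property> entries."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def is_true(props: dict, name: str) -> bool:
    """True only if the property is present and set to 'true'."""
    return props.get(name, "").lower() == "true"


# Illustrative sample, not a real deployed configuration.
SAMPLE = """<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
    <value>true</value>
  </property>
</configuration>"""

props = read_properties(SAMPLE)
print("HA enabled:", is_true(props, HA_ENABLED))
print("Work-preserving recovery:", is_true(props, WORK_PRESERVING))
```

A property that is missing from the file is reported as false here, even though the service may fall back to a built-in default, so treat a false result as a prompt to check the configuration in Cloudera Manager.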

  1. In Cloudera Manager, select the YARN service.
  2. Click Actions.
  3. Select Enable High Availability.

    Cloudera Manager displays the hosts that are eligible to run a standby ResourceManager. The host where the current ResourceManager is running is not available as a choice.

  4. Select the host where you want the standby ResourceManager to be installed.
  5. Click Continue.

    Cloudera Manager runs a set of commands that stop the YARN service, add a standby ResourceManager, initialize the ResourceManager high availability state in ZooKeeper, restart YARN, and redeploy the relevant client configurations.

    ResourceManager HA does not affect the JobHistory Server (JHS). The JHS does not maintain any state, so if its host fails you can assign the JHS role to a new host. To enable process auto-restart for the JHS, do the following:

  6. In Cloudera Manager, select the YARN service.
  7. Click the Configuration tab.
  8. Search for restart.
  9. Find the Automatically Restart Process property.
  10. Click Edit Individual Values.
  11. Select the JobHistory Server Default Group option.
  12. Click Save Changes.
  13. Restart the JobHistory Server role.
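The enable-HA steps above can also be scripted. The sketch below only builds the request and does not send it. It assumes the Cloudera Manager REST API exposes an `enableRmHa` service command that accepts `newRmHostId` and `zkServiceName`, so verify the endpoint and field names against the API reference for your Cloudera Manager version. The base URL, API version, cluster name, and host ID are hypothetical.

```python
import json
from urllib.parse import quote


def enable_rm_ha_request(base_url, api_version, cluster, service,
                         new_rm_host_id, zk_service_name=None):
    """Build (url, json_body) for the assumed enableRmHa service command."""
    url = (
        f"{base_url.rstrip('/')}/api/{api_version}"
        f"/clusters/{quote(cluster, safe='')}"
        f"/services/{quote(service, safe='')}"
        "/commands/enableRmHa"
    )
    # Host that will run the standby ResourceManager (step 4 above).
    payload = {"newRmHostId": new_rm_host_id}
    if zk_service_name:
        # ZooKeeper service used to hold the HA state.
        payload["zkServiceName"] = zk_service_name
    return url, json.dumps(payload)


# Hypothetical values for illustration only.
url, body = enable_rm_ha_request(
    "https://cm.example.com:7183", "v41", "Cluster 1", "yarn",
    new_rm_host_id="host-id-123", zk_service_name="zookeeper",
)
print(url)
print(body)
```

Sending the request would be an authenticated POST with a JSON content type. It is left out here so the sketch cannot change a cluster by accident.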

Disable high availability

You can disable YARN ResourceManager high availability using Cloudera Manager.

  1. In Cloudera Manager, select the YARN service.
  2. Click Actions.
  3. Select Disable High Availability.

    Cloudera Manager displays the hosts that are running the ResourceManagers.

  4. Select the host whose ResourceManager you want to keep as the single ResourceManager.
  5. Click Continue.

    Cloudera Manager runs a set of commands that stop the YARN service, remove the standby ResourceManager and the Failover Controller, restart the YARN service, and redeploy client configurations.
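
The disable steps can be sketched the same way. This example assumes the Cloudera Manager REST API exposes a `disableRmHa` service command whose `activeName` argument names the ResourceManager role to keep, so check this against the API reference for your version. The URL, API version, and role name are hypothetical, and the request is built but not sent.

```python
import json
from urllib.parse import quote


def disable_rm_ha_request(base_url, api_version, cluster, service,
                          active_role_name):
    """Build (url, json_body) for the assumed disableRmHa service command."""
    url = (
        f"{base_url.rstrip('/')}/api/{api_version}"
        f"/clusters/{quote(cluster, safe='')}"
        f"/services/{quote(service, safe='')}"
        "/commands/disableRmHa"
    )
    # Role name of the ResourceManager that remains after HA is disabled.
    payload = {"activeName": active_role_name}
    return url, json.dumps(payload)


# Hypothetical values for illustration only.
url, body = disable_rm_ha_request(
    "https://cm.example.com:7183", "v41", "Cluster 1", "yarn",
    active_role_name="yarn-RESOURCEMANAGER-1",
)
print(url)
print(body)
```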