Configuring Fault Tolerance

Deploying the ResourceManager HA Cluster

Update the yarn-site.xml file and the related configuration files, then start ZooKeeper, HDFS, and YARN, in that order.

  1. Copy the etc/hadoop/conf/yarn-site.xml file from the primary ResourceManager host to the standby ResourceManager host.
  2. Make sure that the clientPort value set in etc/zookeeper/conf/zoo.cfg matches the port of the ZooKeeper address configured in yarn-site.xml.
  3. Start ZooKeeper. Execute this command on the ZooKeeper host machines:
    su - zookeeper -c "export ZOOCFGDIR=/usr/hdp/current/zookeeper-server/conf ; export ZOOCFG=zoo.cfg; source /usr/hdp/current/zookeeper-server/conf/ ; /usr/hdp/current/zookeeper-server/bin/ start"
  4. Start HDFS.
  5. Start YARN.
  6. Set the active ResourceManager:

    MANUAL FAILOVER ONLY: If you configured manual ResourceManager failover, you must transition one of the ResourceManagers to Active mode. Execute the following CLI command to transition ResourceManager "rm1" to Active:

    yarn rmadmin -transitionToActive rm1

    You can use the following CLI command to transition ResourceManager "rm1" to Standby mode:

    yarn rmadmin -transitionToStandby rm1 

    AUTOMATIC FAILOVER: If you configured automatic ResourceManager failover, no action is required; the Active ResourceManager is chosen automatically.

  7. Start any remaining cluster services.
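
The port check in step 2 is easy to get wrong by eye, so the following is a minimal sketch of how it could be scripted. The function names zk_client_port and zk_address_uses_port are hypothetical helpers, not part of Hadoop or ZooKeeper. The sketch assumes the ZooKeeper address is a comma-separated list of host:port entries with no chroot suffix, and that you pass in the address value yourself. This page does not name the yarn-site.xml property; in stock Apache Hadoop it is typically yarn.resourcemanager.zk-address, but treat that as an assumption and confirm it against your own configuration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not Hadoop or ZooKeeper tooling.

# Print the clientPort value from a zoo.cfg-style file.
# Commented-out lines are ignored; if the key appears twice, the last one wins.
zk_client_port() {
    sed -n 's/^[[:space:]]*clientPort[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1
}

# Succeed only if every host:port entry in the comma-separated list $1
# ends with the port given in $2. An empty list is treated as a mismatch.
zk_address_uses_port() {
    [ -n "$1" ] || return 1
    for entry in $(printf '%s' "$1" | tr ',' ' '); do
        case "$entry" in
            *:"$2") ;;          # this entry uses the expected port
            *) return 1 ;;      # different port, or no port at all
        esac
    done
    return 0
}
```

A possible use, with the address copied by hand from yarn-site.xml into a variable named zk_address (a placeholder): run zk_address_uses_port "$zk_address" "$(zk_client_port etc/zookeeper/conf/zoo.cfg)" and check the exit status before starting ZooKeeper in step 3. A nonzero status means at least one entry uses a different port than clientPort.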