MapReduce (MRv1) JobTracker High Availability

Follow the instructions in this section to configure high availability (HA) for JobTracker.

Configuring MapReduce (MRv1) JobTracker High Availability Using Cloudera Manager
- Enabling JobTracker High Availability
- Disabling JobTracker High Availability
Configuring MapReduce (MRv1) JobTracker High Availability Using the Command Line
Usage Notes
- Using the JobTracker Web UI
- Turning off Job Recovery

Configuring MapReduce (MRv1) JobTracker High Availability Using Cloudera Manager

Minimum Required Role: Cluster Administrator (also provided by Full Administrator)

You can use Cloudera Manager to configure CDH 4.3 or later for JobTracker high availability (HA). Although it is possible to configure JobTracker HA with CDH 4.2, it is not recommended. Rolling restart, decommissioning of TaskTrackers, and rolling upgrade of MapReduce from CDH 4.2 to CDH 4.3 are not supported when JobTracker HA is enabled.

Cloudera Manager supports automatic failover of the JobTracker. It does not provide a mechanism to manually force a failover through the Cloudera Manager user interface.

Enabling JobTracker High Availability

The Enable High Availability workflow leads you through adding a second (standby) JobTracker:

Go to the MapReduce service.
Select Actions > Enable High Availability. A screen showing the hosts that are eligible to run a standby JobTracker displays. The host where the current JobTracker is running is not available as a choice.
Select the host where you want the Standby JobTracker to be installed, and click Continue.
Enter a directory location on the local filesystem for each JobTracker host. These directories will be used to store job configuration data.
- You may enter more than one directory, though it is not required. The paths do not need to be the same on both JobTracker hosts.
- If the directories you specify do not exist, they will be created with the appropriate permissions. If they already exist, they must be empty and have the appropriate permissions.
- If the directories are not empty, Cloudera Manager will not delete the contents.
Optionally use the checkbox under Advanced Options to force initialize the ZooKeeper znode for auto-failover.
Click Continue. Cloudera Manager executes a set of commands that stop the MapReduce service, add a standby JobTracker and Failover controller, initialize the JobTracker high availability state in ZooKeeper, create the job status directory, restart MapReduce, and redeploy the relevant client configurations.

Disabling JobTracker High Availability

Go to the MapReduce service.
Select Actions > Disable High Availability. A screen showing the hosts running the JobTrackers displays.
Select which JobTracker (host) you want to remain as the single JobTracker, and click Continue. Cloudera Manager executes a set of commands that stop the MapReduce service, remove the standby JobTracker and the Failover Controller, restart the MapReduce service, and redeploy client configurations.

Configuring MapReduce (MRv1) JobTracker High Availability Using the Command Line

If you are running MRv1, you can configure the JobTracker to be highly available. You can configure either manual or automatic failover to a warm-standby JobTracker.

To use the high availability feature, you must create a new configuration. This new configuration is designed such that all the hosts in the cluster can have the same configuration; you do not need to deploy different configuration files to different hosts depending on each host's role in the cluster.

In an HA setup, the mapred.job.tracker property is no longer a host:port string, but instead specifies a logical name to identify JobTracker instances in the cluster (active and standby). Each distinct JobTracker in the cluster has a different JobTracker ID. To support a single configuration file for all of the JobTrackers, the relevant configuration parameters are suffixed with the JobTracker logical name as well as the JobTracker ID.

The HA JobTracker is packaged separately from the original (non-HA) JobTracker.

JobTracker HA reuses the mapred.job.tracker parameter in mapred-site.xml to identify a JobTracker active-standby pair. In addition, you must enable the existing mapred.jobtracker.restart.recover, mapred.job.tracker.persist.jobstatus.active, and mapred.job.tracker.persist.jobstatus.hours parameters, as well as a number of new parameters, as discussed below.

Use the sections that follow to install, configure and test JobTracker HA.

Replacing the non-HA JobTracker with the HA JobTracker

This section provides instructions for removing the non-HA JobTracker and installing the HA JobTracker.

Removing the non-HA JobTracker

You must remove the original (non-HA) JobTracker before you install and run the HA JobTracker. First, you need to stop the JobTracker and TaskTrackers.

To stop the JobTracker and TaskTrackers:

Stop the TaskTrackers: On each TaskTracker system:

$ sudo service hadoop-0.20-mapreduce-tasktracker stop

Stop the JobTracker: On the JobTracker system:

$ sudo service hadoop-0.20-mapreduce-jobtracker stop

Verify that the JobTracker and TaskTrackers have stopped:
```
$ ps -eaf | grep -i job
$ ps -eaf | grep -i task
```

To remove the JobTracker:

On Red Hat-compatible systems:

$ sudo yum remove hadoop-0.20-mapreduce-jobtracker

On SLES systems:

$ sudo zypper remove hadoop-0.20-mapreduce-jobtracker

On Ubuntu systems:

sudo apt-get remove hadoop-0.20-mapreduce-jobtracker

Installing the HA JobTracker

Use the following steps to install the HA JobTracker package, and optionally the ZooKeeper failover controller package (needed for automatic failover).

Step 1: Install the HA JobTracker package on two separate nodes

On each JobTracker node:

On Red Hat-compatible systems:

$ sudo yum install hadoop-0.20-mapreduce-jobtrackerha

On SLES systems:

$ sudo zypper install hadoop-0.20-mapreduce-jobtrackerha

On Ubuntu systems:

sudo apt-get install hadoop-0.20-mapreduce-jobtrackerha

Step 2: (Optionally) install the failover controller package

If you intend to enable automatic failover, you need to install the failover controller package.

Install the failover controller package as follows:

On each JobTracker node:

On Red Hat-compatible systems:

$ sudo yum install hadoop-0.20-mapreduce-zkfc

On SLES systems:

$ sudo zypper install hadoop-0.20-mapreduce-zkfc

On Ubuntu systems:

sudo apt-get install hadoop-0.20-mapreduce-zkfc

Configuring and Deploying Manual Failover

Proceed as follows to configure manual failover:

Step 1: Configure the JobTrackers, TaskTrackers, and Clients

Changes to existing configuration parameters

Property name	Default	Used on	Description
`mapred.job.tracker`	`local`	JobTracker, TaskTracker, client	In an HA setup, the logical name of the JobTracker active-standby pair. In a non-HA setup `mapred.job.tracker` is a `host:port` string specifying the JobTracker's RPC address, but in an HA configuration the logical name must not include a port number.
`mapred.jobtracker.restart.` `recover`	`false`	JobTracker	Whether to recover jobs that were running in the most recent active JobTracker. Must be set to `true` for JobTracker HA.
`mapred.job.tracker.persist.` `jobstatus.active`	`false`	JobTracker	Whether to make job status persistent in HDFS. Must be set to `true` for JobTracker HA.
`mapred.job.tracker.persist.` `jobstatus.hours`	`0`	JobTracker	The number of hours job status information is retained in HDFS. Must be greater than zero for JobTracker HA.
`mapred.job.tracker.persist.` `jobstatus.dir`	`/jobtracker/jobsInfo`	JobTracker	The HDFS directory in which job status information is kept persistently. The directory must exist and be owned by the `mapred` user.

New configuration parameters

Property name	Default	Used on	Description
`mapred.jobtrackers.<name>`	None	JobTracker, TaskTracker, client	A comma-separated pair of IDs for the active and standby JobTrackers. The `<name>` is the value of `mapred.job.tracker`.
`mapred.jobtracker.rpc-` `address.<name>.<id>`	None	JobTracker, TaskTracker, client	The RPC address of an individual JobTracker. `<name>` refers to the value of `mapred.job.tracker`; `<id>` refers to one or other of the values in `mapred.jobtrackers.<name>`.
`mapred.job.tracker.http.` `address.<name>.<id>`	None	JobTracker, TaskTracker	The HTTP address of an individual JobTracker. (In a non-HA setup `mapred.job.tracker.http.address` (with no suffix) is the JobTracker's HTTP address.)
`mapred.ha.jobtracker.` `rpc-address.<name>.<id>`	None	JobTracker, failover controller	The RPC address of the HA service protocol for the JobTracker. The JobTracker listens on a separate port for HA operations which is why this property exists in addition to `mapred.jobtracker.rpc-address.<name>.<id>`.
`mapred.ha.jobtracker.` `http-redirect-address.<name>.<id>`	None	JobTracker	The HTTP address of an individual JobTracker that should be used for HTTP redirects. The standby JobTracker will redirect all web traffic to the active, and will use this property to discover the URL to use for redirects. A property separate from `mapred.job.tracker.http.` `address.<name>.<id>` is needed since the latter may be a wildcard bind address, such as `0.0.0.0:50030`, which is not suitable for making requests. Note also that `mapred.ha.jobtracker.http-redirect-address.<name>.<id>` is the HTTP redirect address for the JobTracker with ID `<id>` for the pair with the logical name `<name>` - that is, the address that should be used when that JobTracker is active, and not the address that should be redirected to when that JobTracker is the standby.
`mapred.ha.jobtracker.id`	None	JobTracker	The identity of this JobTracker instance. Note that this is optional since each JobTracker can infer its ID from the matching address in one of the `mapred.jobtracker.rpc-address.` `<name>.<id>` properties. It is provided for testing purposes.
`mapred.client.failover.` `proxy.provider.<name>`	None	TaskTracker, client	The failover provider class. The only class available is `org.apache.hadoop.mapred.` `ConfiguredFailoverProxyProvider`.
`mapred.client.failover.` `max.attempts`	15	TaskTracker, client	The maximum number of times to try to fail over.
`mapred.client.failover.` `sleep.base.millis`	500	TaskTracker, client	The time to wait before the first failover.
`mapred.client.failover.` `sleep.max.millis`	1500	TaskTracker, client	The maximum amount of time to wait between failovers (for exponential backoff).
`mapred.client.failover.` `connection.retries`	0	TaskTracker, client	The maximum number of times to retry between failovers.
`mapred.client.failover.` `connection.retries.on.` `timeouts`	0	TaskTracker, client	The maximum number of times to retry on timeouts between failovers.
`mapred.ha.fencing.methods`	None	failover controller	A list of scripts or Java classes that will be used to fence the active JobTracker during failover. Only one JobTracker should be active at any given time, but you can simply configure `mapred.ha.fencing.methods` as `shell(/bin/true)` since the JobTrackers fence themselves, and split-brain is avoided by the old active JobTracker shutting itself down if another JobTracker takes over.

Make changes and additions similar to the following to mapred-site.xml on each node.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

  <property>
    <name>mapred.job.tracker</name>
    <value>logicaljt</value> 
    <!-- host:port string is replaced with a logical name -->
  </property>

  <property>
    <name>mapred.jobtrackers.logicaljt</name>
    <value>jt1,jt2</value>
    <description>Comma-separated list of JobTracker IDs.</description>
  </property>

  <property>
    <name>mapred.jobtracker.rpc-address.logicaljt.jt1</name> 
    <!-- RPC address for jt1 -->
    <value>myjt1.myco.com:8021</value>
  </property>

  <property>
    <name>mapred.jobtracker.rpc-address.logicaljt.jt2</name> 
    <!-- RPC address for jt2 -->
    <value>myjt2.myco.com:8022</value>
  </property>

 <property>
    <name>mapred.job.tracker.http.address.logicaljt.jt1</name> 
    <!-- HTTP bind address for jt1 -->
    <value>0.0.0.0:50030</value>
  </property>

  <property>
    <name>mapred.job.tracker.http.address.logicaljt.jt2</name> 
    <!-- HTTP bind address for jt2 -->
    <value>0.0.0.0:50031</value>
  </property>

  <property>
    <name>mapred.ha.jobtracker.rpc-address.logicaljt.jt1</name> 
    <!-- RPC address for jt1 HA daemon -->
    <value>myjt1.myco.com:8023</value>
  </property>

  <property>
    <name>mapred.ha.jobtracker.rpc-address.logicaljt.jt2</name> 
    <!-- RPC address for jt2 HA daemon -->
    <value>myjt2.myco.com:8024</value>
  </property>


  <property>
    <name>mapred.ha.jobtracker.http-redirect-address.logicaljt.jt1</name> 
    <!-- HTTP redirect address for jt1 -->
    <value>myjt1.myco.com:50030</value>
  </property>

  <property>
    <name>mapred.ha.jobtracker.http-redirect-address.logicaljt.jt2</name> 
    <!-- HTTP redirect address for jt2 -->
    <value>myjt2.myco.com:50031</value>
  </property>

 <property>
    <name>mapred.jobtracker.restart.recover</name>
    <value>true</value>
  </property>

  <property>
    <name>mapred.job.tracker.persist.jobstatus.active</name>
    <value>true</value>
  </property>

 <property>
    <name>mapred.job.tracker.persist.jobstatus.hours</name>
    <value>1</value>
  </property>

  <property>
    <name>mapred.job.tracker.persist.jobstatus.dir</name>
    <value>/jobtracker/jobsInfo</value>
  </property>

  <property>
    <name>mapred.client.failover.proxy.provider.logicaljt</name>
    <value>org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider</value>
  </property>

  <property>
    <name>mapred.client.failover.max.attempts</name>
    <value>15</value>
  </property>

  <property>
    <name>mapred.client.failover.sleep.base.millis</name>
    <value>500</value>
  </property>

  <property>
    <name>mapred.client.failover.sleep.max.millis</name>
    <value>1500</value>  
  </property>

 <property>
    <name>mapred.client.failover.connection.retries</name>
    <value>0</value>  
  </property>

 <property>
    <name>mapred.client.failover.connection.retries.on.timeouts</name>
    <value>0</value>  
  </property>
 <property>
    <name>mapred.ha.fencing.methods</name>
    <value>shell(/bin/true)</value>
 </property>
</configuration>

Step 2: Start the JobTracker daemons

To start the daemons, run the following command on each JobTracker node:

$ sudo service hadoop-0.20-mapreduce-jobtrackerha start

Step 3: Activate a JobTracker

Unless automatic failover is configured, both JobTrackers will be in a standby state after the jobtrackerha daemons start up.

If Kerberos is not enabled, use the following commands:

To find out what state each JobTracker is in:

$ sudo -u mapred hadoop mrhaadmin -getServiceState <id>

where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our sample mapred-site.xml files.

To transition one of the JobTrackers to active and then verify that it is active:

$ sudo -u mapred hadoop mrhaadmin -transitionToActive <id>
$ sudo -u mapred hadoop mrhaadmin -getServiceState <id>

where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our sample mapred-site.xml files.

With Kerberos enabled, log in as the mapred user and use the following commands:

To log in as the mapred user and kinit:

$ sudo su - mapred
$ kinit -kt mapred.keytab mapred/<fully.qualified.domain.name>

To find out what state each JobTracker is in:

$ hadoop mrhaadmin -getServiceState <id>

where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our sample mapred-site.xml files.

To transition one of the JobTrackers to active and then verify that it is active:

$ hadoop mrhaadmin -transitionToActive <id>
$ hadoop mrhaadmin -getServiceState <id>

where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our sample mapred-site.xml files.

Step 4: Verify that failover is working

Use the following commands, depending whether or not Kerberos is enabled.

If Kerberos is not enabled, use the following commands:

To cause a failover from the currently active to the currently inactive JobTracker:

$ sudo -u mapred hadoop mrhaadmin -failover <id_of_active_JobTracker> <id_of_inactive_JobTracker>

For example, if jt1 is currently active:

$ sudo -u mapred hadoop mrhaadmin -failover jt1 jt2

To verify the failover:

$ sudo -u mapred hadoop mrhaadmin -getServiceState <id>

For example, if jt2 should now be active:

$ sudo -u mapred hadoop mrhaadmin -getServiceState jt2

With Kerberos enabled, use the following commands:

To log in as the mapred user and kinit:

$ sudo su - mapred
$ kinit -kt mapred.keytab mapred/<fully.qualified.domain.name>

To cause a failover from the currently active to the currently inactive JobTracker:

$ hadoop mrhaadmin -failover <id_of_active_JobTracker> <id_of_inactive_JobTracker>

For example, if jt1 is currently active:

$ hadoop mrhaadmin -failover jt1 jt2

To verify the failover:

$ hadoop mrhaadmin -getServiceState <id>

For example, if jt2 should now be active:

$ hadoop mrhaadmin -getServiceState jt2

Configuring and Deploying Automatic Failover

To configure automatic failover, proceed as follows:

Step 1: Configure a ZooKeeper ensemble (if necessary)

To support automatic failover you need to set up a ZooKeeper ensemble running on three or more nodes, and verify its correct operation by connecting using the ZooKeeper command-line interface (CLI). See the ZooKeeper documentation for instructions on how to set up a ZooKeeper ensemble.

Step 2: Configure the parameters for manual failover

See the instructions for configuring the TaskTrackers and JobTrackers under Configuring and Deploying Manual Failover.

Step 3: Configure failover controller parameters

Use the following additional parameters to configure a failover controller for each JobTracker. The failover controller daemons run on the JobTracker nodes.

New configuration parameters

Property name	Default	Configure on	Description
`mapred.ha.automatic-failover.enabled`	`false`	failover controller	Set to `true` to enable automatic failover.
`mapred.ha.zkfc.port`	8019	failover controller	The ZooKeeper failover controller port.
`ha.zookeeper.quorum`	None	failover controller	The ZooKeeper quorum (ensemble) to use for MRZKFailoverController.

Add the following configuration information to mapred-site.xml:

  <property>
    <name>mapred.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>

  <property>
    <name>mapred.ha.zkfc.port</name>
    <value>8018</value> 
    <!-- Pick a different port for each failover controller when running one machine -->
  </property>

Add an entry similar to the following to core-site.xml:

<property>
    <name>ha.zookeeper.quorum</name>
    <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181 </value> 
    <!-- ZK ensemble addresses -->
</property>

Step 4: Initialize the HA State in ZooKeeper

After you have configured the failover controllers, the next step is to initialize the required state in ZooKeeper. You can do so by running one of the following commands from one of the JobTracker nodes.

Note: The ZooKeeper ensemble must be running when you use this command; otherwise it will not work properly.

$ sudo service hadoop-0.20-mapreduce-zkfc init

$ sudo -u mapred hadoop mrzkfc -formatZK

This will create a znode in ZooKeeper in which the automatic failover system stores its data.

Step 5: Enable automatic failover

To enable automatic failover once you have completed the configuration steps, you need only start the jobtrackerha and zkfc daemons.

To start the daemons, run the following commands on each JobTracker node:

$ sudo service hadoop-0.20-mapreduce-zkfc start
$ sudo service hadoop-0.20-mapreduce-jobtrackerha start

One of the JobTrackers will automatically transition to active.

Step 6: Verify automatic failover

After enabling automatic failover, you should test its operation. To do so, first locate the active JobTracker. To find out what state each JobTracker is in, use the following command:

$ sudo -u mapred hadoop mrhaadmin -getServiceState <id>

where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our sample mapred-site.xml files.

Once you have located your active JobTracker, you can cause a failure on that node. For example, you can use kill -9 <pid of JobTracker> to simulate a JVM crash. Or you can power-cycle the machine or its network interface to simulate different kinds of outages. After you trigger the outage you want to test, the other JobTracker should automatically become active within several seconds. The amount of time required to detect a failure and trigger a failover depends on the configuration of ha.zookeeper.session-timeout.ms, but defaults to 5 seconds.

If the test does not succeed, you may have a misconfiguration. Check the logs for the zkfc and jobtrackerha daemons to diagnose the problem.

Work Preserving Recovery for YARN Components

Usage Notes