Configuring JobTracker High Availability
Follow the instructions in this section to configure high availability (HA) for the JobTracker.
JobTracker HA reuses the mapred.job.tracker parameter in mapred-site.xml to identify a JobTracker active-standby pair. In addition, you must enable the existing mapred.jobtracker.restart.recover, mapred.job.tracker.persist.jobstatus.active, and mapred.job.tracker.persist.jobstatus.hours parameters, as well as a number of new parameters, as discussed below.
Configuring and Deploying Manual Failover
Proceed as follows to configure manual failover:
- Configure the JobTrackers, TaskTrackers, and Clients
- Start the JobTrackers
- Activate a JobTracker
- Verify that failover is working
Step 1: Configure the JobTrackers, TaskTrackers, and clients
Changes to existing configuration parameters
Property name |
Default |
Used on |
Description |
---|---|---|---|
mapred.job.tracker |
local |
JobTracker, TaskTracker, client |
In an HA setup, the logical name of the JobTracker active-standby pair. In a non-HA setup mapred.job.tracker is a host:port string specifying the JobTracker's RPC address, but in an HA configuration the logical name must not include a port number. |
mapred.jobtracker.restart. recover |
false |
JobTracker |
Whether to recover jobs that were running in the most recent active JobTracker. Must be set to true for JobTracker HA. |
mapred.job.tracker.persist. jobstatus.active |
false |
JobTracker |
Whether to make job status persistent in HDFS. Must be set to true for JobTracker HA. |
mapred.job.tracker.persist. jobstatus.hours |
0 |
JobTracker |
The number of hours job status information is retained in HDFS. Must be greater than zero for JobTracker HA. |
mapred.job.tracker.persist. jobstatus.dir |
/jobtracker/jobsInfo |
JobTracker |
The HDFS directory in which job status information is kept persistently. The directory must exist and be owned by the mapred user. |
New configuration parameters
Property name |
Default |
Used on |
Description |
---|---|---|---|
mapred.jobtrackers.<name> |
None |
JobTracker, TaskTracker, client |
A comma-separated pair of IDs for the active and standby jobtrackers. The <name> is the value of mapred.job.tracker. |
mapred.jobtracker.rpc- address.<name>.<id> |
None |
JobTracker, TaskTracker, client |
The RPC address of an individual JobTracker. <name> refers to the value of mapred.job.tracker; <id> refers to one or other of the values in mapred.jobtrackers.<name>. |
mapred.job.tracker.http. address.<name>.<id> |
None |
JobTracker, TaskTracker |
The HTTP address of an individual JobTracker. (In a non-HA setup mapred.job.tracker.http.address (with no suffix) is the JobTracker's HTTP address.) |
mapred.ha.jobtracker. rpc-address.<name>.<id> |
None |
JobTracker, failover controller |
The RPC address of the HA service protocol for the JobTracker. The JobTracker listens on a separate port for HA operations which is why this property exists in addition to mapred.jobtracker.rpc-address.<name>.<id>. |
mapred.ha.jobtracker. http-redirect-address.<name>.<id> |
None |
JobTracker |
The HTTP address of an individual JobTracker that should be used for HTTP redirects. The standby JobTracker will redirect all web traffic to the active, and will use this property to discover the URL to use for redirects. A property separate from mapred.job.tracker.http. address.<name>.<id> is needed since the latter may be a wildcard bind address, such as 0.0.0.0:50030, which is not suitable for making requests. Note also that mapred.ha.jobtracker.http-redirect-address.<name>.<id> is the HTTP redirect address for the JobTracker with ID <id> for the pair with the logical name <name> - that is, the address that should be used when that JobTracker is active, and not the address that should be redirected to when that JobTracker is the standby. |
mapred.ha.jobtracker.id |
None |
JobTracker |
The identity of this JobTracker instance. Note that this is optional since each JobTracker can infer its ID from the matching address in one of the mapred.jobtracker.rpc-address. <name>.<id> properties. It is provided for testing purposes. |
mapred.client.failover. proxy.provider.<name> |
None |
TaskTracker, client |
The failover provider class. The only class available is org.apache.hadoop.mapred. ConfiguredFailoverProxyProvider. |
mapred.client.failover. max.attempts |
15 |
TaskTracker, client |
The maximum number of times to try to fail over. |
mapred.client.failover. sleep.base.millis |
500 |
TaskTracker, client |
The time to wait before the first failover. |
mapred.client.failover. sleep.max.millis |
1500 |
TaskTracker, client |
The maximum amount of time to wait between failovers (for exponential backoff). |
mapred.client.failover. connection.retries |
0 |
TaskTracker, client |
The maximum number of times to retry between failovers. |
mapred.client.failover. connection.retries.on. timeouts |
0 |
TaskTracker, client |
The maximum number of times to retry on timeouts between failovers. |
mapred.ha.fencing.methods |
None |
failover controller |
A list of scripts or Java classes that will be used to fence the active JobTracker during failover. Only one JobTracker should be active at any given time, but you can simply configure mapred.ha.fencing.methods as shell(/bin/true) since the JobTrackers fence themselves, and split-brain is avoided by the old active JobTracker shutting itself down if another JobTracker takes over. |
Make changes and additions similar to the following to mapred-site.xml on each node.
<?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Put site-specific property overrides in this file. --> <configuration> <property> <name>mapred.job.tracker</name> <value>logicaljt</value> <!-- host:port string is replaced with a logical name --> </property> <property> <name>mapred.jobtrackers.logicaljt</name> <value>jt1,jt2</value> <description>Comma-separated list of JobTracker IDs.</description> </property> <property> <name>mapred.jobtracker.rpc-address.logicaljt.jt1</name> <!-- RPC address for jt1 --> <value>myjt1.myco.com:8021</value> </property> <property> <name>mapred.jobtracker.rpc-address.logicaljt.jt2</name> <!-- RPC address for jt2 --> <value>myjt2.myco.com:8022</value> </property> <property> <name>mapred.job.tracker.http.address.logicaljt.jt1</name> <!-- HTTP bind address for jt1 --> <value>0.0.0.0:50030</value> </property> <property> <name>mapred.job.tracker.http.address.logicaljt.jt2</name> <!-- HTTP bind address for jt2 --> <value>0.0.0.0:50031</value> </property> <property> <name>mapred.ha.jobtracker.rpc-address.logicaljt.jt1</name> <!-- RPC address for jt1 HA daemon --> <value>myjt1.myco.com:8023</value> </property> <property> <name>mapred.ha.jobtracker.rpc-address.logicaljt.jt2</name> <!-- RPC address for jt2 HA daemon --> <value>myjt2.myco.com:8024</value> </property> <property> <name>mapred.ha.jobtracker.http-redirect-address.logicaljt.jt1</name> <!-- HTTP redirect address for jt1 --> <value>myjt1.myco.com:50030</value> </property> <property> <name>mapred.ha.jobtracker.http-redirect-address.logicaljt.jt2</name> <!-- HTTP redirect address for jt2 --> <value>myjt2.myco.com:50031</value> </property> <property> <name>mapred.jobtracker.restart.recover</name> <value>true</value> </property> <property> <name>mapred.job.tracker.persist.jobstatus.active</name> <value>true</value> </property> <property> <name>mapred.job.tracker.persist.jobstatus.hours</name> <value>1</value> </property> <property> <name>mapred.job.tracker.persist.jobstatus.dir</name> <value>/jobtracker/jobsInfo</value> </property> <property> <name>mapred.client.failover.proxy.provider.logicaljt</name> <value>org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider</value> </property> <property> <name>mapred.client.failover.max.attempts</name> <value>15</value> </property> <property> <name>mapred.client.failover.sleep.base.millis</name> <value>500</value> </property> <property> <name>mapred.client.failover.sleep.max.millis</name> <value>1500</value> </property> <property> <name>mapred.client.failover.connection.retries</name> <value>0</value> </property> <property> <name>mapred.client.failover.connection.retries.on.timeouts</name> <value>0</value> </property> <property> <name>mapred.ha.fencing.methods</name> <value>shell(/bin/true)</value> </property> </configuration>
In pseudo-distributed mode you need to specify mapred.ha.jobtracker.id for each JobTracker, so that the JobTracker knows its identity.
But in a fully-distributed setup, where the JobTrackers run on different nodes, there is no need to set mapred.ha.jobtracker.id, since the JobTracker can infer the ID from the matching address in one of the mapred.jobtracker.rpc-address.<name>.<id> properties.
Step 2: Start the JobTracker daemons
To start the daemons, run the following command on each JobTracker node:
$ sudo service hadoop-0.20-mapreduce-jobtrackerha start
Step 3: Activate a JobTracker
- You must be the mapred user to use mrhaadmin commands.
- If Kerberos is enabled, do not use sudo -u mapred when using the hadoop mrhaadmin command. Instead, you must log in with the mapred Kerberos credentials (the short name must be mapred). See Configuring Hadoop Security in CDH 5 for more information.
Unless automatic failover is configured, both JobTrackers will be in a standby state after the jobtrackerha daemons start up.
If Kerberos is not enabled, use the following commands:
To find out what state each JobTracker is in:
$ sudo -u mapred hadoop mrhaadmin -getServiceState <id>
where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our sample mapred-site.xml files.
To transition one of the JobTrackers to active and then verify that it is active:
$ sudo -u mapred hadoop mrhaadmin -transitionToActive <id> $ sudo -u mapred hadoop mrhaadmin -getServiceState <id>
where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our sample mapred-site.xml files.
With Kerberos enabled, log in as the mapred user and use the following commands:
To log in as the mapred user and kinit:
$ sudo su - mapred $ kinit -kt mapred.keytab mapred/<fully.qualified.domain.name>
To find out what state each JobTracker is in:
$ hadoop mrhaadmin -getServiceState <id>
where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our sample mapred-site.xml files.
To transition one of the JobTrackers to active and then verify that it is active:
$ hadoop mrhaadmin -transitionToActive <id> $ hadoop mrhaadmin -getServiceState <id>
where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our sample mapred-site.xml files.
Step 4: Verify that failover is working
Use the following commands, depending whether or not Kerberos is enabled.
If Kerberos is not enabled, use the following commands:
To cause a failover from the currently active to the currently inactive JobTracker:
$ sudo -u mapred hadoop mrhaadmin -failover <id_of_active_JobTracker> <id_of_inactive_JobTracker>
For example, if jt1 is currently active:
$ sudo -u mapred hadoop mrhaadmin -failover jt1 jt2
To verify the failover:
$ sudo -u mapred hadoop mrhaadmin -getServiceState <id>
For example, if jt2 should now be active:
$ sudo -u mapred hadoop mrhaadmin -getServiceState jt2
With Kerberos enabled, use the following commands:
To log in as the mapred user and kinit:
$ sudo su - mapred $ kinit -kt mapred.keytab mapred/<fully.qualified.domain.name>
To cause a failover from the currently active to the currently inactive JobTracker:
$ hadoop mrhaadmin -failover <id_of_active_JobTracker> <id_of_inactive_JobTracker>
For example, if jt1 is currently active:
$ hadoop mrhaadmin -failover jt1 jt2
To verify the failover:
$ hadoop mrhaadmin -getServiceState <id>
For example, if jt2 should now be active:
$ hadoop mrhaadmin -getServiceState jt2
Configuring and Deploying Automatic Failover
To configure automatic failover, proceed as follows:
- Configure a ZooKeeper ensemble (if necessary)
- Configure parameters for manual failover
- Configure failover controller parameters
- Initialize the HA state in ZooKeeper
- Enable automatic failover
- Verify automatic failover
Step 1: Configure a ZooKeeper ensemble (if necessary)
To support automatic failover you need to set up a ZooKeeper ensemble running on three or more nodes, and verify its correct operation by connecting using the ZooKeeper command-line interface (CLI). See the ZooKeeper documentation for instructions on how to set up a ZooKeeper ensemble.
If you are already using a ZooKeeper ensemble for automatic NameNode failover, use the same ensemble for automatic JobTracker failover.
Step 2: Configure the parameters for manual failover
See the instructions for configuring the TaskTrackers and JobTrackers under Configuring and Deploying Manual Failover.
Step 3: Configure failover controller parameters
Use the following additional parameters to configure a failover controller for each JobTracker. The failover controller daemons run on the JobTracker nodes.
New configuration parameters
Property name |
Default |
Configure on |
Description |
---|---|---|---|
mapred.ha.automatic-failover.enabled |
false |
failover controller |
Set to true to enable automatic failover. |
mapred.ha.zkfc.port |
8019 |
failover controller |
The ZooKeeper failover controller port. |
ha.zookeeper.quorum |
None |
failover controller |
The ZooKeeper quorum (ensemble) to use for MRZKFailoverController. |
Add the following configuration information to mapred-site.xml:
<property> <name>mapred.ha.automatic-failover.enabled</name> <value>true</value> </property> <property> <name>mapred.ha.zkfc.port</name> <value>8018</value> <!-- Pick a different port for each failover controller when running one machine --> </property>
Add an entry similar to the following to core-site.xml:
<property> <name>ha.zookeeper.quorum</name> <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181 </value> <!-- ZK ensemble addresses --> </property>
Step 4: Initialize the HA State in ZooKeeper
After you have configured the failover controllers, the next step is to initialize the required state in ZooKeeper. You can do so by running one of the following commands from one of the JobTracker nodes.
The ZooKeeper ensemble must be running when you use this command; otherwise it will not work properly.
$ sudo service hadoop-0.20-mapreduce-zkfc init
or
$ sudo -u mapred hadoop mrzkfc -formatZK
This will create a znode in ZooKeeper in which the automatic failover system stores its data.
If you are running a secure cluster, see also Securing access to ZooKeeper.
Step 5: Enable automatic failover
To enable automatic failover once you have completed the configuration steps, you need only start the jobtrackerha and zkfc daemons.
To start the daemons, run the following commands on each JobTracker node:
$ sudo service hadoop-0.20-mapreduce-zkfc start $ sudo service hadoop-0.20-mapreduce-jobtrackerha start
One of the JobTrackers will automatically transition to active.
Step 6: Verify automatic failover
After enabling automatic failover, you should test its operation. To do so, first locate the active JobTracker. To find out what state each JobTracker is in, use the following command:
You must be the mapred user to use mrhaadmin commands.
$ sudo -u mapred hadoop mrhaadmin -getServiceState <id>
where <id> is one of the values you configured in the mapred.jobtrackers.<name> property – jt1 or jt2 in our sample mapred-site.xml files.
Once you have located your active JobTracker, you can cause a failure on that node. For example, you can use kill -9 <pid of JobTracker> to simulate a JVM crash. Or you can power-cycle the machine or its network interface to simulate different kinds of outages. After you trigger the outage you want to test, the other JobTracker should automatically become active within several seconds. The amount of time required to detect a failure and trigger a failover depends on the configuration of ha.zookeeper.session-timeout.ms, but defaults to 5 seconds.
If the test does not succeed, you may have a misconfiguration. Check the logs for the zkfc and jobtrackerha daemons to diagnose the problem.
<< Replacing the non-HA JobTracker with the HA JobTracker | Usage Notes >> | |