This is the documentation for CDH 5.1.x. Documentation for other versions is available at Cloudera Documentation.

Administering an HDFS High Availability Cluster

Using the haadmin command

Now that your HA NameNodes are configured and started, you will have access to some additional commands to administer your HA HDFS cluster. Specifically, you should familiarize yourself with the subcommands of the hdfs haadmin command.

This page describes high-level uses of some important subcommands. For specific usage information of each subcommand, you should run hdfs haadmin -help <command>.

failover - initiate a failover between two NameNodes

This subcommand causes a failover from the first provided NameNode to the second. If the first NameNode is in the Standby state, this command simply transitions the second to the Active state without error. If the first NameNode is in the Active state, an attempt will be made to gracefully transition it to the Standby state. If this fails, the fencing methods (as configured by dfs.ha.fencing.methods) will be attempted in order until one of the methods succeeds. Only after this process will the second NameNode be transitioned to the Active state. If no fencing method succeeds, the second NameNode will not be transitioned to the Active state, and an error will be returned.
  Note:

Running hdfs haadmin -failover from the command line works whether you have configured HA from the command line (as described in this document) or via Cloudera Manager. This means you can initiate a failover manually even if Cloudera Manager is unavailable.

getServiceState

getServiceState - determine whether the given NameNode is Active or Standby

Connect to the provided NameNode to determine its current state, printing either "standby" or "active" to STDOUT as appropriate. This subcommand might be used by cron jobs or monitoring scripts which need to behave differently based on whether the NameNode is currently Active or Standby.

checkHealth

checkHealth - check the health of the given NameNode

Connect to the provided NameNode to check its health. The NameNode is capable of performing some diagnostics on itself, including checking if internal services are running as expected. This command will return 0 if the NameNode is healthy, non-zero otherwise. One might use this command for monitoring purposes.

  Note:

The checkHealth command is not yet implemented, and at present will always return success, unless the given NameNode is completely down.

Using the dfsadmin command when HA is enabled

When you use the dfsadmin command with HA enabled, you should use the -fs option to specify a particular NameNode using the RPC address, or service RPC address, of the NameNode. Not all operations are permitted on a standby NameNode. If the specific NameNode is left unspecified, only the operations to set quotas (-setQuota, -clrQuota, -setSpaceQuota, -clrSpaceQuota), report basic file system information (-report), and check upgrade progress (-upgradeProgress) will failover and perform the requested operation on the active NameNode. The "refresh" options (-refreshNodes, -refreshServiceAcl, -refreshUserToGroupsMappings, and -refreshSuperUserGroupsConfiguration) must be run on both the active and standby NameNodes.

Moving an HA NameNode to a New Host

Use the following steps to move one of the NameNodes to a new host.

In this example, the current NameNodes are called nn1 and nn2, and the new NameNode is nn2-alt. The example assumes that nn2-alt is already a member of this CDH 5 HA cluster, that automatic failover is configured and that a JournalNode on nn2 is to be moved to nn2-alt, in addition to NameNode service itself.

The procedure moves the NameNode and JournalNode services from nn2 to nn2-alt, reconfigures nn1 to recognize the new location of the JournalNode, and restarts nn1 and nn2-alt in the new HA configuration.

Step 1: Make sure that nn1 is the active NameNode

Make sure that the NameNode that is not going to be moved is active; in this example, nn1 must be active. You can use the NameNodes' web UIs to see which is active; see Start the NameNodes.

If nn1 is not the active NameNode, use the hdfs haadmin -failover command to initiate a failover from nn2 to nn1:
hdfs haadmin -failover nn2 nnhdfs haadmin -failover nn2 nn1

Step 2: Stop services on nn2

Once you've made sure that the node to be moved is inactive, stop services on that node: in this example, stop services on nn2. Stop the NameNode, the ZKFC daemon if this an automatic-failover deployment, and the JournalNode if you are moving it. Proceed as follows.
  1. Stop the NameNode daemon:
    $ sudo service hadoop-hdfs-namenode stop
  2. Stop the ZKFC daemon if it is running:
    $ sudo service hadoop-hdfs-zkfc stop
  3. Stop the JournalNode daemon if it is running:
    $ sudo service hadoop-hdfs-journalnode stop 
  4. Make sure these services are not set to restart on boot. If you are not planning to use nn2 as a NameNode again, you may want remove the services.

Step 3: Install the NameNode daemon on nn2-alt

See the instructions for installing hadoop-hdfs-namenode in the CDH 5 Installation Guide under Step 3: Install CDH 5 with YARN or Step 4: Install CDH 5 with MRv1.

Step 4: Configure HA on nn2-alt

See Configuring Software for HDFS HA for the properties to configure on nn2-alt in core-site.xml and hdfs-site.xml, and explanations and instructions. You should copy the values that are already set in the corresponding files on nn2.
  • If you are relocating a JournalNode to nn2-alt, follow these directions to install it, but don't start it yet.
  • If you are using automatic failover, make sure you follow the instructions for configuring the necessary properties on nn2-alt and initializing the HA state in Zookeeper.
      Note:

    You do not need to shut down the cluster to do this if automatic failover is already configured as your failover method; shutdown is required only if you are switching from manual to automatic failover.

Step 5: Copy the contents of the dfs.name.dir and dfs.journalnode.edits.dir directories to nn2-alt

Use rsync or a similar tool to copy the contents of the dfs.name.dir directory, and the dfs.journalnode.edits.dir directory if you are moving the JournalNode, from nn2 to nn2-alt.

Step 6: If you are moving a JournalNode, update dfs.namenode.shared.edits.dir on nn1

If you are relocating a JournalNode from nn2 to nn2-alt, update dfs.namenode.shared.edits.dir in hdfs-site.xml on nn1 to reflect the new hostname. See tthis section for more information about dfs.namenode.shared.edits.dir.

Step 7: If you are using automatic failover, install the zkfc daemon on nn2-alt

For instructions, see Deploy Automatic Failover (if it is configured), but do not start the daemon yet.

Step 8: Start services on nn2-alt

Start the NameNode; start the ZKFC for automatic failover; and install and start a JournalNode if you want one to run on nn2-alt. Proceed as follows.

  1. Start the JournalNode daemon:
    $ sudo service hadoop-hdfs-journalnode start 
  2. Start the NameNode daemon:
    $ sudo service hadoop-hdfs-namenode start
  3. Start the ZKFC daemon:
    $ sudo service hadoop-hdfs-zkfc start
  4. Set these services to restart on boot; for example on a RHEL-compatible system:
    $ sudo chkconfig hadoop-hdfs-namenode on
    $ sudo chkconfig hadoop-hdfs-zkfc on
    $ sudo chkconfig hadoop-hdfs-journalnode on

Step 9: If you are relocating a JournalNode, fail over to nn2-alt

hdfs haadmin -failover nn1 nn2-alt

Step 10: If you are relocating a JournalNode, restart nn1

Restart the NameNode daemon on nn1 to force it to re-read the configuration:
$ sudo service hadoop-hdfs-namenode stop
$ sudo service hadoop-hdfs-namenode start

Disabling HDFS High Availability

If you need to unconfigure HA and revert to using a single NameNode, either permanently or for upgrade or testing purposes, proceed as follows.
  Important:

If you have been using NFS shared storage in CDH 4, you must unconfigure it before upgrading to CDH 5. Only Quorum-based storage is supported in CDH 5. If you already using Quorum-based storage, you do not need to unconfigure it in order to upgrade.

Step 1: Shut Down the Cluster

  1. Shut down Hadoop services across your entire cluster. Do this from Cloudera Manager; or, if you are not using Cloudera Manager, run the following command on every host in your cluster:
    $ for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x stop ; done
  2. Check each host to make sure that there are no processes running as the hdfs, yarn, mapred or httpfs users from root:
    # ps -aef | grep java

Step 2: Unconfigure HA

  1. Disable the software configuration.
    • If you are using Quorum-based storage and want to unconfigure it, unconfigure the HA properties described under Configuring Software for HDFS HA.

      If you intend to redeploy HDFS HA later, comment out the HA properties rather than deleting them.

    • If you were using NFS shared storage in CDH 4, you must unconfigure the properties described below before upgrading to CDH 5.
  2. Move the NameNode metadata directories on the standby NameNode. The location of these directories is configured by dfs.namenode.name.dir and/or dfs.namenode.edits.dir. Move them to a backup location.

Step 3: Restart the Cluster

for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x start ; done

Properties to unconfigure to disable an HDFS HA configuration using NFS shared storage

  Important:

HDFS HA with NFS shared storage is not supported in CDH 5. Comment out or delete these properties before attempting to upgrade your cluster to CDH 5. (If you intend to configure HA with Quorum-based storage under CDH 5, you should comment them out rather than deleting them, as they are also used in that configuration.)

Unconfigure the following properties.
  • In your core-site.xml file:

    fs.defaultFS (formerly fs.default.name)

    Optionally, you may have configured the default path for Hadoop clients to use the HA-enabled logical URI. For example, if you used mycluster as the NameService ID as shown below, this will be the value of the authority portion of all of your HDFS paths.

    <property>
      <name>fs.default.name/name>
      <value>hdfs://mycluster</value>
    </property>
  • In your hdfs-site.xml configuration file:

    dfs.nameservices

    <property>
      <name>dfs.nameservices</name>
      <value>mycluster</value>
    </property>
      Note:

    If you are also using HDFS Federation, this configuration setting will include the list of other nameservices, HA or otherwise, as a comma-separated list.

    dfs.ha.namenodes.[nameservice ID]

    A list of comma-separated NameNode IDs used by DataNodes to determine all the NameNodes in the cluster. For example, if you used mycluster as the NameService ID, and you used nn1 and nn2 as the individual IDs of the NameNodes, you would have configured this as follows:

    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>

    dfs.namenode.rpc-address.[nameservice ID]

    For both of the previously-configured NameNode IDs, the full address and RPC port of the NameNode process. For example:

    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>machine1.example.com:8020</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>machine2.example.com:8020</value>
    </property>
      Note:

    You may have similarly configured the servicerpc-address setting.

    dfs.namenode.http-address.[nameservice ID]

    The addresses for both NameNodes' HTTP servers to listen on. For example:

    <property>
      <name>dfs.namenode.http-address.mycluster.nn1</name>
      <value>machine1.example.com:50070</value>
    </property>
    <property>
      <name>dfs.namenode.http-address.mycluster.nn2</name>
      <value>machine2.example.com:50070</value>
    </property>
      Note:

    If you have Hadoop's Kerberos security features enabled, and you use HSFTP, you will have set the https-address similarly for each NameNode.

    dfs.namenode.shared.edits.dir

    The path to the remote shared edits directory which the Standby NameNode uses to stay up-to-date with all the file system changes the Active NameNode makes. You should have configured only one of these directories, mounted read/write on both NameNode machines. The value of this setting should be the absolute path to this directory on the NameNode machines. For example:

    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>file:///mnt/filer1/dfs/ha-name-dir-shared</value>
    </property>

    dfs.client.failover.proxy.provider.[nameservice ID]

    The name of the Java class which the DFS Client uses to determine which NameNode is the current Active, and therefore which NameNode is currently serving client requests. The only implementation which shipped with Hadoop is the ConfiguredFailoverProxyProvider. For example:

    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    dfs.ha.fencing.methods - a list of scripts or Java classes which will be used to fence the Active NameNode during a failover.
      Note:

    If you implemented your own custom fencing method, see the org.apache.hadoop.ha.NodeFencer class.

    • The sshfence fencing method

      sshfence - SSH to the Active NameNode and kill the process

      For example:

      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
      </property>
      
      <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/exampleuser/.ssh/id_rsa</value>
      </property>
      Optionally, you may have configured a non-standard username or port to perform the SSH, as shown below, and also a timeout, in milliseconds, for the SSH:
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence([[username][:port]])</value>
      </property>
      <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
        <description>
          SSH connection timeout, in milliseconds, to use with the builtin
          sshfence fencer.
        </description>
      </property>
    • The shell fencing method

      shell - run an arbitrary shell command to fence the Active NameNode

      The shell fencing method runs an arbitrary shell command, which you may have configured as shown below:
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>shell(/path/to/my/script.sh arg1 arg2 ...)</value>
      </property>
Automatic failover: If you configured automatic failover, you configured two additional configuration parameters.
  • In your hdfs-site.xml:
    <property>
      <name>dfs.ha.automatic-failover.enabled</name>
      <value>true</value>
    </property>
  • In your core-site.xml file, add:

    <property>
      <name>ha.zookeeper.quorum</name>
      <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
    </property>

Other properties: There are several other configuration parameters which you may have set to control the behavior of automatic failover, though they were not necessary for most installations. See the configuration section of the Hadoop documentation for details.

Redeploying HDFS High Availability

If you need to redeploy HA using Quorum-based storage after temporarily disabling it, proceed as follows:

  1. Shut down the cluster as described in Step 1 of the previous section.
  2. Uncomment the properties you commented out in Step 2 of the previous section.
  3. Deploy HDFS HA, following the instructions under Deploying HDFS High Availability.
Page generated September 3, 2015.