3.2. 2. Deploy NameNode HA Cluster

In this section, we use NN1 to denote the original NameNode in the non-HA setup, and NN2 to denote the other NameNode that is to be added in the HA setup.

Note

HA clusters reuse the nameservice ID to identify a single HDFS instance (that may consist of multiple HA NameNodes).

A new abstraction called NameNode ID is added with HA. Each NameNode in the cluster has a distinct NameNode ID to distinguish it.

To support a single configuration file for all of the NameNodes, the relevant configuration parameters are suffixed with both the nameservice ID and the NameNode ID.
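
The suffixing scheme can be illustrated with a small sketch. It assumes a nameservice ID of `mycluster` and NameNode IDs `nn1` and `nn2`; these are hypothetical values, and `ha_key` is a helper introduced here only for illustration, not part of Hadoop.

```shell
# Compose an HA-suffixed configuration key of the form
#   <base key>.<nameservice ID>.<NameNode ID>
# ha_key is a hypothetical helper, not a Hadoop command.
ha_key() {
  local base nameservice namenode_id
  base="$1"
  nameservice="$2"
  namenode_id="$3"
  printf '%s.%s.%s\n' "$base" "$nameservice" "$namenode_id"
}
```

For example, `ha_key dfs.namenode.rpc-address mycluster nn1` yields `dfs.namenode.rpc-address.mycluster.nn1`. On a configured host, the value could then be looked up with `hdfs getconf -confKey "$(ha_key dfs.namenode.rpc-address mycluster nn1)"`.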

  1. Start the JournalNode daemons on the set of machines where the JNs are deployed. On each machine, execute the following command:

    su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-journalnode/../hadoop/sbin/hadoop-daemon.sh start journalnode"
  2. Wait for the daemon to start on each of the JN machines.
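
One way to wait for a daemon is to poll it with a probe command until the probe succeeds. The following is a minimal sketch, not part of the HDP tooling: `wait_until` is a hypothetical helper, and the probe itself (for example a port check) is left to the caller.

```shell
# Run a probe command repeatedly until it succeeds or the
# attempt budget is used up. Hypothetical helper for illustration.
#   $1 = maximum number of attempts
#   $2 = delay in seconds between attempts
#   remaining arguments = the probe command and its arguments
wait_until() {
  local attempts delay i
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # The probe's output is discarded; only its exit status matters.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

Assuming the JournalNode listens on its default RPC port 8485 and that `nc` is installed, a call such as `wait_until 30 2 nc -z jn1.example.com 8485` would poll for up to about a minute. The host name here is a placeholder.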

  3. Initialize JournalNodes.

    • At the NN1 host machine, execute the following command:

      su -l hdfs -c "hdfs namenode -initializeSharedEdits -force"

      This command formats all the JournalNodes. By default it runs interactively and prompts for “Y/N” input to confirm the format. You can skip the prompt by using the -force or -nonInteractive option.

      It also copies all the edits data after the most recent checkpoint from the edits directories of the local NameNode (NN1) to JournalNodes.

    • At the JournalNode host (if it is separate from the primary host), execute the following command:

      su -l hdfs -c "hdfs namenode -initializeSharedEdits -force"
    • Initialize HA state in ZooKeeper. Execute the following command on NN1:

      hdfs zkfc -formatZK -force

      This command creates a znode in ZooKeeper. The failover system uses this znode to store its data.

    • Check whether ZooKeeper is running. If it is not, start ZooKeeper by executing the following command on the ZooKeeper host machine(s):

      su - zookeeper -c "export ZOOCFGDIR=/usr/hdp/current/zookeeper-server/conf ; export ZOOCFG=zoo.cfg; source /usr/hdp/current/zookeeper-server/conf/zookeeper-env.sh ; /usr/hdp/current/zookeeper-server/bin/zkServer.sh start"
    • At the standby NameNode host, execute the following command:

      su -l hdfs -c "hdfs namenode -bootstrapStandby -force"
  4. Start NN1. At the NN1 host machine, execute the following command:

    su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode"

    Make sure that NN1 is running correctly.

  5. Format NN2 and copy the latest checkpoint (FSImage) from NN1 to NN2 by executing the following command:

    su -l hdfs -c "hdfs namenode -bootstrapStandby -force"

    This command connects to NN1 to get the namespace metadata and the checkpointed fsimage. It also ensures that NN2 receives sufficient editlogs from the JournalNodes (corresponding to the fsimage). The command fails if the JournalNodes are not correctly initialized and cannot provide the required editlogs.

  6. Start NN2. Execute the following command on the NN2 host machine:

    su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode"

    Ensure that NN2 is running correctly.

  7. Start DataNodes. Execute the following command on all the DataNodes:

    su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-datanode/../hadoop/sbin/hadoop-daemon.sh start datanode"
  8. Validate the HA configuration.

    Go to each NameNode's web page by browsing to its configured HTTP address. Under the configured address label, you should see the HA state of the NameNode. The NameNode can be in either "standby" or "active" state.

    Note

    The HA NameNode is initially in the Standby state after it is bootstrapped. You can also use JMX (tag.HAState) to query the HA state of a NameNode. Alternatively, use the following command:

    hdfs haadmin -getServiceState <serviceId>
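
To check several NameNodes at once, the query can be wrapped in a loop. The sketch below assumes NameNode IDs such as `nn1` and `nn2` (hypothetical values); `report_ha_state` and the `HAADMIN` override variable are illustrative names, not part of Hadoop.

```shell
# Command used to query the state. It can be overridden, for
# example to point at a wrapper script. Defaults to the real CLI.
: "${HAADMIN:=hdfs haadmin}"

# Print "<NameNode ID>=<state>" for each NameNode ID given.
# A NameNode whose state cannot be queried is reported as "unknown".
report_ha_state() {
  local id state
  for id in "$@"; do
    # $HAADMIN is left unquoted on purpose so that "hdfs haadmin"
    # splits into a command and its first argument.
    state=$($HAADMIN -getServiceState "$id" 2>/dev/null) || state="unknown"
    printf '%s=%s\n' "$id" "$state"
  done
}
```

A call such as `report_ha_state nn1 nn2` would print one line per NameNode, for example `nn1=standby`, depending on the actual cluster state.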

  9. Transition one of the HA NameNodes to Active state.

    Initially, both NN1 and NN2 are in Standby state. Therefore, you must transition one of the NameNodes to Active state. This transition can be performed using one of the following options:

    • Option I - Using CLI: Use the command line interface (CLI) to transition one of the NameNodes to Active state. Execute the following command on that NameNode host machine:

      hdfs haadmin -failover --forcefence --forceactive <serviceId> <namenodeId>

      For more information on the haadmin command, see "Appendix: Administrative Commands."

    • Option II - Deploying Automatic Failover: You can configure and deploy automatic failover using the instructions provided in Configure and Deploy NameNode Automatic Failover.
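
After the transition, it can be useful to confirm that exactly one NameNode reports the Active state. The following is a minimal sketch of such a check; `exactly_one_active` is a hypothetical helper, and it only compares state strings that the caller has already collected.

```shell
# Succeed only if exactly one of the given state strings is "active".
# Hypothetical helper; the states are passed in as arguments.
exactly_one_active() {
  local state count
  count=0
  for state in "$@"; do
    if [ "$state" = "active" ]; then
      count=$((count + 1))
    fi
  done
  [ "$count" -eq 1 ]
}
```

Assuming NameNode IDs `nn1` and `nn2` (placeholders), the check could be invoked as `exactly_one_active "$(hdfs haadmin -getServiceState nn1)" "$(hdfs haadmin -getServiceState nn2)"`, which returns a non-zero status if both NameNodes are still in Standby state or if both claim to be Active.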