When upgrading HDFS, you’ll need to upgrade both NameNode processes:
Upgrade the standby process
Fail over from active to standby
Upgrade the new standby process (formerly the active process)
In the following steps, "NN1" refers to your active NameNode process. "NN2" refers to
your current standby NameNode process. (Note: the NameNode Web UI is at http://namenode-name:50070/.)
Switch the current standby process, NN2, to the new software version:
Shut down the current standby process NN2 and corresponding ZKFC process, switch NN2 to the new version, and start NN2. Note that the start command includes the "-rollingUpgrade started" option. NN2 will start and become the standby node again.
su - hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh stop namenode"
su - hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh stop zkfc"
hdp-select set hadoop-hdfs-namenode 2.2.6.0-2800
su - hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode -rollingUpgrade started;"
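The same four commands are repeated later for NN1, so they can be wrapped in a small function. The following is only a sketch: the function name `upgrade_local_namenode` and the `DRY_RUN` variable are hypothetical, and it assumes you run it as root on the NameNode host being upgraded.

```shell
# Hypothetical wrapper around the stop / switch / start commands above.
# Set DRY_RUN=1 to print the commands instead of running them.
upgrade_local_namenode() {
    version="$1"    # target build, e.g. 2.2.6.0-2800
    daemon="/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh"
    run=""
    [ -n "${DRY_RUN:-}" ] && run="echo"

    # Stop at the first command that fails.
    $run su - hdfs -c "$daemon stop namenode" || return 1
    $run su - hdfs -c "$daemon stop zkfc" || return 1
    $run hdp-select set hadoop-hdfs-namenode "$version" || return 1
    $run su - hdfs -c "$daemon start namenode -rollingUpgrade started" || return 1
}
```

Running it once with `DRY_RUN=1` lets you review the exact commands before executing them, for example `DRY_RUN=1 upgrade_local_namenode 2.2.6.0-2800`.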
Verify that NN2 is running successfully and in standby mode. Use the hdfs haadmin CLI to check state, or use the Web UI to check state and confirm the new version.
If the service-ID of NN2 is configured as ‘nn2’, then the following command should return "standby":
su - hdfs -c "hdfs haadmin -getServiceState nn2"
(Service-IDs are defined in the dfs.ha.namenodes.<cluster-name> property in the hdfs-site.xml file. For more information, see Configure NameNode HA Cluster in the "NameNode High Availability" section of the Hadoop High Availability Guide.)
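If you script the upgrade, the state check can gate the next step. A minimal sketch, assuming the `hdfs` CLI is available to the current user (otherwise wrap the call in `su - hdfs -c` as above); the function name `expect_ha_state` is hypothetical.

```shell
# Hypothetical helper: succeed only if the NameNode with the given
# service ID reports the expected HA state ("active" or "standby").
expect_ha_state() {
    service_id="$1"
    expected="$2"
    # hdfs haadmin -getServiceState prints the state on stdout.
    actual="$(hdfs haadmin -getServiceState "$service_id" 2>/dev/null)"
    if [ "$actual" = "$expected" ]; then
        echo "$service_id is $actual"
        return 0
    fi
    echo "$service_id is '${actual:-unknown}', expected '$expected'" >&2
    return 1
}
```

For example, `expect_ha_state nn2 standby || exit 1` stops a script if NN2 did not come back as standby.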
Start ZKFC:
su - hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start zkfc"
Wait until NN2 is out of safe mode before proceeding. Review status info on the Web UI (http://namenode-name:50070/).
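As an alternative to watching the Web UI, the wait can be polled from the command line. This sketch is not part of the original procedure: it assumes `hdfs dfsadmin -safemode get` is run as a user with HDFS admin rights and that its output contains `Safe mode is ON` or `Safe mode is OFF` for each NameNode it reports on. The function name `wait_for_safemode_off` and its two parameters are hypothetical.

```shell
# Hypothetical helper: poll until no NameNode reports safe mode ON.
wait_for_safemode_off() {
    max_tries="${1:-60}"   # number of polls before giving up
    delay="${2:-10}"       # seconds to sleep between polls
    i=0
    while [ "$i" -lt "$max_tries" ]; do
        status="$(hdfs dfsadmin -safemode get 2>/dev/null)"
        case "$status" in
            *"Safe mode is ON"*)
                ;;  # at least one NameNode is still in safe mode; keep waiting
            *"Safe mode is OFF"*)
                echo "safe mode is off"
                return 0
                ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out waiting for safe mode to turn off" >&2
    return 1
}
```

The ON pattern is tested first so that a report with mixed states (one NameNode OFF, another still ON) keeps waiting. Confirm the result on the Web UI before failing over.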
Fail over from NN1 (active) to NN2. With NN1 now standby, switch NN1 to the new software version:
Force a failover from NN1 (currently active) to NN2, so that NN2 becomes active and NN1 becomes standby:
su - hdfs -c "hdfs haadmin -failover <from-serviceid> <to-serviceid>"
For example:
su - hdfs -c "hdfs haadmin -failover nn1 nn2"
Failover to NameNode at hdp1.lcl/172.16.226.128:8020 successful
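Before shutting down NN1, it is worth confirming that the roles really swapped. The following sketch combines the failover with two state checks; it assumes it runs as the hdfs user (or another user allowed to run `hdfs haadmin`), and the function name `failover_and_verify` is hypothetical.

```shell
# Hypothetical helper: fail over, then confirm both NameNodes swapped roles.
failover_and_verify() {
    from_id="$1"   # currently active service ID, e.g. nn1
    to_id="$2"     # currently standby service ID, e.g. nn2

    hdfs haadmin -failover "$from_id" "$to_id" || return 1

    [ "$(hdfs haadmin -getServiceState "$to_id")" = "active" ] || {
        echo "$to_id did not become active" >&2
        return 1
    }
    [ "$(hdfs haadmin -getServiceState "$from_id")" = "standby" ] || {
        echo "$from_id did not become standby" >&2
        return 1
    }
    echo "failover $from_id -> $to_id verified"
}
```

For example, `failover_and_verify nn1 nn2` returns a non-zero status if either NameNode ends up in an unexpected state.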
Shut down NN1 and the corresponding ZKFC process, switch NN1 to the new version, and start NN1 as standby (again using the "-rollingUpgrade started" option):
su - hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh stop namenode"
su - hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh stop zkfc"
hdp-select set hadoop-hdfs-namenode 2.2.6.0-2800
su - hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode -rollingUpgrade started;"
Verify that NN1 is running successfully and in standby mode. Use the hdfs haadmin CLI to check state. If, for example, the service-ID of NN1 is nn1, then the following command should return "standby":
su - hdfs -c "hdfs haadmin -getServiceState nn1"
Start ZKFC:
su - hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start zkfc"