Warning Before you start HDFS on an HA system, you must start the ZooKeeper service. If the ZKFC is not started, failures can occur. To start HDFS, run the commands as $HDFS_USER.
If you are upgrading from an HA NameNode configuration, start all JournalNodes. On each JournalNode host, run the following command:
su -l <HDFS_USER> -c "/usr/hdp/current/hadoop-hdfs-journalnode/../hadoop/sbin/hadoop-daemon.sh start journalnode"
Important All JournalNodes must be running when performing the upgrade, rollback, or finalization operations. If any JournalNodes are down when running any such operation, the operation fails.
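Because a single stopped JournalNode causes the operation to fail, it can help to check every JournalNode host before proceeding. The following is a minimal sketch, not part of the HDP tooling: the function names, the JN_PROBE override variable, and the use of passwordless ssh are all assumptions made for illustration.

```shell
# Default probe (assumption: passwordless ssh to each JournalNode host).
# The [J] bracket trick keeps grep from matching its own command line.
probe_journalnode() {
    ssh "$1" "ps -ef | grep -q '[J]ournalNode'"
}

# Return 0 only if the probe succeeds on every host given as an argument.
# JN_PROBE is a hypothetical hook for substituting a different probe.
all_journalnodes_up() {
    local probe="${JN_PROBE:-probe_journalnode}" host down=0
    for host in "$@"; do
        if ! "$probe" "$host"; then
            echo "JournalNode not running on $host" >&2
            down=$((down + 1))
        fi
    done
    [ "$down" -eq 0 ]
}
```

For example, `all_journalnodes_up jn1.example.com jn2.example.com jn3.example.com` (hypothetical host names) returns a non-zero status and lists the affected hosts if any JournalNode is down.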
To successfully upgrade to HDP 2.2, you must change the value of hadoop.rpc.protection to authentication in core-site.xml:
<property>
  <name>hadoop.rpc.protection</name>
  <value>authentication</value>
</property>
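To confirm the setting before starting the NameNode, the value can be read back from the file. The sketch below uses a hypothetical helper, get_site_property, and assumes that each <name> element is immediately followed by its <value> element, as in the property shown above. It is a plain text match, not a full XML parser.

```shell
# Print the <value> of a property from a Hadoop *-site.xml file.
# Assumes <name> is directly followed by <value>; if the property
# appears more than once, the last occurrence is printed.
get_site_property() {
    local file="$1" name="$2"
    tr -d '\n' < "$file" \
      | sed -n "s|.*<name>${name}</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# Return 0 if hadoop.rpc.protection is set to "authentication" in the given file.
rpc_protection_ok() {
    [ "$(get_site_property "$1" hadoop.rpc.protection)" = "authentication" ]
}
```

For example, `rpc_protection_ok /etc/hadoop/conf/core-site.xml` succeeds only when the property has the required value; the path assumes the default configuration directory used elsewhere on this page. Where the Hadoop client is installed, `hdfs getconf -confKey hadoop.rpc.protection` reports the value the client resolves.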
Start the NameNode.
Because the file system version has now changed, you must start the NameNode manually. On the active NameNode host, run the following command:
su -l <HDFS_USER> -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode -upgrade"
On a large system, this can take a long time to complete.
Note Run this command with the -upgrade option only once. After you have completed this step, you can bring up the NameNode using this command without including the -upgrade option.
To check whether the upgrade is in progress, check that the previous directory has been created in the NameNode and JournalNode directories. The previous directory contains a snapshot of the data before the upgrade.
In a NameNode HA configuration, this NameNode will not enter the standby state as usual. Rather, this NameNode will immediately enter the active state, perform an upgrade of its local storage directories, and also perform an upgrade of the shared edit log. At this point, the standby NameNode in the HA pair is still down and out of sync with the upgraded active NameNode. Do NOT start this standby NameNode with the '-upgrade' flag. To synchronize the active and standby NameNodes and re-establish HA, re-bootstrap the standby NameNode by running the NameNode with the '-bootstrapStandby' flag:
su -l <HDFS_USER> -c "hdfs namenode -bootstrapStandby -force"
where <HDFS_USER> is the HDFS service user. For example, hdfs.
The bootstrapStandby command will download the most recent fsimage from the active NameNode into the $dfs.name.dir directory of the standby NameNode. You can enter that directory to make sure the fsimage has been successfully downloaded. After verifying, start the ZKFailoverController, then start the standby NameNode. You can check the status of both NameNodes using the Web UI.
Verify that the NameNode is up and running:
ps -ef|grep -i NameNode
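Two of the manual checks in this step, the presence of the previous directory after the upgrade and the presence of a downloaded fsimage on the standby NameNode, can be scripted. The sketch below uses hypothetical helper names and assumes the usual HDFS storage layout, in which previous and current/fsimage_<txid> sit directly under each storage directory. Confirm the actual directories from your own configuration before relying on it.

```shell
# Return 0 if every storage directory passed as an argument contains
# a "previous" subdirectory (the pre-upgrade snapshot).
has_previous_snapshot() {
    local dir
    [ "$#" -gt 0 ] || return 2
    for dir in "$@"; do
        [ -d "$dir/previous" ] || { echo "missing: $dir/previous" >&2; return 1; }
    done
    return 0
}

# Print the fsimage file with the highest transaction ID under
# <dir>/current, skipping the .md5 checksum files.
# Returns non-zero if no fsimage is found.
latest_fsimage() {
    local dir="$1" f latest=""
    for f in "$dir"/current/fsimage_*; do
        case "$f" in *.md5) continue ;; esac
        [ -f "$f" ] || continue
        latest="$f"   # glob results are sorted, so the last match wins
    done
    [ -n "$latest" ] && echo "$latest"
}
```

For example, run `has_previous_snapshot /hadoop/hdfs/namenode` on the active NameNode host and `latest_fsimage /hadoop/hdfs/namenode` on the standby host. Both paths are placeholders for the directories configured on your cluster.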
Start the Secondary NameNode. On the Secondary NameNode host machine, run the following command:
su <HDFS_USER>
export HADOOP_LIBEXEC_DIR=/usr/hdp/current/hadoop-hdfs-secondarynamenode/../hadoop/libexec
/usr/hdp/current/hadoop-hdfs-secondarynamenode/../hadoop/sbin/hadoop-daemon.sh start secondarynamenode
Verify that the Secondary NameNode is up and running:
ps -ef|grep SecondaryNameNode
Start DataNodes.
On each of the DataNodes, enter the following command. If you are working on a non-secure DataNode, use $HDFS_USER. For a secure DataNode, use root.
su -l <HDFS_USER> -c "/usr/hdp/current/hadoop-hdfs-datanode/../hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode"
Verify that the DataNode process is up and running:
ps -ef|grep DataNode
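The ps and grep checks on this page also match the grep process itself, and a search for NameNode matches SecondaryNameNode as well. The following sketch shows one way to count only the daemon processes. The helper name count_daemon_procs is hypothetical, and the patterns in the usage note rely on the standard Hadoop main class names.

```shell
# Read `ps -ef` output on stdin and print the number of lines that
# match the given pattern, excluding grep's own entry.
count_daemon_procs() {
    grep -- "$1" | grep -vc 'grep'
}
```

For example, `ps -ef | count_daemon_procs 'datanode.DataNode'` prints 1 on a host running a single DataNode, and `ps -ef | count_daemon_procs 'namenode.NameNode'` does not count a Secondary NameNode process.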
Verify that the NameNode can leave safe mode:
hdfs dfsadmin -safemode wait
When the NameNode has left safe mode, the command returns:
Safe mode is OFF
In general, it takes 5-10 minutes to leave safe mode. On clusters with thousands of nodes and millions of data blocks, leaving safe mode can take up to 45 minutes.
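Because the wait command blocks for as long as the NameNode stays in safe mode, a scripted upgrade may prefer to poll with an upper time limit. The sketch below polls `hdfs dfsadmin -safemode get` and gives up after a timeout. The function name, the SAFEMODE_CMD override variable, and the default values (2700 seconds, matching the 45-minute upper bound above, and a 30-second polling interval) are assumptions chosen for illustration.

```shell
# Poll until safe mode is reported OFF or the timeout (seconds) expires.
# SAFEMODE_CMD is a hypothetical hook for substituting another status
# command, for example when testing without a cluster.
wait_for_safemode_off() {
    local timeout="${1:-2700}" interval="${2:-30}" waited=0
    while [ "$waited" -le "$timeout" ]; do
        if ${SAFEMODE_CMD:-hdfs dfsadmin -safemode get} 2>/dev/null \
            | grep -qi 'safe *mode is off'; then
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 1   # still in safe mode after the timeout
}
```

For example, `wait_for_safemode_off 2700 30 || echo "NameNode is still in safe mode"` run as the HDFS service user reports a problem if the NameNode has not left safe mode within 45 minutes.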