For HDP 2.2, the configuration files were stored in /etc/hadoop/conf. Starting with HDP 2.3, the configuration files are stored in /etc/hadoop, but in a sub-directory specific to the HDP version being used. To perform the HDFS upgrade, you need to copy the existing configuration files into place on every NameNode and DataNode:
cp /etc/hadoop/conf/* /etc/hadoop/2.3.x.y-z/0/
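If you manage many hosts, you can run the copy from one machine over SSH. The following is only a minimal sketch, assuming passwordless SSH as root and a hypothetical host list; replace the hostnames with your own NameNode and DataNode hosts, and replace 2.3.x.y-z with your actual HDP 2.3 version directory.
# Hypothetical host list; replace with your actual NameNode and DataNode hostnames.
for HOST in nn1.example.com nn2.example.com dn1.example.com dn2.example.com; do
  # Copy the existing configuration into the versioned HDP 2.3 directory on each host.
  ssh root@"$HOST" 'cp /etc/hadoop/conf/* /etc/hadoop/2.3.x.y-z/0/'
done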
After copying configurations to the 2.3 configuration location, save the old HDFS configuration and add a symlink from /etc/hadoop/conf:
mv /etc/hadoop/conf /etc/hadoop/conf.saved
ln -s /usr/hdp/current/hadoop-client/conf /etc/hadoop/conf
ls -la /etc/hadoop
total 4
drwxr-xr-x 3 root root 4096 Jun 19 21:51 2.3.0.0-2323
lrwxrwxrwx 1 root root   35 Jun 19 21:54 conf -> /usr/hdp/current/hadoop-client/conf
drwxr-xr-x 2 root root 4096 Jun 14 00:11 conf.saved
If you are upgrading from an HA NameNode configuration, start all JournalNodes. At each JournalNode host, run the following command:
su -l <HDFS_USER> -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start journalnode"
where <HDFS_USER> is the HDFS Service user. For example, hdfs.
Important: All JournalNodes must be running when performing the upgrade, rollback, or finalization operations. If any JournalNodes are down when running any such operation, the operation will fail.
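Before proceeding, you may want to confirm that a JournalNode process is running on every JournalNode host. This is a minimal sketch, assuming passwordless SSH and a hypothetical list of JournalNode hostnames:
# Hypothetical JournalNode hosts; replace with your actual hostnames.
for HOST in jn1.example.com jn2.example.com jn3.example.com; do
  echo "JournalNode process on $HOST:"
  # The [j] in the pattern keeps grep from matching its own process entry.
  ssh "$HOST" 'ps -ef | grep -i "[j]ournalnode"'
done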
If you are upgrading from an HA NameNode configuration, start the ZK Failover Controllers.
su -l <HDFS_USER> -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start zkfc"
where <HDFS_USER> is the HDFS Service user. For example, hdfs.
Because the file system version has now changed, you must start the NameNode manually. On the active NameNode host, as the HDFS user, run:
su -l <HDFS_USER> -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start namenode -upgrade"
where <HDFS_USER> is the HDFS Service user. For example, hdfs.
Note: In a large system, this can take a long time to complete. Run this command with the -upgrade option only once. After you have completed this step, you can bring up the NameNode using this command without including the -upgrade option.
To check if the upgrade is progressing, check that the ${dfs.namenode.name.dir}/previous directory has been created. The ${dfs.namenode.name.dir}/previous directory contains a snapshot of the data before the upgrade; a sketch of this check appears after the following note.
Note: In a NameNode HA configuration, this NameNode does not enter the standby state as usual. Rather, this NameNode immediately enters the active state, upgrades its local storage directories, and upgrades the shared edit log. At this point, the standby NameNode in the HA pair is still down and not synchronized with the upgraded, active NameNode.
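One way to perform the progress check mentioned above is to resolve the configured storage directory with hdfs getconf and look for the previous sub-directory. This is a minimal sketch, run on the active NameNode as the HDFS user; if dfs.namenode.name.dir lists multiple directories, check each one.
# Resolve the configured NameNode storage directory.
NAME_DIR=$(hdfs getconf -confKey dfs.namenode.name.dir)
# Strip an optional file:// prefix before inspecting the local file system.
NAME_DIR=${NAME_DIR#file://}
# The previous directory holds the pre-upgrade snapshot.
ls -ld "$NAME_DIR/previous"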
To re-establish HA, you must synchronize the active and standby NameNodes. To do so, bootstrap the standby NameNode by running the NameNode with the '-bootstrapStandby' flag. Do NOT start the standby NameNode with the '-upgrade' flag.
At the Standby NameNode,
su -l <HDFS_USER> -c "hdfs namenode -bootstrapStandby -force"
where <HDFS_USER> is the HDFS Service user. For example, hdfs.
The bootstrapStandby command downloads the most recent fsimage from the active NameNode into the <dfs.name.dir> directory on the standby NameNode. Optionally, you can access that directory to make sure the fsimage has been successfully downloaded; a sketch of one such check follows. After verifying, start the ZKFailoverController, then start the standby NameNode using Ambari Web > Hosts > Components.
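For example, to confirm the download, you can list the fsimage files in the standby NameNode's storage directory. This is a minimal sketch and assumes the default layout under <dfs.name.dir>/current:
# On the standby NameNode: recent fsimage files indicate a successful bootstrap.
ls -l <dfs.name.dir>/current | grep fsimage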
Verify that the NameNode is up and running:
ps -ef | grep -i NameNode
Start all DataNodes.
At each DataNode, as the HDFS user,
su -l <HDFS_USER> -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start datanode"
where <HDFS_USER> is the HDFS Service user. For example, hdfs.
The NameNode sends an upgrade command to the DataNodes after receiving their block reports.
Verify that the DataNode process is up and running:
ps -ef | grep DataNode
Restart HDFS. Restarting HDFS will push out the upgraded configurations to all HDFS services.
Open Ambari Web. If the browser in which Ambari is running has been open throughout the process, clear the browser cache, then refresh the browser.
Browse to Services > HDFS, and from the Service Actions menu, select Restart All.
If you are running an HA NameNode configuration, use the following procedure to restart NameNodes.
Browse to Services > HDFS. The Summary section of the page shows which host is the active NameNode. Hover over the Active NameNode and note the hostname of that host. You will need this hostname later.
From the Service Actions menu, select Stop. This stops all of the HDFS Components, including both NameNodes.
Browse to Hosts and select the host that was running the Active NameNode (as noted in the previous step). Using the Actions menu next to the NameNode component, select Start. This causes the original Active NameNode to re-assume its role as the Active NameNode.
Browse to Services > HDFS and, from the Service Actions menu, select Restart All.
After HDFS has started, run the service check. Browse to Services > HDFS and, from the Service Actions menu, select Run Service Check.
After the DataNodes are started, HDFS exits SafeMode. To monitor the status, run the following command on any DataNode:
su -l <HDFS_USER> -c "hdfs dfsadmin -safemode get"
where <HDFS_USER> is the HDFS Service user. For example, hdfs.
When HDFS exits SafeMode, the following message displays:
Safe mode is OFF
Note: In general, it takes 5-10 minutes to exit SafeMode. For thousands of nodes with millions of data blocks, exiting SafeMode can take up to 45 minutes.
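If you prefer to block until SafeMode is off rather than polling, you can use the dfsadmin wait option. For example, as the HDFS user:
su -l <HDFS_USER> -c "hdfs dfsadmin -safemode wait"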
Make sure that the HDFS upgrade was successful. Optionally, repeat step 4 in Checkpoint HDFS to create new versions of the logs and reports, substituting "-new" for "-old" in the file names as necessary. Compare the old and new versions of the following log files:
dfs-old-fsck-1.log versus dfs-new-fsck-1.log. The files should be identical unless the hadoop fsck reporting format has changed in the new version.
dfs-old-lsr-1.log versus dfs-new-lsr-1.log. The files should be identical unless the format of hadoop fs -lsr reporting or the data structures have changed in the new version.
dfs-old-report-1.log versus dfs-new-report-1.log.
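One simple way to compare each pair is with diff, which prints nothing when the files are identical. This sketch assumes you run it from the directory in which the logs and reports were written:
diff dfs-old-fsck-1.log dfs-new-fsck-1.log
diff dfs-old-lsr-1.log dfs-new-lsr-1.log
diff dfs-old-report-1.log dfs-new-report-1.log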
Make sure that all the DataNodes that were in the cluster before the upgrade are up and running.
From the NameNode Web UI, determine whether all DataNodes are up and running:
http://<namenode>:<namenodeport>
If you are on a highly available HDFS cluster, go to the Standby NameNode web UI to see if all DataNodes are up and running:
http://<standbynamenode>:<namenodeport>
If you are not on a highly available HDFS cluster, go to the SecondaryNameNode web UI to see if the Secondary NameNode is up and running:
http://<secondarynamenode>:<secondarynamenodeport>
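As an alternative to the web UIs, you can list live and dead DataNodes from the command line with dfsadmin. For example, as the HDFS user:
su -l <HDFS_USER> -c "hdfs dfsadmin -report"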