Before You Begin
Before upgrading, be sure to read about the latest Incompatible Changes and Known Issues in CDH 5 in the CDH 5 Release Notes. If you are currently running MRv1, you should read CDH 5 and MapReduce before proceeding.
Plan Downtime
If you are upgrading a cluster that is part of a production system, plan ahead. As with any operational work, reserve a maintenance window with enough extra time to handle complications. The Hadoop upgrade process is well understood, but it is best to be cautious. For production clusters, Cloudera recommends allocating a maintenance window of up to a full day to perform the upgrade, depending on the number of hosts, your experience with Hadoop and Linux, and the particular hardware you are using.
Delete Symbolic Links in HDFS
If there are symbolic links in HDFS when you upgrade from CDH 4 to CDH 5, the upgrade will fail and you will have to downgrade to CDH 4, delete the symbolic links, and start over. To prevent this, proceed as follows.
- cd to the directory on the NameNode that contains the latest fsimage. The location of this directory is specified as the value of dfs.namenode.name.dir (or dfs.name.dir) in hdfs-site.xml.
- Use a command such as the following to write out the path names in the fsimage:
$ hdfs oiv -i FSIMAGE -o /tmp/YYYY-MM-DD_FSIMAGE.txt
- Use a command such as the following to find the path names of any symbolic links listed in /tmp/YYYY-MM-DD_FSIMAGE.txt and write them out to the file /tmp/symlinks.txt:
$ grep -- "->" /tmp/YYYY-MM-DD_FSIMAGE.txt > /tmp/symlinks.txt
- Delete any symbolic links listed in /tmp/symlinks.txt.
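The grep command above writes whole matching lines to /tmp/symlinks.txt, so it can help to reduce each line to the link path before reviewing what to delete. The following is a minimal sketch, not part of the documented procedure. It assumes each symlink line in the dump ends with "<link path> -> <target>" and that the link path is the first item on the line that starts with "/". The function name symlink_paths and the output file /tmp/symlink_paths.txt are hypothetical; check the result against your own dump before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: print the link path of each symlink entry in an fsimage text dump.
# Assumption: a symlink line looks like "... /path/to/link -> target" and
# nothing before the link path contains a "/" character.

symlink_paths() {
  # Keep only lines containing "->", then strip everything before the first
  # "/" and everything from " -> " onward.
  grep -- '->' | sed -E 's#^[^/]*(/.*) -> .*$#\1#'
}

# Example usage (input file name taken from the steps above):
#   symlink_paths < /tmp/YYYY-MM-DD_FSIMAGE.txt > /tmp/symlink_paths.txt
```

The function only lists candidate paths and deletes nothing, so the output can be reviewed by hand before you remove any links.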
Considerations for Secure Clusters
If you are upgrading a cluster that has Kerberos security enabled, you must do the following:
- Before starting the upgrade, read the CDH 5 Security Guide.
- Before shutting down Hadoop services, put the NameNode into safe mode and perform a saveNamespace operation; see the instructions on backing up the metadata.
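As a rough sketch of the safe mode and saveNamespace step, the two dfsadmin operations can be run in sequence as shown below. This assumes you run it as the HDFS superuser with a valid Kerberos ticket on a host with the HDFS client configured. The function name save_namespace and the HDFS_BIN override are hypothetical, added only so that the sequence can be rehearsed without touching a cluster. The metadata backup instructions remain the authoritative procedure.

```shell
#!/usr/bin/env bash
# Sketch only: enter safe mode, then save the namespace.
# HDFS_BIN is a hypothetical override; set HDFS_BIN="echo hdfs" for a dry run
# that prints the commands instead of executing them.

save_namespace() {
  local hdfs_bin="${HDFS_BIN:-hdfs}"
  # Stop here if safe mode cannot be entered; saveNamespace requires safe mode.
  $hdfs_bin dfsadmin -safemode enter || return 1
  $hdfs_bin dfsadmin -saveNamespace
}

# Example usage:
#   HDFS_BIN="echo hdfs" save_namespace   # dry run
#   save_namespace                        # real run
```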
High Availability
- For more information and instructions on setting up a new HA configuration, see the CDH 5 High Availability Guide.
Important: If you decide to configure HA for the NameNode, do not install hadoop-hdfs-secondarynamenode. After completing the HDFS HA software configuration, follow the installation instructions under Deploying HDFS High Availability.
- To upgrade an existing configuration, follow the instructions under Upgrading to CDH 5.