4. Upgrading the Stack (from 1.2.* to 1.3.3)

  1. Update the stack version in the Server database, depending on whether you are using a local repository:

    ambari-server upgradestack HDP-1.3.3
  2. Upgrade the HDP repository on all hosts and replace the old repository file with the new file:

    Important

    The file you download is named hdp.repo. To function properly, it must be named HDP.repo. After you move the new repository file into the repos.d folder, make sure that no file named hdp.repo remains anywhere in that folder.

    • For RHEL/CentOS/Oracle Linux 5

      wget http://public-repo-1.hortonworks.com/HDP/centos5/1.x/updates/1.3.3.0/hdp.repo
      mv hdp.repo /etc/yum.repos.d/HDP.repo
    • For RHEL/CentOS/Oracle Linux 6

      wget http://public-repo-1.hortonworks.com/HDP/centos6/1.x/updates/1.3.3.0/hdp.repo 
      mv hdp.repo /etc/yum.repos.d/HDP.repo
    • For SLES 11

      wget http://public-repo-1.hortonworks.com/HDP/suse11/1.x/updates/1.3.3.0/hdp.repo
      mv hdp.repo /etc/zypp/repos.d/HDP.repo
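
    The naming requirement above can be checked mechanically. The following is a minimal sketch, not part of the Ambari or HDP tooling: check_hdp_repo is a hypothetical helper name, and its argument is whichever repos.d folder applies to your OS (/etc/yum.repos.d or /etc/zypp/repos.d).

```shell
# check_hdp_repo DIR (hypothetical helper): succeed only if DIR contains a
# file named exactly HDP.repo and no file named exactly hdp.repo.
# File names are matched against the directory listing, so the comparison
# is case-sensitive.
check_hdp_repo() {
    dir=$1
    if ! ls -1 "$dir" | grep -qx 'HDP\.repo'; then
        echo "missing: $dir/HDP.repo" >&2
        return 1
    fi
    if ls -1 "$dir" | grep -qx 'hdp\.repo'; then
        echo "stray file: $dir/hdp.repo" >&2
        return 1
    fi
    echo "ok: $dir"
}
```

    For example, on RHEL/CentOS/Oracle Linux you might run check_hdp_repo /etc/yum.repos.d on each host after the mv.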
  3. Upgrade the stack on all Agent hosts. Skip any components your installation does not use:

    • For RHEL/CentOS/Oracle Linux

      1. Upgrade the following components:

        yum upgrade "collectd*" "epel-release*" "gccxml*" "pig*" "hadoop*" "sqoop*" "zookeeper*" "hbase*" "hive*" "hcatalog*" "webhcat-tar*" hdp_mon_nagios_addons
      2. Check that the components in that list have been upgraded:

        yum list installed | grep HDP-$old_stack_version_number

        Here $old_stack_version_number stands for the version number of the stack you are upgrading from. None of the components from that list should appear in the output.

      3. Upgrade Oozie, if you are using Oozie:

        rpm -e --nopostun oozie-$old_version_number 
        yum install oozie 

        You can get the value of $old_version_number from the output of the previous step.

      4. Upgrade Oozie Client:

        yum upgrade oozie-client
      5. Upgrade ZooKeeper.

        1. Check to see if ZooKeeper needs upgrading.

          yum list installed | grep zookeeper

          If the displayed version number is not 3.4.5.1.3.2.0, you need to upgrade.

        2. Because HBase depends on ZooKeeper, removing the current version of ZooKeeper also removes the current version of HBase, which must then be re-installed. Check whether HBase is currently installed:

          yum list installed | grep hbase
        3. Delete the current version of ZooKeeper.

          yum erase zookeeper
        4. Install ZooKeeper.

          yum install zookeeper
        5. If you need to, re-install HBase.

          yum install hbase
        6. Check to see if all components have been upgraded.

          yum list installed | grep HDP-$old_stack_version_number

          The only non-upgraded component you may see in this list is extjs, which does not need to be upgraded.

    • For SLES

      1. Upgrade the following components:

        zypper up collectd "epel-release*" "gccxml*" "pig*" "hadoop*" "sqoop*" "hive*" "hcatalog*" "webhcat-tar*" "hdp_mon_nagios_addons*"
        yast --update hadoop hcatalog hive
      2. Upgrade ZooKeeper and HBase.

        zypper update zookeeper-3.4.5.1.3.2.0
        zypper remove zookeeper
        zypper se -s zookeeper

        You should see ZooKeeper v3.4.5.1.3.2.0 in the output.

        Install ZooKeeper v3.4.5.1.3.2.0:

        zypper install zookeeper-3.4.5.1.3.2.0

        This command also uninstalls HBase. Now use the following commands to install HBase:

        zypper install hbase-0.94.6.1.3.2.0
        zypper update hbase
      3. Upgrade Oozie:

        rpm -e --nopostun oozie-$old_version_number
        zypper update oozie-3.3.2.1.3.2.0
        zypper remove oozie
        zypper se -s oozie 

        You should see Oozie v3.3.2.1.3.2.0 in the output.

        Install Oozie v3.3.2.1.3.2.0:

        zypper install oozie-3.3.2.1.3.2.0
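
    The RHEL/CentOS/Oracle Linux verification steps above pipe yum list installed through grep and leave it to you to ignore extjs. A minimal sketch of the same check as a filter follows. leftover_packages is a hypothetical helper, and it assumes the old stack shows up in the yum output as HDP- followed by the old stack version number, as in the grep commands above.

```shell
# leftover_packages OLD_VERSION (hypothetical helper): read the output of
# `yum list installed` on stdin and print the first column (package name)
# of every line that still mentions HDP-OLD_VERSION, skipping extjs, which
# does not need to be upgraded.
leftover_packages() {
    awk -v tag="HDP-$1" 'index($0, tag) > 0 && $1 !~ /^extjs/ { print $1 }'
}
```

    Pipe yum list installed into leftover_packages with the old stack version number as its argument. Empty output means that nothing from the old stack remains apart from extjs.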
  4. Start the Ambari Server. On the Server host:

    ambari-server start
  5. Start each Ambari Agent. On all Agent hosts:

    ambari-agent start
  6. Because the file system version has now changed, you must start the NameNode manually. On the NameNode host:

    sudo su -l $HDFS_USER -c "/usr/lib/hadoop/bin/hadoop-daemon.sh start namenode -upgrade"

    Depending on the size of your system, this step may take up to 10 minutes.

  7. Track the status of the upgrade:

    hadoop dfsadmin -upgradeProgress status

    Continue tracking until you see:

    Upgrade for version -44 has been completed.
    Upgrade is not finalized.

    Note

    You finalize the upgrade later. DO NOT run the balancer before finalizing an upgrade. No block deletion occurs until you finalize the upgrade. Running the balancer before finalizing an upgrade may duplicate data blocks, and increase disk usage.
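
    Tracking the status by hand means re-running the command until the completion message appears. A minimal polling sketch follows, assuming the hadoop command is on the PATH of the user running it. The function names and the 30-second default interval are illustrative choices, not part of Hadoop or Ambari.

```shell
# upgrade_completed (hypothetical helper): read the output of
# `hadoop dfsadmin -upgradeProgress status` on stdin and succeed once it
# contains the completion message shown above.
upgrade_completed() {
    grep -q 'Upgrade for version .* has been completed'
}

# wait_for_upgrade [INTERVAL_SECONDS]: poll until the upgrade is reported
# as completed. There is no timeout; interrupt it with Ctrl-C if needed.
wait_for_upgrade() {
    interval=${1:-30}
    until hadoop dfsadmin -upgradeProgress status 2>/dev/null | upgrade_completed; do
        sleep "$interval"
    done
    echo "upgrade completed (not finalized)"
}
```

    The sketch only waits for completion. It does not finalize the upgrade and does not start the balancer.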

  8. Open the Ambari Web GUI. If the Ambari Web GUI has remained open in your browser, do a hard refresh. Use Services View to start the HDFS service. This starts the SecondaryNameNode and the DataNodes.

  9. After the DataNodes are started, HDFS exits safemode. To monitor the status:

    hadoop dfsadmin -safemode get

    Depending on the size of your system, this may take up to about 10 minutes. Once HDFS has exited safemode, the command returns:

    Safe mode is OFF
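
    The safemode check can also be polled with a time limit. The following is a minimal sketch, assuming the hadoop command is on the PATH. wait_for_safemode_off is a hypothetical helper, and its defaults (60 attempts, 10 seconds apart) are chosen only to match the roughly 10 minutes mentioned above.

```shell
# wait_for_safemode_off [ATTEMPTS] [INTERVAL_SECONDS] (hypothetical helper):
# poll `hadoop dfsadmin -safemode get` until it reports "Safe mode is OFF",
# giving up after ATTEMPTS tries.
wait_for_safemode_off() {
    attempts=${1:-60}
    interval=${2:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if hadoop dfsadmin -safemode get 2>/dev/null | grep -q 'Safe mode is OFF'; then
            echo "Safe mode is OFF"
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "timed out waiting for HDFS to leave safemode" >&2
    return 1
}
```

    A non-zero return only means the limit was reached. On a large system, HDFS may simply need more time.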
  10. Make sure that the HDFS upgrade was successful. Repeat steps 2 and 3 in Preparing for the Upgrade to create new versions of the logs and reports, substituting "new" for "old" in the file names as necessary.

  11. Compare the old and new versions of the following:

    • dfs-old-fsck-1.log versus dfs-new-fsck-1.log.

      The files should be identical unless the hadoop fsck reporting format has changed in the new version.

    • dfs-old-lsr-1.log versus dfs-new-lsr-1.log.

      The files should be identical unless the format of hadoop fs -lsr reporting or the data structures have changed in the new version.

    • dfs-old-report-1.log versus dfs-new-report-1.log.

      Make sure all DataNodes previously belonging to the cluster are up and running.
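
    The first two comparisons above can be scripted with diff. The following is a minimal sketch, assuming the old and new log files sit in the same directory under the names listed above. compare_upgrade_logs is a hypothetical helper. The report logs are left out because they are checked for DataNode membership rather than for identical content.

```shell
# compare_upgrade_logs [DIR] (hypothetical helper): compare the old and new
# fsck and lsr logs in DIR (default: current directory). Prints one line per
# pair and returns non-zero if any pair differs or a file is missing.
compare_upgrade_logs() {
    dir=${1:-.}
    status=0
    for kind in fsck lsr; do
        old="$dir/dfs-old-$kind-1.log"
        new="$dir/dfs-new-$kind-1.log"
        if diff -q "$old" "$new" >/dev/null 2>&1; then
            echo "$kind: identical"
        else
            echo "$kind: differs or missing, inspect with diff"
            status=1
        fi
    done
    return "$status"
}
```

    A reported difference is not necessarily an error, since the reporting format may have changed between versions. Inspect the diff output before drawing conclusions.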

  12. In Ambari Web, use Services view -> Services Navigation -> Start All to start the services back up.

  13. The upgraded cluster is now fully functional, but the upgrade is not yet finalized. Running the finalize command removes the previous version of the NameNode and DataNode storage directories.

    Important

    After the upgrade is finalized, the system cannot be rolled back. This step is usually not taken until the upgrade has been thoroughly tested.

    The upgrade must be finalized before another upgrade can be performed.

    Note

    Directories used by Hadoop 1 services set in /etc/hadoop/conf/taskcontroller.cfg are not automatically deleted after upgrade. Administrators can choose to delete these directories after the upgrade.

    To finalize the upgrade:

    sudo su -l $HDFS_USER -c "hadoop dfsadmin -finalizeUpgrade"

    where $HDFS_USER is the HDFS Service user (by default, hdfs).
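
    Because finalizing cannot be undone, it can help to wrap the command in an explicit confirmation. The following is a minimal sketch: finalize_upgrade is a hypothetical wrapper, the prompt text is illustrative, and HDFS_USER falls back to hdfs, the default named above.

```shell
# Default to the hdfs service user unless HDFS_USER is already set.
HDFS_USER=${HDFS_USER:-hdfs}

# finalize_upgrade (hypothetical wrapper): ask for confirmation on stdin and
# run the finalize command only if the answer is exactly "yes".
finalize_upgrade() {
    printf 'Finalizing cannot be rolled back. Type "yes" to continue: '
    read -r answer
    if [ "$answer" != "yes" ]; then
        echo "aborted; upgrade not finalized"
        return 1
    fi
    sudo su -l "$HDFS_USER" -c "hadoop dfsadmin -finalizeUpgrade"
}
```

    Any answer other than yes leaves the upgrade unfinalized, so a rollback remains possible.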