3. Complete the Upgrade of the 2.1 Stack to 2.2

  1. Start Ambari Server. On the Ambari Server host,

    ambari-server start

  2. Start all Ambari Agents. At each Ambari Agent host,

    ambari-agent start
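
    To confirm that the processes came up before continuing, you can optionally check their status, on the Ambari Server host and on each Ambari Agent host, respectively:

    ambari-server status

    ambari-agent status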

  3. Update the repository Base URLs in Ambari Server for the HDP-2.2 stack.

    Browse to Ambari Web > Admin > Repositories, then set the values for the HDP and HDP-UTILS repository Base URLs. For more information about viewing and editing repository Base URLs, see Managing Stacks and Versions.

    [Important]Important

    For a remote, accessible, public repository, the HDP and HDP-UTILS Base URLs are the same as the baseurl=values in the HDP.repo file downloaded in Upgrade the 2.1 Stack to 2.2: Step 1. For a local repository, use the local repository Base URL that you configured for the HDP Stack. For links to download the HDP repository files for your version of the Stack, see HDP Stack Repositories.

  4. Update the respective configurations.

    1. Go to the Upgrade Folder you created when Preparing the 2.1 Stack for Upgrade.

    2. Execute the update-configs action:

      python upgradeHelper.py --hostname $HOSTNAME --user $USERNAME --password $PASSWORD --clustername $CLUSTERNAME --fromStack=$FROMSTACK --toStack=$TOSTACK --upgradeCatalog=$UPGRADECATALOG update-configs [configuration item]

      Where

      <HOSTNAME> is the name of the Ambari Server host.
      <USERNAME> is the admin user for Ambari Server.
      <PASSWORD> is the password for the admin user.
      <CLUSTERNAME> is the name of the cluster.
      <FROMSTACK> is the version number of the pre-upgraded stack, for example 2.1.
      <TOSTACK> is the version number of the upgraded stack, for example 2.2.x.
      <UPGRADECATALOG> is the path to the upgrade catalog file, for example UpgradeCatalog_2.1_to_2.2.x.json.

      For example, to update all configuration items:

      python upgradeHelper.py --hostname $HOSTNAME --user $USERNAME --password $PASSWORD --clustername $CLUSTERNAME --fromStack=2.1 --toStack=2.2.x --upgradeCatalog=UpgradeCatalog_2.1_to_2.2.x.json update-configs

      To update configuration item hive-site:

      python upgradeHelper.py --hostname $HOSTNAME --user $USERNAME --password $PASSWORD --clustername $CLUSTERNAME --fromStack=2.1 --toStack=2.2.x --upgradeCatalog=UpgradeCatalog_2.1_to_2.2.x.json update-configs hive-site
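
      If you prefer, you can set the placeholders as shell variables before invoking the script. This is a minimal sketch, using illustrative values that you would replace with your own:

      HOSTNAME=ambari.example.com
      USERNAME=admin
      PASSWORD=admin
      CLUSTERNAME=MyCluster
      FROMSTACK=2.1
      TOSTACK=2.2.x
      UPGRADECATALOG=UpgradeCatalog_2.1_to_2.2.x.json

      python upgradeHelper.py --hostname $HOSTNAME --user $USERNAME --password $PASSWORD --clustername $CLUSTERNAME --fromStack=$FROMSTACK --toStack=$TOSTACK --upgradeCatalog=$UPGRADECATALOG update-configs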

  5. Using the Ambari Web UI, add the Tez service if it has not already been installed. For more information about adding a service, see Adding a Service.

  6. Using the Ambari Web UI, add any new services that you want to run on the HDP 2.2.x stack. You must add a Service before editing configuration properties necessary to complete the upgrade.

  7. Using the Ambari Web UI > Services, start the ZooKeeper service.

  8. Copy (overwrite) the old HDFS configurations to the new configuration directory. On all DataNode and NameNode hosts:

    cp /etc/hadoop/conf.empty/hdfs-site.xml.rpmsave /etc/hadoop/conf/hdfs-site.xml;

    cp /etc/hadoop/conf.empty/hadoop-env.sh.rpmsave /etc/hadoop/conf/hadoop-env.sh;

    cp /etc/hadoop/conf.empty/log4j.properties.rpmsave /etc/hadoop/conf/log4j.properties;

    cp /etc/hadoop/conf.empty/core-site.xml.rpmsave /etc/hadoop/conf/core-site.xml

  9. If you are upgrading from an HA NameNode configuration, start all JournalNodes.

    At each JournalNode host, run the following command:

    su -l <HDFS_USER> -c "/usr/hdp/2.2.x.x-<$version>/hadoop/sbin/hadoop-daemon.sh start journalnode"

    where <HDFS_USER> is the HDFS Service user. For example, hdfs.

    [Important]Important

    All JournalNodes must be running when performing the upgrade, rollback, or finalization operations. If any JournalNodes are down when running any such operation, the operation will fail.

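    One quick, optional way to confirm that a JournalNode is running on a host is to look for its Java process. This assumes the JDK's jps tool is on the PATH for the HDFS user:

    su -l <HDFS_USER> -c "jps | grep JournalNode"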

  10. Because the file system version has now changed, you must start the NameNode manually. On the active NameNode host, as the HDFS user,

    su -l <HDFS_USER> -c "export HADOOP_LIBEXEC_DIR=/usr/hdp/2.2.x.x-<$version>/hadoop/libexec && /usr/hdp/2.2.x.x-<$version>/hadoop/sbin/hadoop-daemon.sh start namenode -upgrade"

    where <HDFS_USER> is the HDFS Service user. For example, hdfs.

    To check whether the upgrade is progressing, verify that a "previous" directory has been created in the NameNode and JournalNode storage directories. The "previous" directory contains a snapshot of the data before the upgrade.
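
    For example, on a NameNode host you can list the storage directory to confirm that the snapshot exists. The path shown is a placeholder; substitute your configured dfs.namenode.name.dir value:

    ls -ld <dfs.namenode.name.dir>/previous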

    [Note]Note

    In a NameNode HA configuration, this NameNode does not enter the standby state as usual. Rather, this NameNode immediately enters the active state, upgrades its local storage directories, and upgrades the shared edit log. At this point, the standby NameNode in the HA pair is still down, and not synchronized with the upgraded, active NameNode.

    To re-establish HA, you must synchronize the active and standby NameNodes. To do so, bootstrap the standby NameNode by running the NameNode with the '-bootstrapStandby' flag. Do NOT start the standby NameNode with the '-upgrade' flag.

    At the Standby NameNode,

    su -l <HDFS_USER> -c "hdfs namenode -bootstrapStandby -force"where <HDFS_USER> is the HDFS Service user. For example, hdfs.

    The bootstrapStandby command downloads the most recent fsimage from the active NameNode into the <dfs.name.dir> directory on the standby NameNode. Optionally, you can access that directory to make sure the fsimage has been successfully downloaded. After verifying, start the ZKFailoverController, then start the standby NameNode using Ambari Web > Hosts > Components.

  11. Start all DataNodes.

    At each DataNode, as the HDFS user,

    su -l <HDFS_USER> -c "/usr/hdp/2.2.x.x-<$version>/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode"

    where <HDFS_USER> is the HDFS Service user. For example, hdfs.

    The NameNode sends an upgrade command to DataNodes after receiving block reports.

  12. Restart HDFS.

    • Open the Ambari Web GUI. If the browser in which Ambari is running has been open throughout the process, clear the browser cache, then refresh the browser.

    • Choose Ambari Web > Services > HDFS > Service Actions > Restart All.

      [Important]Important

      In a cluster configured for NameNode High Availability, use the following procedure to restart the NameNodes. This procedure preserves HA while upgrading the cluster.

      1. Using Ambari Web > Services > HDFS, choose Active NameNode.

        This shows the host name of the current, active NameNode.

      2. Write down (or copy, or remember) the host name of the active NameNode.

        You need this host name for step 4.

      3. Using Ambari Web > Services > HDFS > Service Actions, choose Stop.

        This stops all of the HDFS Components, including both NameNodes.

      4. Using Ambari Web > Hosts choose the host name you noted in Step 2, then start that NameNode component, using Host Actions > Start.

        This causes the original, active NameNode to re-assume its role as the active NameNode.

      5. Using Ambari Web > Services > HDFS > Service Actions, choose Restart All.

    • Choose Service Actions > Run Service Check. Make sure the service check passes.

  13. After the DataNodes are started, HDFS exits SafeMode. To monitor the status, run the following command on each DataNode:

    sudo su -l <HDFS_USER> -c "hdfs dfsadmin -safemode get"

    where <HDFS_USER> is the HDFS Service user. For example, hdfs.

    When HDFS exits SafeMode, the following message displays:

    Safe mode is OFF
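
    If you would rather wait for SafeMode to turn off without re-running the command by hand, a simple polling loop works. This is a sketch; adjust the 10-second interval as needed:

    until sudo su -l <HDFS_USER> -c "hdfs dfsadmin -safemode get" | grep -q "Safe mode is OFF"; do
      sleep 10
    done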

  14. Make sure that the HDFS upgrade was successful. Optionally, repeat step 5 in Prepare the 2.1 Stack for Upgrade to create new versions of the logs and reports, substituting "-new" for "-old" in the file names as necessary.

    • Compare the old and new versions of the following log files (a diff sketch follows this list):

      • dfs-old-fsck-1.log versus dfs-new-fsck-1.log.

        The files should be identical unless the hadoop fsck reporting format has changed in the new version.

      • dfs-old-lsr-1.log versus dfs-new-lsr-1.log.

        The files should be identical unless the format of hadoop fs -lsr reporting or the data structures have changed in the new version.

      • dfs-old-report-1.log versus dfs-new-report-1.log

        Make sure that all DataNodes that were in the cluster before the upgrade are up and running.
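
    One straightforward way to compare the files is with diff, run from the directory that holds the logs created during preparation:

    diff dfs-old-fsck-1.log dfs-new-fsck-1.log
    diff dfs-old-lsr-1.log dfs-new-lsr-1.log
    diff dfs-old-report-1.log dfs-new-report-1.log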

  15. If YARN is installed in your HDP 2.1 stack, and the Application Timeline Server (ATS) component is not, then you must create and install the ATS component using the API.

    Run the following commands on the server that will host the YARN ATS in your cluster. Be sure to replace <your_ATS_component_hostname> with a host name appropriate for your environment.

    1. Create the ATS Service Component.

      curl --user admin:admin -H "X-Requested-By: ambari" -i -X POST http://localhost:8080/api/v1/clusters/<your_cluster_name>/services/YARN/components/APP_TIMELINE_SERVER
    2. Create the ATS Host Component.

      curl --user admin:admin -H "X-Requested-By: ambari" -i -X POST http://localhost:8080/api/v1/clusters/<your_cluster_name>/hosts/<your_ATS_component_hostname>/host_components/APP_TIMELINE_SERVER 
    3. Install the ATS Host Component.

      curl --user admin:admin -H "X-Requested-By: ambari" -i -X PUT -d '{"HostRoles": { "state": "INSTALLED"}}' http://localhost:8080/api/v1/clusters/<your_cluster_name>/hosts/<your_ATS_component_hostname>/host_components/APP_TIMELINE_SERVER 
    [Note]Note

    curl commands use the default username/password = admin/admin. To run the curl commands using non-default credentials, modify the --user option to use your Ambari administrator credentials.

    For example: --user <ambari_admin_username>:<ambari_admin_password>.
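
    To confirm that the component was created and installed, you can optionally query it through the same API, using the same credentials:

    curl --user admin:admin -H "X-Requested-By: ambari" -i -X GET http://localhost:8080/api/v1/clusters/<your_cluster_name>/hosts/<your_ATS_component_hostname>/host_components/APP_TIMELINE_SERVER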

  16. Prepare MapReduce2 and YARN for work. Execute the following HDFS commands on any host.

    • Create the mapreduce directory in HDFS.

      su -l <HDFS_USER> -c "hdfs dfs -mkdir -p /hdp/apps/2.2.x.x-<$version>/mapreduce/"

    • Copy the new mapreduce.tar.gz to the HDFS mapreduce directory.

      su -l <HDFS_USER> -c "hdfs dfs -copyFromLocal /usr/hdp/2.2.x.x-<$version>/hadoop/mapreduce.tar.gz /hdp/apps/2.2.x.x-<$version>/mapreduce/."

    • Grant permissions on the new mapreduce directory in HDFS.

      su -l <HDFS_USER> -c "hdfs dfs -chown -R <HDFS_USER>:<HADOOP_GROUP> /hdp";
      su -l <HDFS_USER> -c "hdfs dfs -chmod -R 555 /hdp/apps/2.2.x.x-<$version>/mapreduce";
      su -l <HDFS_USER> -c "hdfs dfs -chmod -R 444 /hdp/apps/2.2.x.x-<$version>/mapreduce/mapreduce.tar.gz"
    • Update the YARN configuration properties for HDP 2.2.x.

      Using Ambari Web UI > Services > YARN > Configs > Custom > yarn-site, add the following properties (illustrative values are shown in the sketch at the end of this step):

        Name: hadoop.registry.zk.quorum
        Value: <!-- List of hostname:port pairs defining the ZooKeeper quorum binding for the registry -->

        Name: yarn.resourcemanager.zk-address
        Value: localhost:2181

    • Update the Hive configuration properties for HDP 2.2.x.

      • Using Ambari Web UI > Services > Hive > Configs > Advanced webhcat-site:

        Find the templeton.hive.properties property and remove any whitespace after "," in its value.

      • Using Ambari Web UI > Services > Hive > Configs > hive-site.xml, add the following properties (illustrative values are shown in the sketch at the end of this step):

        Name: hive.cluster.delegation.token.store.zookeeper.connectString
        Value: <!-- The ZooKeeper token store connect string -->

        Name: hive.zookeeper.quorum
        Value: <!-- List of ZooKeeper servers to talk to -->
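
    As an illustration only, on a cluster whose ZooKeeper ensemble runs on three hosts, the added YARN and Hive values might look like the following (the host names are placeholders for your environment):

      hadoop.registry.zk.quorum=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
      yarn.resourcemanager.zk-address=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
      hive.cluster.delegation.token.store.zookeeper.connectString=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
      hive.zookeeper.quorum=zk1.example.com,zk2.example.com,zk3.example.com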

  17. Using Ambari Web > Services > Service Actions, start YARN.

  18. Using Ambari Web > Services > Service Actions, start MapReduce2.

  19. Using Ambari Web > Services > Service Actions, start HBase and ensure the service check passes.

  20. Using Ambari Web > Services > Service Actions, start the Hive service.

  21. Upgrade Oozie.

    1. Perform the following preparation steps on each Oozie server host:

      [Note]Note

      You must replace your Oozie configuration after upgrading.

      1. Copy configurations from oozie-conf-bak to the /etc/oozie/conf directory on each Oozie server and client.

      2. Create /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22 directory.

        mkdir /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22

      3. Copy the JDBC jar of your Oozie database to both /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22 and /usr/hdp/2.2.x.x-<$version>/oozie/libtools.

        For example, if you are using MySQL, copy your mysql-connector-java.jar.

      4. Copy these files to /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22 directory.

        cp /usr/lib/hadoop/lib/hadoop-lzo*.jar /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22; cp /usr/share/HDP-oozie/ext-2.2.zip /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22; cp /usr/share/HDP-oozie/ext-2.2.zip /usr/hdp/2.2.x.x-<$version>/oozie/libext

      5. Grant read/write access to the Oozie user.

        chmod -R 777 /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22

    2. Upgrade steps:

      1. On the Services view, make sure that YARN and MapReduce2 services are running.

      2. Make sure that the Oozie service is stopped.

      3. In /etc/oozie/conf/oozie-env.sh, comment out the CATALINA_BASE property. Do the same using the Ambari Web UI in Services > Oozie > Configs > Advanced oozie-env.

      4. Upgrade Oozie. At the Oozie database host, as the Oozie service user:

        sudo su -l <OOZIE_USER> -c"/usr/hdp/2.2.x.x-<$version>/oozie/bin/ooziedb.sh upgrade -run"

        where <OOZIE_USER> is the Oozie service user. For example, oozie.

        Make sure that the output contains the string "Oozie DB has been upgraded to Oozie version <OOZIE_Build_Version>".

      5. Prepare the Oozie WAR file.

        [Note]Note

        The Oozie server must not be running for this step. If you get the message "ERROR: Stop Oozie first", the script still thinks the server is running. Check, and if needed, remove the process id (pid) file indicated in the output. You may see additional "File Not Found" error messages during a successful upgrade of Oozie.

        On the Oozie server, as the Oozie user:

        sudo su -l <OOZIE_USER> -c "/usr/hdp/2.2.x.x-<$version>/oozie/bin/oozie-setup.sh prepare-war -d /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22"

        where <OOZIE_USER> is the Oozie service user. For example, oozie.

        Make sure that the output contains the string "New Oozie WAR file added".

      6. Using Ambari Web, choose Services > Oozie > Configs, expand oozie-log4j, then add the following property:

        log4j.appender.oozie.layout.ConversionPattern=%d{ISO8601} %5p %c{1}:%L - SERVER[${oozie.instance.id}] %m%n

        where ${oozie.instance.id} is determined automatically by Oozie.

      7. Replace the content of /user/oozie/share in HDFS.

        On the Oozie server host:

        1. Extract the Oozie sharelib into a tmp folder.

          mkdir -p /tmp/oozie_tmp; cp /usr/hdp/2.2.x.x-<$version>/oozie/oozie-sharelib.tar.gz /tmp/oozie_tmp; cd /tmp/oozie_tmp; tar xzvf oozie-sharelib.tar.gz;

        2. Back up the /user/oozie/share folder in HDFS and then delete it.

          If you have any custom files in this folder, back them up separately and then add them to the /share folder after updating it.

          mkdir /tmp/oozie_tmp/oozie_share_backup; chmod 777 /tmp/oozie_tmp/oozie_share_backup;

          su -l <HDFS_USER> -c "hdfs dfs -copyToLocal /user/oozie/share /tmp/oozie_tmp/oozie_share_backup"; su -l <HDFS_USER> -c "hdfs dfs -rm -r /user/oozie/share";

          where <HDFS_USER> is the HDFS service user. For example, hdfs.

        3. Add the latest share libs that you extracted in step 1. After you have added the files, modify ownership and acl.

          su -l <HDFS_USER> -c "hdfs dfs -copyFromLocal /tmp/oozie_tmp/share /user/oozie/."; su -l <HDFS_USER> -c "hdfs dfs -chown -R <OOZIE_USER>:<HADOOP_GROUP> /user/oozie"; su -l <HDFS_USER> -c "hdfs dfs -chmod -R 755 /user/oozie";

          where <HDFS_USER> is the HDFS service user. For example, hdfs.
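
          To confirm that the new share libraries are in place, you can optionally list the directory in HDFS:

          su -l <HDFS_USER> -c "hdfs dfs -ls /user/oozie/share/lib"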

  22. Use the Ambari Web UI > Services view to start the Oozie service.

    Make sure that ServiceCheck passes for Oozie.

  23. Update WebHCat.

    1. Modify the webhcat-site config type.

      Using Ambari Web > Services > WebHCat, modify the following configuration:

      Action: Modify
      Property Name: templeton.storage.class
      Property Value: org.apache.hive.hcatalog.templeton.tool.ZooKeeperStorage

    2. Expand Advanced > webhcat-site.xml.

      Check if property templeton.port exists. If not, then add it using the Custom webhcat-site panel. The default value for templeton.port = 50111.

    3. On each WebHCat host, update the Pig and Hive tar bundles, by updating the following files:

      • /apps/webhcat/pig.tar.gz

      • /apps/webhcat/hive.tar.gz

        [Note]Note

        Find these files only on a host where WebHCat is installed.

      For example, to update a *.tar.gz file:

      • Move the file to a local directory.

        su -l <HCAT_USER> -c "hadoop --config /etc/hadoop/conf fs -copyToLocal /apps/webhcat/*.tar.gz <local_backup_dir>"

      • Remove the old file.

        su -l <HCAT_USER> -c "hadoop --config /etc/hadoop/conf fs -rm /apps/webhcat/*.tar.gz"

      • Copy the new file.

        su -l <HCAT_USER> -c "hdfs --config /etc/hadoop/conf dfs -copyFromLocal /usr/hdp/2.2.x.x-<$version>/hive/hive.tar.gz /apps/webhcat/"; su -l <HCAT_USER> -c "hdfs --config /etc/hadoop/conf dfs -copyFromLocal /usr/hdp/2.2.x.x-<$version>/pig/pig.tar.gz /apps/webhcat/"; 

        where <HCAT_USER> is the HCatalog service user. For example, hcat.

    4. On each WebHCat host, update the /apps/webhcat/hadoop-streaming.jar file.

      • Move the file to a local directory.

        su -l <HCAT_USER> -c "hadoop --config /etc/hadoop/conf fs -copyToLocal /apps/webhcat/hadoop-streaming*.jar <local_backup_dir>"

      • Remove the old file.

        su -l <HCAT_USER> -c "hadoop --config /etc/hadoop/conf fs -rm /apps/webhcat/hadoop-streaming*.jar"
      • Copy the new hadoop-streaming.jar file.

        su -l <HCAT_USER> -c "hdfs --config /etc/hadoop/conf dfs -copyFromLocal /usr/hdp/2.2.x.x-<$version>/hadoop-mapreduce/hadoop-streaming*.jar /apps/webhcat"

        where <HCAT_USER> is the HCatalog service user. For example, hcat.
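
        To confirm that the updated bundles and the streaming jar are in place, you can optionally list the WebHCat directory in HDFS:

        su -l <HCAT_USER> -c "hdfs dfs -ls /apps/webhcat"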

  24. If Tez was not installed during the upgrade, you must prepare Tez for work, using the following steps:

    [Important]Important

    The Tez client should be available on the same host as Pig.

    If you use Tez as the Hive execution engine, and if the variable hive.server2.enable.doAs is set to true, you must create a scratch directory on the NameNode host for the username that will run the HiveServer2 service. If you installed Tez before upgrading the Stack, use the following commands:

    sudo su -c "hdfs -makedir /tmp/hive- <username> " sudo su -c "hdfs -chmod 777 /tmp/hive- <username> "

    where <username> is the name of the user that runs the HiveServer2 service.

    • Put the Tez libraries in HDFS. Execute the following commands on any host:

      su -l <HDFS_USER> -c "hdfs dfs -mkdir -p /hdp/apps/2.2.x.x-<$version>/tez/"

      su -l <HDFS_USER> -c "hdfs dfs -copyFromLocal -f /usr/hdp/2.2.x.x-<$version>/tez/lib/tez.tar.gz /hdp/apps/2.2.x.x-<$version>/tez/."

      su -l <HDFS_USER> -c "hdfs dfs -chown -R <HDFS_USER>:<HADOOP_GROUP> /hdp"

      su -l <HDFS_USER> -c "hdfs dfs -chmod -R 555 /hdp/apps/2.2.x.x-<$version>/tez"

      su -l <HDFS_USER> -c "hdfs dfs -chmod -R 444 /hdp/apps/2.2.x.x-<$version>/tez/tez.tar.gz"
  25. Prepare the Storm service properties.

    • Edit nimbus.childopts.

      Using Ambari Web UI > Services > Storm > Configs > Nimbus, find nimbus.childopts. Update the path for jmxetric-1.0.4.jar to /usr/hdp/current/storm-nimbus/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar. If the nimbus.childopts property value contains "-Djava.security.auth.login.config=/path/to/storm_jaas.conf", remove this text.

    • Edit supervisor.childopts.

      Using Ambari Web UI > Services > Storm > Configs > Supervisor, find supervisor.childopts. Update the path for jmxetric-1.0.4.jar to /usr/hdp/current/storm-nimbus/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar. If the supervisor.childopts property value contains "-Djava.security.auth.login.config=/etc/storm/conf/storm_jaas.conf", remove this text.

    • Edit worker.childopts.

      Using Ambari Web UI > Services > Storm > Configs > Advanced > storm-site, find worker.childopts. Update the path for jmxetric-1.0.4.jar to /usr/hdp/current/storm-nimbus/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar.

      Check whether the _storm.thrift.nonsecure.transport property exists. If not, add it using the Custom storm-site panel: _storm.thrift.nonsecure.transport = backtype.storm.security.auth.SimpleTransportPlugin.

    • Remove the storm.local.dir from every host where the Storm component is installed.

      You can find this property in the Storm > Configs > General tab.

      rm -rf <storm.local.dir>

    • If you are planning to enable secure mode, navigate to Ambari Web UI > Services > Storm > Configs > Advanced storm-site and add the following property:

      _storm.thrift.secure.transport=backtype.storm.security.auth.kerberos.KerberosSaslTransportPlugin
    • Stop the Storm Rest_API Component.

      curl -u admin:admin -X PUT -H 'X-Requested-By:1' -d '{"RequestInfo":{"context":"Stop Component"},"Body":{"HostRoles":{"state":"INSTALLED"}}}' http://server:8080/api/v1/clusters/c1/hosts/host_name/host_components/STORM_REST_API
      [Note]Note

      In HDP 2.2, the STORM_REST_API component was deleted because the service was moved into STORM_UI_SERVER. When upgrading from HDP 2.1 to HDP 2.2, you must delete this component using the API, as follows:

    • Delete the Storm Rest_API Component.

      curl -u admin:admin -X DELETE -H 'X-Requested-By:1' http://server:8080/api/v1/clusters/c1/services/STORM/components/STORM_REST_API
  26. Upgrade Pig.

    Copy the Pig configuration files to /etc/pig/conf.

    cp /etc/pig/conf.dist/pig-env.sh /etc/pig/conf/;

  27. Using Ambari Web UI > Services > Storm, start the Storm service.

  28. Using Ambari Web > Services > Service Actions, re-start all stopped services.

  29. The upgrade is now fully functional but not yet finalized. Using the finalize command removes the previous version of the NameNode and DataNode storage directories.

    [Note]Note

    After the upgrade is finalized, the system cannot be rolled back. Usually this step is not taken until a thorough testing of the upgrade has been performed.

    The upgrade must be finalized before another upgrade can be performed.

    Directories used by Hadoop 1 services set in /etc/hadoop/conf/taskcontroller.cfg are not automatically deleted after upgrade. Administrators can choose to delete these directories after the upgrade.

    To finalize the upgrade, execute the following command once, on the primary NameNode host in your HDP cluster:

    sudo su -l <HDFS_USER> -c "hdfs dfsadmin -finalizeUpgrade"

    where <HDFS_USER> is the HDFS service user. For example, hdfs.