3. Complete the Upgrade of the 2.0 Stack to 2.2

  1. Start Ambari Server.

    On the Ambari Server host, run:

    ambari-server start

  2. Start all Ambari Agents.

    On each Ambari Agent host, run:

    ambari-agent start

  3. Update the repository Base URLs in the Ambari Server for the HDP 2.2.0 stack.

    Browse to Ambari Web > Admin > Repositories, then set the value of the HDP and HDP-UTILS repository Base URLs. For more information about viewing and editing repository Base URLs, see Viewing Cluster Stack Version and Repository URLs.


    For a remote, accessible, public repository, the HDP and HDP-UTILS Base URLs are the same as the baseurl= values in the HDP.repo file downloaded in Upgrade the Stack: Step 1. For a local repository, use the local repository Base URL that you configured for the HDP Stack. For links to download the HDP repository files for your version of the Stack, see HDP Stack Repositories.
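    Before saving a Base URL in Ambari, you can sanity-check that it is reachable. A minimal sketch, assuming curl is available and that the repository serves standard yum metadata (the usage URL is hypothetical):

```shell
# Sketch: a yum repository Base URL is usable if repodata/repomd.xml
# is fetchable beneath it.
check_repo_url() {
  curl -sf -o /dev/null "${1%/}/repodata/repomd.xml"
}

# Hypothetical usage -- substitute your actual HDP Base URL:
# check_repo_url "http://your.repo.host/HDP/centos6/2.x/updates/2.2.0.0" \
#   && echo "Base URL OK" || echo "Base URL unreachable"
```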

  4. Update the respective configurations.

    1. Go to the Upgrade Folder you created when Preparing the 2.0 Stack for Upgrade.

    2. Execute the update-configs action:

      python upgradeHelper.py --hostname $HOSTNAME --user $USERNAME --password $PASSWORD --clustername $CLUSTERNAME --fromStack=$FROMSTACK --toStack=$TOSTACK --upgradeCatalog=$UPGRADECATALOG update-configs [configuration item]


      where:

      <HOSTNAME> is the name of the Ambari Server host.
      <USERNAME> is the admin user for Ambari Server.
      <PASSWORD> is the password for the admin user.
      <CLUSTERNAME> is the name of the cluster.
      <FROMSTACK> is the version number of the pre-upgrade stack. For example, 2.0.
      <TOSTACK> is the version number of the upgraded stack. For example, 2.2.x.
      <UPGRADECATALOG> is the path to the upgrade catalog file. For example, UpgradeCatalog_2.0_to_2.2.x.json.

      For example, to update all configuration items:

      python upgradeHelper.py --hostname $HOSTNAME --user $USERNAME --password $PASSWORD --clustername $CLUSTERNAME --fromStack=2.0 --toStack=2.2.x --upgradeCatalog=UpgradeCatalog_2.0_to_2.2.x.json update-configs

      To update configuration item hive-site:

      python upgradeHelper.py --hostname $HOSTNAME --user $USERNAME --password $PASSWORD --clustername $CLUSTERNAME --fromStack=2.0 --toStack=2.2.x --upgradeCatalog=UpgradeCatalog_2.0_to_2.2.x.json update-configs hive-site
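      If you prefer, set the parameters once as shell variables and reuse them. All values below are placeholders (assumptions); substitute values for your cluster:

```shell
# Placeholders only -- substitute values for your environment.
HOSTNAME=ambari.example.com                      # Ambari Server host
USERNAME=admin                                   # Ambari admin user
PASSWORD=admin                                   # Ambari admin password
CLUSTERNAME=MyCluster                            # cluster name shown in Ambari Web
FROMSTACK=2.0                                    # pre-upgrade stack version
TOSTACK=2.2.x                                    # upgraded stack version
UPGRADECATALOG=UpgradeCatalog_2.0_to_2.2.x.json  # upgrade catalog file

# The update-configs invocation from this step then becomes:
# python upgradeHelper.py --hostname $HOSTNAME --user $USERNAME \
#   --password $PASSWORD --clustername $CLUSTERNAME --fromStack=$FROMSTACK \
#   --toStack=$TOSTACK --upgradeCatalog=$UPGRADECATALOG update-configs
```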

  5. Using the Ambari Web UI > Services, start the ZooKeeper service.

  6. On all DataNode and NameNode hosts, copy (overwriting) the old HDFS configurations to the new conf directory:

    cp /etc/hadoop/conf.empty/hdfs-site.xml.rpmsave /etc/hadoop/conf/hdfs-site.xml
    cp /etc/hadoop/conf.empty/hadoop-env.sh.rpmsave /etc/hadoop/conf/hadoop-env.sh
    cp /etc/hadoop/conf.empty/log4j.properties.rpmsave /etc/hadoop/conf/log4j.properties
    cp /etc/hadoop/conf.empty/core-site.xml.rpmsave /etc/hadoop/conf/core-site.xml
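    The copies above can equivalently be scripted as a loop. A sketch, assuming the rpmsave files live under /etc/hadoop/conf.empty as shown:

```shell
# Sketch: restore each saved 2.0 config into the new conf directory.
CONF_SRC=/etc/hadoop/conf.empty
CONF_DST=/etc/hadoop/conf
for f in hdfs-site.xml hadoop-env.sh log4j.properties core-site.xml; do
  # Copy only if the saved file actually exists on this host.
  if [ -f "${CONF_SRC}/${f}.rpmsave" ]; then
    cp "${CONF_SRC}/${f}.rpmsave" "${CONF_DST}/${f}"
  fi
done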

  7. If you are upgrading from an HA NameNode configuration, start all JournalNodes.

    On each JournalNode host, run the following command:

    su -l <HDFS_USER> -c "/usr/hdp/2.2.x.x-<$version>/hadoop/sbin/hadoop-daemon.sh start journalnode"

    where <HDFS_USER> is the HDFS service user. For example, hdfs.


    All JournalNodes must be running when performing the upgrade, rollback, or finalization operations. If any JournalNodes are down when running any such operation, the operation will fail.
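    One way to spot-check this on a JournalNode host is to look for the process in the output of jps (or ps). A sketch, written so the process-listing command is passed in and the helper can be exercised anywhere:

```shell
# Sketch: succeed if a JournalNode process appears in the output of the
# given process-listing command (normally: journalnode_running jps).
journalnode_running() {
  "$@" | grep -qi journalnode
}

# On a real JournalNode host:
# journalnode_running jps && echo "JournalNode up" || echo "JournalNode DOWN"
```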

  8. Because the file system version has now changed, you must start the NameNode manually.

    On the active NameNode host, as the HDFS user:

    su -l <HDFS_USER> -c "export HADOOP_LIBEXEC_DIR=/usr/hdp/2.2.x.x-<$version>/hadoop/libexec && /usr/hdp/2.2.x.x-<$version>/hadoop/sbin/hadoop-daemon.sh start namenode -upgrade"

    To check whether the upgrade is in progress, verify that a "previous" directory has been created in the NameNode and JournalNode data directories. The "previous" directory contains a snapshot of the data before the upgrade.
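    A small helper for checking a storage directory for the snapshot. The example paths in the comments are hypothetical; read the real ones from dfs.namenode.name.dir and dfs.journalnode.edits.dir in hdfs-site.xml:

```shell
# Sketch: report whether an upgrade snapshot exists under a storage directory.
check_previous_dir() {
  if [ -d "$1/previous" ]; then
    echo "upgrade snapshot present: $1/previous"
  else
    echo "no upgrade snapshot at: $1/previous"
  fi
}

# Hypothetical paths -- substitute your configured directories:
# check_previous_dir /hadoop/hdfs/namenode
# check_previous_dir /hadoop/hdfs/journal
```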


    In a NameNode HA configuration, this NameNode will not enter the standby state as usual. Rather, this NameNode will immediately enter the active state, perform an upgrade of its local storage directories, and also perform an upgrade of the shared edit log. At this point, the standby NameNode in the HA pair is still down. It will be out of sync with the upgraded active NameNode.

    To synchronize the active and standby NameNodes and re-establish HA, re-bootstrap the standby NameNode by running it with the '-bootstrapStandby' flag. Do NOT start the standby NameNode with the '-upgrade' flag.

    As the HDFS user:

    su -l <HDFS_USER> -c "hdfs namenode -bootstrapStandby -force"

    The bootstrapStandby command will download the most recent fsimage from the active NameNode into the <dfs.name.dir> directory of the standby NameNode. You can enter that directory to make sure the fsimage has been successfully downloaded. After verifying, start the ZKFailoverController via Ambari, then start the standby NameNode via Ambari. You can check the status of both NameNodes using the Web UI.

  9. Start all DataNodes.

    On each DataNode host, as the HDFS user, run:

    su -l <HDFS_USER> -c "/usr/hdp/2.2.x.x-<$version>/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode"

    where <HDFS_USER> is the HDFS service user. For example, hdfs.

    The NameNode sends an upgrade command to each DataNode after receiving its block report.

  10. Restart HDFS.

    1. Open the Ambari Web GUI. If the browser in which Ambari is running has been open throughout the process, clear the browser cache, then refresh the browser.

    2. Choose Ambari Web > Services > HDFS > Service Actions > Restart All.


      In a cluster configured for NameNode High Availability, use the following procedure to restart NameNodes. Using the following procedure preserves HA when upgrading the cluster.

      1. Using Ambari Web > Services > HDFS, choose Active NameNode.

        This shows the host name of the current, active NameNode.

      2. Write down (or copy, or remember) the host name of the active NameNode.

        You need this host name for step 4.

      3. Using Ambari Web > Services > HDFS > Service Actions > choose Stop.

        This stops all of the HDFS Components, including both NameNodes.

      4. Using Ambari Web > Hosts > choose the host name you noted in Step 2, then start that NameNode component, using Host Actions > Start.

        This causes the original, active NameNode to re-assume its role as the active NameNode.

      5. Using Ambari Web > Services > HDFS > Service Actions, choose Re-Start All.

    3. Choose Service Actions > Run Service Check. Make sure the service checks pass.

  11. After the DataNodes are started, HDFS exits safe mode. Monitor the status by running the following command as the HDFS user:

    sudo su -l <HDFS_USER> -c "hdfs dfsadmin -safemode get"

    When HDFS exits safe mode, the following message displays:

    Safe mode is OFF
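    The check above can be wrapped in a polling loop. A sketch, written so the status command is passed in as arguments (which also lets the helper be exercised without a live cluster):

```shell
# Sketch: poll the given status command until it reports safe mode OFF.
wait_safemode_off() {
  while :; do
    status=$("$@")
    case "$status" in
      *"Safe mode is OFF"*) echo "HDFS has left safe mode"; return 0 ;;
    esac
    sleep 10   # poll again in 10 seconds
  done
}

# On a real cluster, as the HDFS user:
# wait_safemode_off hdfs dfsadmin -safemode get
```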

  12. Make sure that the HDFS upgrade was successful.

    • Compare the old and new versions of the following log files:

      • dfs-old-fsck-1.log versus dfs-new-fsck-1.log.

        The files should be identical unless the hadoop fsck reporting format has changed in the new version.

      • dfs-old-lsr-1.log versus dfs-new-lsr-1.log.

        The files should be identical unless the format of hadoop fs -lsr reporting or the data structures have changed in the new version.

      • dfs-old-report-1.log versus dfs-new-report-1.log.

        Make sure that all DataNodes that were in the cluster before the upgrade are up and running.
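    The three comparisons can be scripted. A sketch, assuming both sets of logs were saved to a single directory when you prepared the upgrade:

```shell
# Sketch: diff each pre-/post-upgrade log pair named in the step above.
compare_logs() {
  dir=$1
  for name in fsck lsr report; do
    if diff -q "$dir/dfs-old-$name-1.log" "$dir/dfs-new-$name-1.log" >/dev/null 2>&1; then
      echo "$name: identical"
    else
      echo "$name: differs (inspect manually)"
    fi
  done
}

# Hypothetical location -- substitute wherever you saved the logs:
# compare_logs /tmp/hdfs-upgrade-logs
```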

  13. Using Ambari Web, navigate to Services > Hive > Configs > Advanced and verify that the following properties are set to their default values:

    Hive (Advanced)

    The Security Wizard enables Hive authorization. The default values for these properties changed in Hive 0.12. If you are upgrading Hive from 0.12 to 0.13 in a secure cluster, you should not need to change the values. If you are upgrading from a Hive version earlier than 0.12 to Hive 0.12 or later in a secure cluster, you will need to correct the values.

  14. Update Hive Configuration Properties for HDP 2.2.x

    Using Ambari Web UI > Services > Hive > Configs > hive-site.xml:

    • hive-site





      <!-- The ZooKeeper token store connect string. -->


      <!-- List of zookeeper servers to talk to -->

    • webhcat-site




      <!-- Properties to set when running hive -->


    Note: Values must not contain white space after each comma.

  15. If YARN is installed in your HDP 2.0 stack but the Application Timeline Server (ATS) components are NOT, you must create and install the ATS service and host components via the API by running the following commands on the server that will host the YARN Application Timeline Server in your cluster. Be sure to replace <your_ATS_component_hostname> with a host name appropriate for your environment.


    Ambari does not currently support ATS in a kerberized cluster. If you are upgrading YARN in a kerberized cluster, skip this step.

    • Create the ATS Service Component.

      curl --user admin:admin -H "X-Requested-By: ambari" -i -X POST http://localhost:8080/api/v1/clusters/<your_cluster_name>/services/YARN/components/APP_TIMELINE_SERVER
    • Create the ATS Host Component.

      curl --user admin:admin -H "X-Requested-By: ambari" -i -X POST http://localhost:8080/api/v1/clusters/<your_cluster_name>/hosts/<your_ATS_component_hostname>/host_components/APP_TIMELINE_SERVER 
    • Install the ATS Host Component.

      curl --user admin:admin -H "X-Requested-By: ambari" -i -X PUT -d '{ "HostRoles": { "state": "INSTALLED"}}' http://localhost:8080/api/v1/clusters/<your_cluster_name>/hosts/<your_ATS_component_hostname>/host_components/APP_TIMELINE_SERVER

    curl commands use the default username/password = admin/admin. To run the curl commands using non-default credentials, modify the --user option to use your Ambari administrator credentials. For example: --user <ambari_admin_username>:<ambari_admin_password>.

  16. Make the following config changes required for Application Timeline Server. Use the Ambari web UI to navigate to the service dashboard and add/modify the following configurations:

    YARN (Custom yarn-site.xml)
    HIVE (hive-site.xml)

    *For hive.tez.container.size: if mapreduce.map.memory.mb > 2 GB, set it equal to mapreduce.map.memory.mb; otherwise, set it equal to mapreduce.reduce.memory.mb. Then set:

    hive.tez.java.opts="-server -Xmx" + Math.round(0.8 * map-container-size) + "m -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseParallelGC"

    Use configuration values appropriate for your environment; the values shown in the preceding example are for illustration purposes only.
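    The -Xmx term above is the only computed piece. A sketch of the arithmetic in shell, where MAP_CONTAINER_MB is a placeholder for your cluster's map container size in MB:

```shell
# Placeholder (assumption) -- substitute your map container size in MB.
MAP_CONTAINER_MB=4096

# Math.round(0.8 * map-container-size), done in integer shell arithmetic.
XMX_MB=$(( (MAP_CONTAINER_MB * 8 + 5) / 10 ))

HIVE_TEZ_JAVA_OPTS="-server -Xmx${XMX_MB}m -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseParallelGC"
echo "$HIVE_TEZ_JAVA_OPTS"
```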

  17. Prepare MapReduce2 and YARN for work. Execute the following HDFS commands on any host.

    • Create mapreduce dir in hdfs.

      su -l <HDFS_USER> -c "hdfs dfs -mkdir -p /hdp/apps/2.2.x.x-<$version>/mapreduce/"

    • Copy new mapreduce.tar.gz to hdfs mapreduce dir.

      su -l <HDFS_USER> -c "hdfs dfs -copyFromLocal /usr/hdp/2.2.x.x-<$version>/hadoop/mapreduce.tar.gz /hdp/apps/2.2.x.x-<$version>/mapreduce/."
    • Grant permissions for created mapreduce dir in hdfs.

      su -l <HDFS_USER> -c "hdfs dfs -chown -R <HDFS_USER>:<HADOOP_GROUP> /hdp";
      su -l <HDFS_USER> -c "hdfs dfs -chmod -R 555 /hdp/apps/2.2.x.x-<$version>/mapreduce";
      su -l <HDFS_USER> -c "hdfs dfs -chmod -R 444 /hdp/apps/2.2.x.x-<$version>/mapreduce/mapreduce.tar.gz"
    • Using Ambari Web UI > Services > YARN > Configs > Advanced > yarn-site, add or modify the following properties:




      <!-- List of zookeeper servers to talk to -->


      <!-- Zookeeper server to talk to -->


      <!-- Timeline service fqdn address -->


      <!-- Timeline service webapp fqdn address -->


      <!-- Timeline service https webapp fqdn address -->

  18. Using Ambari Web > Services > Service Actions, start YARN.

  19. Using Ambari Web > Services > Service Actions, start MapReduce2.

  20. Using Ambari Web > Services > Service Actions, start HBase and ensure the service check passes.

  21. Using Ambari Web > Services > Service Actions, start the Hive service.

  22. Upgrade Oozie.

    1. Perform the following preparation steps on each Oozie server host:


      You must replace your Oozie configuration after upgrading.

      1. Copy configurations from oozie-conf-bak to the /etc/oozie/conf directory on each Oozie server and client.

      2. Create /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22 directory.

        mkdir /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22

      3. Copy the JDBC jar of your Oozie database to both /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22 and /usr/hdp/2.2.x.x-<$version>/oozie/libtools. For example, if you are using MySQL, copy your mysql-connector-java.jar.

      4. Copy these files to the /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22 directory:

        cp /usr/lib/hadoop/lib/hadoop-lzo*.jar /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22
        cp /usr/share/HDP-oozie/ext-2.2.zip /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22
        cp /usr/share/HDP-oozie/ext-2.2.zip /usr/hdp/2.2.x.x-<$version>/oozie/libext

      5. Grant read/write access to the Oozie user.

        chmod -R 777 /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22

    2. Upgrade steps:

      1. On the Services view, make sure that YARN and MapReduce2 services are running.

      2. Make sure that the Oozie service is stopped.

      3. In oozie-env.sh, comment out the CATALINA_BASE property; make the same change using the Ambari Web UI in Services > Oozie > Configs > Advanced oozie-env.

      4. Upgrade Oozie.

        At the Oozie server host, as the Oozie service user:

        sudo su -l <OOZIE_USER> -c "/usr/hdp/2.2.x.x-<$version>/oozie/bin/ooziedb.sh upgrade -run"

        where <OOZIE_USER> is the Oozie service user. For example, oozie.

        Make sure that the output contains the string "Oozie DB has been upgraded to Oozie version <OOZIE_Build_Version>".

      5. Prepare the Oozie WAR file.


        The Oozie server must not be running for this step. If you get the message "ERROR: Stop Oozie first", the script still thinks the server is running. Check, and if needed, remove the process ID (pid) file indicated in the output.

        At the Oozie server host, as the Oozie service user:

        sudo su -l <OOZIE_USER> -c "/usr/hdp/2.2.x.x-<$version>/oozie/bin/oozie-setup.sh prepare-war -d /usr/hdp/2.2.x.x-<$version>/oozie/libext-upgrade22"

        where <OOZIE_USER> is the Oozie service user. For example, oozie.

        Make sure that the output contains the string "New Oozie WAR file added".

      6. Using Ambari Web, choose Services > Oozie > Configs, expand oozie-log4j, then add the following property:

        log4j.appender.oozie.layout.ConversionPattern=%d{ISO8601} %5p %c{1}:%L - SERVER[${oozie.instance.id}] %m%n

        where ${oozie.instance.id} is determined automatically by Oozie.

      7. Using Ambari Web, choose Services > Oozie > Configs, expand Advanced oozie-site, then edit the following properties:

        1. In oozie.service.coord.push.check.requeue.interval, replace the existing property value with the following one:


        2. In oozie.service.SchemaService.wf.ext.schemas, append (using copy/paste) the following string to the existing property value, if it is not already present:



          If you have customized schemas, append this string to your custom schema name string.

          Do not overwrite custom schemas.

          If you have no customized schemas, you can replace the existing string with the following one:


        3. In oozie.service.URIHandlerService.uri.handlers, append the following string to the existing property value, if it is not already present:

        4. In oozie.services, make sure all the following properties are present:

        5. Add the oozie.services.coord.check.maximum.frequency property with the following property value: false

          If you set this property to true, Oozie rejects any coordinators with a frequency faster than 5 minutes. It is not recommended to disable this check or submit coordinators with frequencies faster than 5 minutes: doing so can cause unintended behavior and additional system stress.

        6. Add the oozie.service.AuthorizationService.security.enabled property with the following property value: false

          Specifies whether security (user name/admin role) is enabled or not. If disabled any user can manage Oozie system and manage any job.

        7. Add the oozie.service.HadoopAccessorService.kerberos.enabled property with the following property value: false

          Indicates if Oozie is configured to use Kerberos.

        8. Add the oozie.authentication.simple.anonymous.allowed property with the following property value: true

          Indicates if anonymous requests are allowed. This setting is meaningful only when using 'simple' authentication.

        9. In oozie.services.ext, append the following string to the existing property value, if it is not already present:

        10. After modifying all properties on the Oozie Configs page, choose Save to update oozie-site.xml with the new configurations.
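        Taken together, the four properties added in steps 5 through 8 would appear in oozie-site.xml roughly as follows. This is a sketch using only the values given above; verify names and values against your cluster before saving:

```xml
<!-- Sketch: the four additions from steps 5-8, with the values stated above. -->
<property>
  <name>oozie.services.coord.check.maximum.frequency</name>
  <value>false</value>
</property>
<property>
  <name>oozie.service.AuthorizationService.security.enabled</name>
  <value>false</value>
</property>
<property>
  <name>oozie.service.HadoopAccessorService.kerberos.enabled</name>
  <value>false</value>
</property>
<property>
  <name>oozie.authentication.simple.anonymous.allowed</name>
  <value>true</value>
</property>
```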

      8. Replace the content of /user/oozie/share in HDFS. On the Oozie server host:

        • Extract the Oozie sharelib into a tmp folder.

          mkdir -p /tmp/oozie_tmp
          cp /usr/hdp/2.2.x.x-<$version>/oozie/oozie-sharelib.tar.gz /tmp/oozie_tmp
          cd /tmp/oozie_tmp
          tar xzvf oozie-sharelib.tar.gz

        • Back up the /user/oozie/share folder in HDFS and then delete it. If you have any custom files in this folder, back them up separately and then add them to the /share folder after updating it.

          mkdir /tmp/oozie_tmp/oozie_share_backup
          chmod 777 /tmp/oozie_tmp/oozie_share_backup

          su -l <HDFS_USER> -c "hdfs dfs -copyToLocal /user/oozie/share /tmp/oozie_tmp/oozie_share_backup"
          su -l <HDFS_USER> -c "hdfs dfs -rm -r /user/oozie/share"

          where <HDFS_USER> is the HDFS service user. For example, hdfs.

        • Add the latest share libs that you extracted in step 1. After you have added the files, modify the ownership and ACLs.

          su -l <HDFS_USER> -c "hdfs dfs -copyFromLocal /tmp/oozie_tmp/share /user/oozie/."
          su -l <HDFS_USER> -c "hdfs dfs -chown -R <OOZIE_USER>:<HADOOP_GROUP> /user/oozie"
          su -l <HDFS_USER> -c "hdfs dfs -chmod -R 755 /user/oozie"

          where <HDFS_USER> is the HDFS service user. For example, hdfs.

      9. Add the Falcon Service, using Ambari Web > Services > Actions > +Add Service. Without Falcon, Oozie will fail.

      10. Use the Ambari Web UI > Services view to start the Oozie service. Make sure that ServiceCheck passes for Oozie.

  23. Update WebHCat.

    1. Expand Advanced > webhcat-site.xml.

      Check if templeton.hive.properties is set correctly.

    2. On each WebHCat host, update the Pig and Hive tar bundles, by updating the following files:

      • /apps/webhcat/pig.tar.gz

      • /apps/webhcat/hive.tar.gz


        Find these files only on a host where WebHCat is installed.

      For example, to update a *.tar.gz file:

      • Move the file to a local directory.

        su -l <HCAT_USER> -c "hadoop --config /etc/hadoop/conf fs -copyToLocal /apps/webhcat/*.tar.gz <local_backup_dir>"

      • Remove the old file.

        su -l <HCAT_USER> -c "hadoop --config /etc/hadoop/conf fs -rm /apps/webhcat/*.tar.gz"

      • Copy the new file.

        su -l <HCAT_USER> -c "hdfs --config /etc/hadoop/conf dfs -copyFromLocal /usr/hdp/2.2.x.x-<$version>/hive/hive.tar.gz /apps/webhcat/"
        su -l <HCAT_USER> -c "hdfs --config /etc/hadoop/conf dfs -copyFromLocal /usr/hdp/2.2.x.x-<$version>/pig/pig.tar.gz /apps/webhcat/"

        where <HCAT_USER> is the HCatalog service user. For example, hcat.

    3. On each WebHCat host, update the /apps/webhcat/hadoop-streaming.jar file.

      • Move the file to a local directory.

        su -l <HCAT_USER> -c "hadoop --config /etc/hadoop/conf fs -copyToLocal /apps/webhcat/hadoop-streaming*.jar <local_backup_dir>"

      • Remove the old file.

        su -l <HCAT_USER> -c "hadoop --config /etc/hadoop/conf fs -rm /apps/webhcat/hadoop-streaming*.jar"

      • Copy the new hadoop-streaming.jar file.

        su -l <HCAT_USER> -c "hdfs --config /etc/hadoop/conf dfs -copyFromLocal /usr/hdp/2.2.x.x-<$version>/hadoop-mapreduce/hadoop-streaming*.jar /apps/webhcat"

        where <HCAT_USER> is the HCatalog service user. For example, hcat.

  24. Prepare Tez for work. Add the Tez service to your cluster using the Ambari Web UI, if Tez was not installed earlier.


    The Tez client should also be installed on the Pig host.

    Configure Tez.

    cd /var/lib/ambari-server/resources/scripts/
    ./configs.sh set localhost <your-cluster-name> cluster-env "tez_tar_source" "/usr/hdp/current/tez-client/lib/tez.tar.gz"
    ./configs.sh set localhost <your-cluster-name> cluster-env "tez_tar_destination_folder" "hdfs:///hdp/apps/{{ hdp_stack_version }}/tez/"

  25. Using the Ambari Web UI > Services > Hive, start the Hive service.

  26. If you use Tez as the Hive execution engine, and if the variable hive.server2.enabled.doAs is set to true, you must create a scratch directory on the NameNode host for the username that will run the HiveServer2 service. For example, use the following commands:

    sudo su -c "hdfs -makedir /tmp/hive-<username>"

    sudo su -c "hdfs -chmod 777 /tmp/hive-<username>"

    where <username> is the name of the user that runs the HiveServer2 service.

  27. Using Ambari Web > Services, re-start the remaining services.

  28. The upgrade is now fully functional but not yet finalized. Using the finalize command removes the previous version of the NameNode and DataNode storage directories.


    After the upgrade is finalized, the system cannot be rolled back. Usually this step is not taken until a thorough testing of the upgrade has been performed.

    The upgrade must be finalized before another upgrade can be performed.


    Directories used by Hadoop 1 services set in /etc/hadoop/conf/taskcontroller.cfg are not automatically deleted after upgrade. Administrators can choose to delete these directories after the upgrade.

    To finalize the upgrade, execute the following command once, on the primary NameNode host in your HDP cluster:

    sudo su -l <HDFS_USER> -c "hdfs dfsadmin -finalizeUpgrade"
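    After finalization, the "previous" snapshot directories created during the upgrade are removed from the storage directories. A small check, where the path is a placeholder for your dfs.namenode.name.dir:

```shell
# Sketch: succeed once no "previous" snapshot remains under the given
# storage directory, i.e. the upgrade has been finalized there.
finalized() {
  [ ! -d "$1/previous" ]
}

# Hypothetical path -- substitute your dfs.namenode.name.dir:
# finalized /hadoop/hdfs/namenode && echo "finalize complete"
```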