This is the documentation for Cloudera Manager 5.0.x. Documentation for other versions is available at Cloudera Documentation.

Upgrading from CDH 4 to CDH 5 Parcels

This topic covers upgrading a CDH 4 cluster to a CDH 5 cluster using the upgrade wizard, which will install CDH 5 parcels. Your CDH 4 cluster can be using either parcels or packages; you can use the cluster upgrade wizard to upgrade using parcels in either case.

If you want to upgrade using CDH 5 packages, you can do so using a manual process. See Upgrading from CDH 4 Packages to CDH 5 Packages.

The steps to upgrade a CDH installation managed by Cloudera Manager using parcels are as follows.

  1. Before You Begin
  2. Stop All Services
  3. Perform Service-Specific Prerequisite Actions
  4. Run the Upgrade Wizard
  5. Import MapReduce Configuration to YARN
  6. Remove CDH Packages and Update Symlinks
  7. Restart the Reports Manager Role
  8. Finalize the HDFS Metadata Upgrade

Before You Begin

  • Read the Cloudera Manager Release Notes.
  • Make sure there are no Oozie workflows in RUNNING or SUSPENDED status; otherwise the Oozie database upgrade will fail and you will have to reinstall CDH 4 to complete or kill those running workflows.
  • Plan downtime. If you are upgrading a cluster that is part of a production system, plan ahead. As with any operational work, reserve a maintenance window with enough extra time allotted in case of complications. The Hadoop upgrade process is well understood, but it is best to be cautious. For production clusters, Cloudera recommends allocating a maintenance window of up to a full day to perform the upgrade, depending on the number of hosts, the amount of experience you have with Hadoop and Linux, and the particular hardware you are using.
  • To avoid generating many alerts during the upgrade process, you can enable maintenance mode on your cluster before you start the upgrade. Be sure to exit maintenance mode when you have finished the upgrade, in order to re-enable Cloudera Manager alerts.
  • Put the NameNode into safe mode. To upgrade CDH in multiple clusters, repeat this process for each cluster:
    1. In the Cloudera Manager Admin Console, go to the NameNode role instance of the HDFS service.
    2. Select Actions > Enter Safemode... and confirm that you want to do this.
    3. After the NameNode has successfully entered safemode, select Actions > Save Namespace... and confirm that you want to do this. This will result in a new fsimage being written out with no edit log entries. Leave the NameNode in safe mode while you proceed with the upgrade instructions.
  • Ensure Java 7 is installed across the cluster. CDH 5 requires Java 7, and some services may not start if it is not installed. For installation instructions and recommendations for CDH 5, see (CDH 5) Java Development Kit Installation.
  • Back up important databases:
    • Cloudera Manager databases. For instructions, see Backing up Databases. You will need to indicate to the upgrade wizard that you have performed this step before the upgrade will proceed.
    • Hive Metastore database (which could be in the embedded database)
    • Hue database
    • Oozie database
    • Sqoop database
  • If you have just upgraded to Cloudera Manager 5, you must hard restart the Cloudera Manager Agents as described in the Hard Restart Cloudera Manager Agents task in Upgrading Cloudera Manager 4 to Cloudera Manager 5 in Cloudera Manager Administration Guide.
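
The safe mode and Java items above can also be checked from a shell. The following is a minimal sketch, not part of the documented procedure. It assumes that the hdfs client is on the PATH of the NameNode host and that you can sudo to the hdfs user. The helper names (run, enter_safemode_and_save, java_major) and the DRY_RUN variable are hypothetical. By default the sketch only prints the commands it would run.

```shell
# Sketch only: helper names and DRY_RUN are illustrative assumptions.

# Print a command; execute it only when DRY_RUN=0.
run() {
  printf '%s\n' "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Command-line counterpart of Enter Safemode followed by Save Namespace.
enter_safemode_and_save() {
  run sudo -u hdfs hdfs dfsadmin -safemode enter
  run sudo -u hdfs hdfs dfsadmin -saveNamespace
}

# Extract the Java major version from a `java -version` banner line,
# for example: java version "1.7.0_55"  ->  7
java_major() {
  printf '%s\n' "$1" | sed -n 's/.*version "1\.\([0-9]*\)\..*/\1/p'
}
```

To check a host, pass the first line of the banner to the parser, for example java_major "$(java -version 2>&1 | head -n 1)", and confirm that the result is 7 before you continue. Set DRY_RUN=0 only when you intend to change the NameNode state.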

Stop All Services

  1. Stop each cluster.
    1. On the Home page, click the drop-down menu to the right of the cluster name and select Stop.
    2. Click Stop in the confirmation screen. The Command Details window shows the progress of stopping services.

      When All services successfully stopped appears, the task is complete and you can close the Command Details window.

  2. Stop the Cloudera Management Service:
    1. Do one of the following:
        • Select Clusters > Cloudera Management Service > mgmt, then select Actions > Stop.
        • On the Home page, click to the right of mgmt and select Stop.
    2. Click Stop to confirm. The Command Details window shows the progress of stopping the roles.
    3. When Command completed with n/n successful subcommands appears, the task is complete. Click Close.
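
If you script your maintenance windows, the same two stop actions can be issued through the Cloudera Manager REST API. The sketch below is an illustration under stated assumptions: the host name, the credentials, and the API version (v6 is assumed for Cloudera Manager 5.0) are placeholders, and the function names are hypothetical. Verify the endpoint paths against the API documentation for your release before relying on them.

```shell
# Sketch only: CM_URL, CM_AUTH, and API are placeholders and assumptions.
CM_URL="${CM_URL:-http://cm-host.example.com:7180}"
CM_AUTH="${CM_AUTH:-admin:admin}"
API="api/v6"   # assumed API version for Cloudera Manager 5.0

# Percent-encode spaces so that a name such as "Cluster 1 - CDH4"
# can be used in a URL path.
encode_spaces() {
  printf '%s' "$1" | sed 's/ /%20/g'
}

# URL of the stop command for one cluster.
cluster_stop_url() {
  printf '%s/%s/clusters/%s/commands/stop\n' "$CM_URL" "$API" "$(encode_spaces "$1")"
}

# URL of the stop command for the Cloudera Management Service.
mgmt_stop_url() {
  printf '%s/%s/cm/service/commands/stop\n' "$CM_URL" "$API"
}

# POST to a command URL; the response is a JSON description of the command.
cm_post() {
  curl -s -X POST -u "$CM_AUTH" "$1"
}
```

A typical sequence would be cm_post "$(cluster_stop_url 'Cluster 1 - CDH4')" for each cluster, followed by cm_post "$(mgmt_stop_url)". Nothing is sent until cm_post is called, so the URL builders can be inspected safely first.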

Perform Service-Specific Prerequisite Actions

  • Accumulo - If you have installed the Accumulo parcel, deactivate it following the instructions in Managing Parcels.
  • HDFS - Back up HDFS metadata on the NameNode:
    1. Stop the NameNode you want to back up.
    2. Go to the HDFS service.
    3. Select Configuration > View and Edit.
    4. In the Search field, search for "NameNode Data Directories". This locates the NameNode Data Directories property.
    5. From the command line on the NameNode host, back up the directory listed in the NameNode Data Directories property. If more than one is listed, then you only need to make a backup of one directory, since each directory is a complete copy. For example, if the data directory is /mnt/hadoop/hdfs/name, do the following as root:
      # cd /mnt/hadoop/hdfs/name
      # tar -cvf /root/nn_backup_data.tar .

      You should see output like this:

      ./
      ./current/
      ./current/fsimage
      ./current/fstime
      ./current/VERSION
      ./current/edits
      ./image/
      ./image/fsimage
        Warning: If you see a file containing the word lock, the NameNode is probably still running. Repeat the preceding steps, starting by shutting down the CDH services.
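
The backup step and the lock-file warning can be combined into one guarded command. This is a sketch under assumptions, not the documented procedure: the function name backup_nn_dir is hypothetical, and the check simply looks for any file whose name contains the word lock before the archive is created.

```shell
# Sketch only: backup_nn_dir is an illustrative helper name.

# Archive one NameNode data directory, refusing to run while a lock
# file is present (which suggests the NameNode is still running).
# $1 = NameNode data directory, $2 = destination tar file (outside $1).
backup_nn_dir() {
  src="$1"
  dest="$2"
  if find "$src" -name '*lock*' | grep -q .; then
    echo "lock file found under $src; is the NameNode still running?" >&2
    return 1
  fi
  tar -cf "$dest" -C "$src" .
}
```

With the example directory used above, the call would be backup_nn_dir /mnt/hadoop/hdfs/name /root/nn_backup_data.tar, run as root. Listing the archive afterwards with tar -tf /root/nn_backup_data.tar lets you confirm that the current and image directories were captured.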

Run the Upgrade Wizard

The first step of the upgrade process is to download and distribute the parcel for the version of CDH that you want to install. CDH 5 parcels include Impala and Search, so it is not necessary to add Impala or Search parcels separately:
  1. Log in to the Cloudera Manager Admin Console.
  2. From the Home tab Status page, click the drop-down menu next to the cluster name and select Upgrade Cluster. The Upgrade Wizard starts.
  3. Click the checkbox to acknowledge that you have backed up all your databases and click Continue.
  4. The next step shows you the hosts that the Upgrade Wizard has detected as needing to be upgraded.
  5. Select Use Parcels as your install method, and select the parcel you want to install. Do not select Use Packages. This option only works if you have previously installed the CDH 5 packages. If those packages are not present, the upgrade wizard will not continue. To upgrade to CDH 5 using packages, see Upgrading from CDH 4 Packages to CDH 5 Packages.
  6. Click Continue to initiate the parcel download and distribution step.
  7. When your parcels have been downloaded and distributed successfully, click Continue.
  8. The next page notifies you that the services on your cluster will be shut down. Rolling upgrade is not available. You can select whether to have all your services restarted and client configurations deployed automatically after the upgrade has finished. Click Continue to proceed.
  9. The upgrade wizard proceeds to execute the various steps involved in upgrading your cluster, which include:
    • Waiting for the Cloudera Manager Agent to recognize the new CDH version
    • Converting your configuration parameters
    • Upgrading HDFS metadata, Sqoop server, Hive metastore, and various databases
    • Deploying client configuration and restarting services, if you elected those options
      Note: If you encounter errors during these steps:
        • If the configuration conversion step fails, Cloudera Manager rolls back all configurations to CDH 4. Fix any reported problems and retry the upgrade.
        • If the upgrade command fails at any point after the configuration conversion step, Cloudera Manager provides no retry support. You must first correct the error, then manually re-run the individual commands. You can view the remaining commands on the Recent Commands page.
        • If the HDFS metadata upgrade step fails, you cannot revert to CDH 4 unless you restore a backup of Cloudera Manager.
  10. When the upgrade has finished, the Host Inspector runs. This should now show that the hosts are running CDH 5. Click Continue to proceed.

    If your cluster name includes the string "CDH 4" the upgrade procedure changes the string to "CDH 5". Otherwise, it leaves the cluster name unchanged. If you want to rename the cluster, you can do so by clicking the cluster name, which displays a pop-up where you can change the name.
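
As a command-line spot check alongside the Host Inspector result, you can parse the first line printed by hadoop version on a host. The sketch below assumes that line has the form Hadoop 2.3.0-cdh5.0.0 (an upstream version followed by a -cdh suffix). The function name cdh_major is hypothetical.

```shell
# Sketch only: cdh_major is an illustrative helper name.

# Extract the CDH major version from the first line of `hadoop version`,
# for example: Hadoop 2.3.0-cdh5.0.0  ->  5
cdh_major() {
  printf '%s\n' "$1" | sed -n 's/^Hadoop .*-cdh\([0-9]*\)\..*/\1/p'
}
```

On an upgraded host, cdh_major "$(hadoop version | head -n 1)" is expected to print 5. An empty result means the line did not match the assumed format, so inspect the raw output instead.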

Import MapReduce Configuration to YARN

In CDH 5 and Cloudera Manager 5, YARN rather than MapReduce is the default MapReduce computation framework. If you had the MapReduce service configured in CDH 4, you can import the MapReduce configuration to YARN. This does not affect your MapReduce configuration.

  Warning: In addition to importing configuration settings, the import process:
  • Configures services to use YARN as the MapReduce computation framework instead of MapReduce.
  • Overwrites existing YARN configuration and role assignments.
  1. To import the existing configuration from your MapReduce service, select OK, set up YARN to add the YARN service and import the MapReduce settings. To skip the import, select Skip this step now. If you choose to skip this step, you can perform it at a later time from the YARN service.
  2. Click Continue to proceed. Cloudera Manager stops the YARN service (if running) and its dependencies. When these commands complete, click Continue.
  3. The next page indicates some additional configuration required by YARN. Verify or modify these and click Continue.
  4. The Switch Cluster to MR2 step proceeds. When all steps have been completed, click Continue.
  5. When all steps have completed, click Continue.

Remove CDH Packages and Update Symlinks

If your previous installation of CDH was done using packages, you must remove those packages on all hosts on which you installed the parcels and refresh the symlinks so that clients will run the new software versions. This is always the case if you were running a version of CDH earlier than CDH 4.1.3, because parcels were not available for those releases.
  1. Uninstall the CDH packages.
    Operating System    Command
    RHEL                $ sudo yum remove bigtop-jsvc bigtop-utils bigtop-tomcat hue-common sqoop2-client hbase-solr-doc solr-doc
    SLES                $ sudo zypper remove bigtop-jsvc bigtop-utils bigtop-tomcat hue-common sqoop2-client hbase-solr-doc solr-doc
    Ubuntu or Debian    $ sudo apt-get purge bigtop-jsvc bigtop-utils bigtop-tomcat hue-common sqoop2-client hbase-solr-doc solr-doc
  2. Restart all the Cloudera Manager Agents to force an update of the symlinks so that they point to the newly installed components. On each host:
    $ sudo service cloudera-scm-agent restart
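
When the cluster mixes operating systems, a small helper can pick the matching uninstall command and confirm afterwards that a client binary resolves into the parcel directory. This is a sketch under assumptions: the function names are hypothetical, the OS family is passed in by hand, and /opt/cloudera/parcels is assumed to be the parcel directory (adjust PARCEL_ROOT if yours differs).

```shell
# Sketch only: helper names and PARCEL_ROOT default are assumptions.
CDH_PACKAGES="bigtop-jsvc bigtop-utils bigtop-tomcat hue-common sqoop2-client hbase-solr-doc solr-doc"
PARCEL_ROOT="${PARCEL_ROOT:-/opt/cloudera/parcels}"

# Print the uninstall command for an OS family: rhel, sles, ubuntu, or debian.
remove_command() {
  case "$1" in
    rhel)          echo "sudo yum remove $CDH_PACKAGES" ;;
    sles)          echo "sudo zypper remove $CDH_PACKAGES" ;;
    ubuntu|debian) echo "sudo apt-get purge $CDH_PACKAGES" ;;
    *)             echo "unsupported OS family: $1" >&2; return 1 ;;
  esac
}

# Succeed if a resolved path lies under the parcel directory.
is_parcel_path() {
  case "$1" in
    "$PARCEL_ROOT"/*) return 0 ;;
    *)                return 1 ;;
  esac
}
```

After restarting the agent on a host, is_parcel_path "$(readlink -f "$(command -v hadoop)")" should succeed. If it fails, the symlinks still point at the package installation.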

Restart the Reports Manager Role

  1. Do one of the following:
    • Select Clusters > Cloudera Management Service > mgmt.
    • On the Status tab of the Home page, in the Cloudera Management Service table, click the mgmt link.
  2. Click the Instances tab.
  3. Check the checkbox next to reportsmanager.
  4. Select Actions for Selected > Restart and then Restart to confirm.

Finalize the HDFS Metadata Upgrade

After ensuring that the CDH 5 upgrade has succeeded and that everything is running smoothly, finalize the HDFS metadata upgrade. It is not unusual to wait days or even weeks before finalizing the upgrade.
  1. In the Cloudera Manager Admin Console, pull down the Clusters tab and go to the HDFS service.
  2. Go to the Instances tab and click on the NameNode instance.
  3. On the NameNode Status page, select Actions > Finalize Metadata Upgrade.
  4. Click Finalize Metadata Upgrade to confirm you want to complete this process.

    Cloudera Manager finalizes the metadata upgrade.
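
For reference, the HDFS command that corresponds to this action is hdfs dfsadmin -finalizeUpgrade. The sketch below wraps it in a guard because a finalized upgrade cannot be rolled back. It is an illustration only: the function name, the --yes argument, and the DRY_RUN variable are hypothetical, and it assumes you can sudo to the hdfs user. On a cluster managed by Cloudera Manager, the Admin Console steps above remain the documented path.

```shell
# Sketch only: function name, --yes, and DRY_RUN are assumptions.

# Finalize the HDFS metadata upgrade, but only on explicit confirmation.
# Prints the command; executes it only when DRY_RUN=0.
finalize_hdfs_upgrade() {
  if [ "${1:-}" != "--yes" ]; then
    echo "refusing to finalize without --yes; finalizing cannot be undone" >&2
    return 1
  fi
  cmd="sudo -u hdfs hdfs dfsadmin -finalizeUpgrade"
  printf '%s\n' "+ $cmd"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    $cmd
  fi
}
```

Running finalize_hdfs_upgrade --yes with the default DRY_RUN setting prints the command without executing it, which is a convenient way to review it before the real run.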

Page generated September 3, 2015.