Getting Started Upgrading CDH

Before you upgrade a CDH cluster, gather information about your environment, review the limitations and release notes, and run some checks on the cluster. See the Collect Information section below.

The version of CDH you can upgrade to depends on the version of Cloudera Manager that is managing the cluster. You may need to upgrade Cloudera Manager before upgrading CDH.

Minimum Required Role: Cluster Administrator (also provided by Full Administrator)

Collect Information

Collect the following information about your environment. You will need it to choose the correct upgrade procedures in this Upgrade Guide.

  1. Log in to the Cloudera Manager Server host.
    ssh my_cloudera_manager_server_host
  2. Run the following command to find the current version of the Operating System:
    lsb_release -a
  3. Log in to the Cloudera Manager Admin console and find the following:
    1. The version of Cloudera Manager used in your cluster. Go to Support > About.
    2. The version of the JDK deployed in the cluster. Go to Support > About.
    3. Whether High Availability is enabled for HDFS. Go to the HDFS service and click the Actions button. If you see Disable High Availability, the cluster has High Availability enabled.
    4. The Install Method and Current CDH version. The CDH version number and Install Method are displayed on the Cloudera Manager Home page, to the right of the cluster name.
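
The operating system lookup in step 2 can be sketched as a small shell helper for hosts where lsb_release might not be installed. This is a minimal sketch, not part of any Cloudera tool: the function name os_description is hypothetical, and the fallback to /etc/os-release assumes a distribution that ships that file.

```shell
#!/bin/sh
# Sketch: print a one-line OS description for the current host.
# os_description is a hypothetical helper name used only in this example.
# With no argument it prefers lsb_release; with an argument it reads the
# named os-release style file instead.
os_description() {
  release_file="${1:-}"
  if [ -z "$release_file" ] && command -v lsb_release >/dev/null 2>&1; then
    # -d selects the description field, -s prints it without the label.
    lsb_release -d -s
    return
  fi
  # Assumption: the host provides /etc/os-release when lsb_release is absent.
  release_file="${release_file:-/etc/os-release}"
  # Prefix bare file names so that "." does not search PATH for them.
  case "$release_file" in
    */*) ;;
    *) release_file="./$release_file" ;;
  esac
  if [ -r "$release_file" ]; then
    # Source the file in a subshell so its variables do not leak.
    (
      . "$release_file"
      printf '%s\n' "${PRETTY_NAME:-${NAME:-unknown} ${VERSION_ID:-}}"
    )
  else
    printf 'unknown\n'
  fi
}

os_description "$@"
```

Run the script on the Cloudera Manager Server host after logging in with ssh, and record the printed value alongside the Cloudera Manager, JDK, and CDH versions collected from the Admin console.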

Preparing to Upgrade CDH

  1. You must have SSH access to the Cloudera Manager server hosts and be able to log in using the root account or an account that has password-less sudo permission to all the hosts.
  2. Review the CDH 5 and Cloudera Manager 5 Requirements and Supported Versions or the Cloudera Enterprise 6 Requirements and Supported Versions for the new versions you are upgrading to. If your hosts require an operating system upgrade, you must perform that upgrade before upgrading CDH. See Upgrading the Operating System.
  3. Ensure that a supported version of Java is installed on all hosts in the cluster. See Java Requirements. For installation instructions and recommendations, see Upgrading the JDK.
  4. Review the following documents:
  5. Review the upgrade procedure and reserve a maintenance window with enough time to perform all steps. For production clusters, Cloudera recommends reserving a maintenance window of up to a full day, depending on the number of hosts, your experience with Hadoop and Linux, and the particular hardware you are using.
  6. If you are upgrading from CDH 5.1 or lower, and use Hive Date partition columns, you might need to update the date format. See Date Partition Columns.
  7. If the cluster uses Impala, check your SQL against the newest reserved words listed in incompatible changes. If upgrading across multiple versions, or in case of any problems, check against the full list of Impala keywords.
  8. Run the Security Inspector and fix any reported errors.

    Go to Administration > Security > Security Inspector.

  9. Log in to any cluster node as the hdfs user, run the following commands, and correct any reported errors:
    hdfs fsck / -includeSnapshots
    hdfs dfsadmin -report
    See HDFS Commands Guide in the Apache Hadoop documentation.
  10. Log in to any DataNode as the hdfs user, run the following command, and correct any reported errors:
    hbase hbck 
    See Using the HBCK2 Tool to Remediate HBase Clusters.
  11. If your cluster uses HBase, see Migrating Apache HBase Before Upgrading to CDH 6.
  12. If the cluster uses Kudu, log in to any cluster host and run the ksck command as the kudu user (sudo -u kudu). If the cluster is Kerberized, first run kinit as the kudu user, then run the command:
    kudu cluster ksck <master_addresses>

    For the full syntax of this command, see Checking Cluster Health with ksck.

  13. If you have configured Hue to use TLS/SSL and you are upgrading from CDH 5.2 or lower to CDH 5.3 or higher, Hue validates CA certificates and requires a truststore. To create a truststore, follow the instructions in Hue as a TLS/SSL Client.
  14. If you are upgrading to CDH 6.0 or higher, and Hue is deployed in the cluster, and Hue is using PostgreSQL as its database, you must manually install psycopg2. See Installing Dependencies for Hue.
  15. If your cluster uses the Flume Kafka client, and you are upgrading to CDH 5.8.0 or CDH 5.8.1, perform the extra steps described in Upgrading to CDH 5.8.0 or CDH 5.8.1 When Using the Flume Kafka Client and then continue with the procedures in this topic.
  16. If your cluster uses Impala and Llama, note that the Llama role has been deprecated as of CDH 5.9. You must remove the role from the Impala service before starting the upgrade. If you do not remove this role, the upgrade wizard halts the upgrade.
    To determine if Impala uses Llama:
    1. Go to the Impala service.
    2. Select the Instances tab.
    3. Examine the list of roles in the Role Type column. If Llama appears, the Impala service is using Llama.
    To remove the Llama role:
    1. Go to the Impala service and select Actions > Disable YARN and Impala Integrated Resource Management.

      The Disable YARN and Impala Integrated Resource Management wizard displays.

    2. Click Continue.

      The Disable YARN and Impala Integrated Resource Management Command page displays the progress of the commands to disable the role.

    3. When the commands have completed, click Finish.
  17. If your cluster uses Sentry, and you are upgrading from CDH 5.12 or lower, you might need to increase the Java heap memory for Sentry. See Performance Guidelines.
  18. If your cluster uses Sentry and an Oracle database, and you are upgrading from CDH 5.13.0 or higher to CDH 5.16.0 or higher, you must manually add the AUTHZ_PATH.AUTHZ_OBJ_ID index if it does not already exist. Adding the index manually decreases the time Sentry takes to get a full snapshot for HDFS sync. Use the following command to add the index:
    CREATE INDEX "AUTHZ_PATH_FK_IDX" ON "AUTHZ_PATH" ("AUTHZ_OBJ_ID");
  19. The following services are no longer supported as of Enterprise 6.0.0:
    • Accumulo
    • Sqoop 2
    • MapReduce 1
    • Spark 1.6
    • Record Service
    You must stop and delete these services before upgrading CDH. See Stopping a Service on All Hosts and Deleting Services.
  20. Open the Cloudera Manager Admin console and collect the following information about your environment:
    1. The version of Cloudera Manager. Go to Support > About.
    2. The version of the JDK deployed. Go to Support > About.
    3. The version of CDH and whether the cluster was installed using parcels or packages. Both are displayed next to the cluster name on the Home page.
    4. The services enabled in your cluster.

      Go to Clusters > Cluster name.

    5. Whether HDFS High Availability is enabled.

      Go to Clusters, click the HDFS service, and then click the Actions menu. High Availability is enabled if you see a menu item named Disable High Availability.

  21. Back up Cloudera Manager before beginning the upgrade. See Backing Up Cloudera Manager.
  22. Review all CDH 6 pre-upgrade migration steps. There are steps you must perform before beginning the upgrade for the following components: Sentry, Cloudera Search, Apache Spark, HBase, Hue, Key Trustee KMS, HSM KMS.
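
The HDFS and HBase checks in steps 9 and 10 can be run together with a small wrapper that keeps each command's output for review. This is a minimal sketch under stated assumptions: run_check, FAILED, and LOG_DIR are hypothetical names, the script is run as the hdfs user on a cluster host, and a non-zero exit status is treated as a failure. A zero exit status does not guarantee a clean report, so still read the saved logs and correct any reported errors.

```shell
#!/bin/sh
# Sketch: run pre-upgrade health checks and summarize which ones failed.
# run_check, FAILED, and LOG_DIR are hypothetical names for this example.
FAILED=""

# run_check LABEL COMMAND [ARGS...]
# Runs the command, saves its output to a log file, and records the label
# in FAILED when the command exits with a non-zero status.
run_check() {
  label="$1"
  shift
  log="${LOG_DIR:-/tmp}/preupgrade_${label}.log"
  if "$@" >"$log" 2>&1; then
    printf 'OK    %s (output in %s)\n' "$label" "$log"
  else
    printf 'FAIL  %s (see %s)\n' "$label" "$log"
    FAILED="$FAILED $label"
  fi
}

main() {
  # The commands are the ones listed in the steps above. Each group is
  # skipped when its command-line tool is not installed on this host.
  if command -v hdfs >/dev/null 2>&1; then
    run_check hdfs_fsck hdfs fsck / -includeSnapshots
    run_check hdfs_report hdfs dfsadmin -report
  fi
  if command -v hbase >/dev/null 2>&1; then
    run_check hbase_hbck hbase hbck
  fi
  if [ -n "$FAILED" ]; then
    printf 'Checks needing attention:%s\n' "$FAILED"
    return 1
  fi
  return 0
}

main
```

The same wrapper can be reused for the Kudu check in step 12 by running run_check with kudu cluster ksck and your master addresses as the kudu user.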