Getting Started Upgrading a Cluster
Before you upgrade a cluster, you need to gather information, review the limitations and release notes, and run some checks on the cluster. See the Collect Information section below. Fill in the My Environment form below to customize your CDH upgrade procedures.
The version of CDH or Cloudera Runtime that you can upgrade to depends on the version of Cloudera Manager that is managing the cluster. You may need to upgrade Cloudera Manager before upgrading your clusters. Upgrades are not supported when using Cloudera Manager 7.0.3.
Minimum Required Role: Cluster Administrator (also provided by Full Administrator). This feature is not available when using Cloudera Manager to manage Data Hub clusters.
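The dependency between the Cloudera Manager version and the target cluster version can be sketched as a small check. This is only an illustration: the function and constant names are hypothetical, and the comparison rule (Cloudera Manager must be at the same or a higher version than the target) is a simplifying assumption that this page does not state. The only rule taken from the text is that upgrades are not supported with Cloudera Manager 7.0.3. The Requirements and Supported Versions documentation remains the authoritative source.

```python
from typing import Tuple

# From the text above: upgrades are not supported with Cloudera Manager 7.0.3.
UNSUPPORTED_CM_VERSIONS = {(7, 0, 3)}


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse '7.1.3' or '5.16' into a three-part tuple such as (5, 16, 0)."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (3 - len(parts))  # pad short versions with zeros
    return tuple(parts[:3])


def cm_readiness(cm_version: str, target_version: str) -> str:
    """Return 'unsupported', 'upgrade_cm_first', or 'ok'.

    ASSUMPTION (not from this guide): Cloudera Manager must be at the same
    or a higher version than the target CDH or Cloudera Runtime version.
    """
    cm = parse_version(cm_version)
    if cm in UNSUPPORTED_CM_VERSIONS:
        return "unsupported"
    if cm < parse_version(target_version):
        return "upgrade_cm_first"
    return "ok"


if __name__ == "__main__":
    print(cm_readiness("6.3.1", "7.1.3"))
```

A result of `upgrade_cm_first` corresponds to the case described above, where Cloudera Manager has to be upgraded before the cluster.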
Collect the following information about your environment and fill in the form above. This information will be remembered by your browser on all pages in this Upgrade Guide.
- Log in to the Cloudera Manager Server host.
- Run the following command to find the current version of the
- Log in to the Cloudera Manager Admin console and find the following:
- The version of Cloudera Manager used in your cluster. Go to .
- The version of the JDK deployed in the cluster. Go to .
- Whether High Availability is enabled for HDFS. Go to the HDFS service and click the Actions button. If you see Disable High Availability, the cluster has High Availability enabled.
- The Install Method and Current cluster version. The cluster (CDH) version number and Install Method are displayed on the Cloudera Manager Home page, to the right of the cluster name.
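If you prefer to keep the collected values outside the browser, one option is a small record like the sketch below. The class and field names are hypothetical; they simply mirror the items listed above and report which ones have not been filled in yet.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class UpgradeEnvironment:
    """Hypothetical record of the values collected before an upgrade."""
    cloudera_manager_version: Optional[str] = None
    jdk_version: Optional[str] = None
    hdfs_high_availability: Optional[bool] = None
    install_method: Optional[str] = None
    cluster_version: Optional[str] = None

    def missing(self) -> List[str]:
        """Names of the fields that have not been collected yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


if __name__ == "__main__":
    env = UpgradeEnvironment(cloudera_manager_version="6.3.1",
                             hdfs_high_availability=False)
    print(env.missing())
```

Using `None` as the "not collected" marker keeps a recorded `False` for High Availability distinct from a value that was never checked.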
Preparing to Upgrade a Cluster
- You must have SSH access to the Cloudera Manager server hosts and be able to log in using the root account or an account that has password-less sudo permission to all the hosts.
- Review the Requirements and Supported Versions for the new versions you are upgrading to. If your hosts require an operating system upgrade, you must perform the upgrade before upgrading the cluster. See Upgrading the Operating System.
- Ensure that a supported version of Java is installed on all hosts in the cluster. See the links above. For installation instructions and recommendations, see Upgrading the JDK.
- Review the following documents:
- Review the following when upgrading to Cloudera Runtime 7.1 or higher:
- If your deployment has defined a Compute cluster and an associated Data Context, you must delete the Compute cluster and Data Context before upgrading the base cluster, and then recreate them after the upgrade.
- Review the upgrade procedure and reserve a maintenance window with enough time allotted to perform all steps. For production clusters, Cloudera recommends allocating up to a full day maintenance window to perform the upgrade, depending on the number of hosts, the amount of experience you have with Hadoop and Linux, and the particular hardware you are using.
- If you are upgrading from CDH 5.1 or lower, and use Hive Date partition columns, you might need to update the date format. See Date Partition Columns.
- If the cluster uses Impala, check your SQL against the newest reserved words listed in incompatible changes. If upgrading across multiple versions, or in case of any problems, check against the full list of Impala reserved words.
- If the cluster uses Hive, validate the Hive Metastore Schema:
- In the Cloudera Manager Admin Console, go to the Hive service.
- Select .
- Fix any reported errors.
- Select again to ensure that the schema is now valid.
- Run the Security Inspector and fix any reported errors.
- Log in to any cluster node as the hdfs user, run the following commands, and correct any reported errors:
hdfs fsck / -includeSnapshots
See HDFS Commands Guide in the Apache Hadoop documentation.
hdfs dfsadmin -report
- Log in to any DataNode as the hdfs user, run the following command, and correct any reported errors:
- If your cluster uses HBase, see Checking Apache HBase.
- If the cluster uses Kudu, log in to any cluster host and run the ksck command as the kudu user (sudo -u kudu). If the cluster is Kerberized, first authenticate as kudu, then run the command:
kudu cluster ksck <master_addresses>
For the full syntax of this command, see Checking Cluster Health with
- If you have configured Hue to use TLS/SSL and you are upgrading from CDH 5.2 or lower to CDH 5.3 or higher, Hue validates CA certificates and requires a truststore. To create a truststore, follow the instructions in Configuring TLS/SSL for Hue.
- If you are upgrading to CDH 6.0 or higher, and Hue is deployed in the cluster, and Hue is using PostgreSQL as its database, you must manually install psycopg2. See Installing Dependencies for Hue.
- If your cluster uses the Flume Kafka client, and you are upgrading to CDH 5.8.0 or CDH 5.8.1, perform the extra steps described in Upgrading to CDH 5.8.0 or CDH 5.8.1 When Using the Flume Kafka Client and then continue with the procedures in this topic.
- If your cluster uses Impala and Llama, note that the Llama role has been deprecated as of CDH 5.9, and you must remove the role from the Impala service before starting the upgrade. If you do not remove this role, the upgrade wizard will halt the upgrade.
To determine if Impala uses Llama:
- Go to the Impala service.
- Select the Instances tab.
- Examine the list of roles in the Role Type column. If Llama appears, the Impala service is using Llama.
To remove the Llama role:
- Go to the Impala service and select
The Disable YARN and Impala Integrated Resource Management wizard displays.
- Click Continue.
The Disable YARN and Impala Integrated Resource Management Command page displays the progress of the commands to disable the role.
- When the commands have completed, click Finish.
- If your cluster uses Sentry, and you are upgrading from CDH 5.12 or lower, you might need to increase the Java heap memory for Sentry. See Before you install Sentry.
- If your cluster uses Sentry and an Oracle database, and you are upgrading from CDH 5.13.0 or higher to CDH 5.16.0 or higher, or to CDH 6.1.0 or higher, you must manually add the AUTHZ_PATH.AUTHZ_OBJ_ID index if it does not already exist. Adding the index manually decreases the time Sentry takes to get a full snapshot for HDFS sync. Use the following command to add the index:
CREATE INDEX "AUTHZ_PATH_FK_IDX" ON "AUTHZ_PATH" ("AUTHZ_OBJ_ID");
- If your cluster uses the Ozone technical preview, you must stop and delete this service before upgrading the cluster.
- The following services are no longer supported as of CDH 6.0.0:
- Sqoop 2
- MapReduce 1
- Record Service
- Open the Cloudera Manager Admin console and collect the following information about your environment:
- The version of Cloudera Manager. Go to .
- The version of the JDK deployed. Go to .
- The version of CDH and whether the cluster was installed using parcels or packages. It is displayed next to the cluster name on the Home page.
- The services enabled in your cluster.
- Whether HDFS High Availability is enabled.
Go to Clusters, click the HDFS service, and then click the Actions menu. High Availability is enabled if you see the menu item Disable High Availability.
- Back up Cloudera Manager before beginning the upgrade. See Backing Up Cloudera Manager.
- Review all CDH 6 pre-upgrade transition steps or CDP Private Cloud Base pre-upgrade transition steps, whichever applies to your target version. There are steps you must perform before beginning the upgrade for the following components:
- MapReduce 1
- YARN Fair Scheduler is replaced by the Capacity Scheduler
- Cloudera Search
- Sentry is replaced with Apache Ranger
- Cloudera Navigator is replaced by Apache Atlas
- Cloudera Search
- MapReduce 1
- YARN Fair Scheduler
- Sentry Policy Files must be transitioned to the Sentry Service
- Key Trustee KMS
- HSM KMS
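Several of the preparation steps in this topic apply only to particular source and target versions. The sketch below encodes a few of those conditions so they can be checked mechanically. It is an illustration under stated assumptions: the function name and the feature labels (`hive_date_partitions`, `hue_tls`, `hue_postgresql`, `flume_kafka`, `sentry`, `sentry_oracle`) are hypothetical, and the reading of "to CDH 5.16.0 or higher, or to CDH 6.1.0 or higher" as excluding CDH 6.0.x is an interpretation of the text, not a confirmed rule.

```python
from typing import List, Set, Tuple


def v(text: str) -> Tuple[int, ...]:
    """Parse '5.13' or '5.13.0' into a three-part tuple such as (5, 13, 0)."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def conditional_steps(source: str, target: str, uses: Set[str]) -> List[str]:
    """List version-dependent preparation steps for one upgrade path.

    `uses` holds hypothetical feature labels describing the cluster.
    """
    src, dst = v(source), v(target)
    steps: List[str] = []
    if "hive_date_partitions" in uses and src[:2] <= (5, 1):
        steps.append("Check the Hive Date partition column format")
    if "hue_tls" in uses and src[:2] <= (5, 2) and dst[:2] >= (5, 3):
        steps.append("Create a truststore for Hue")
    if "hue_postgresql" in uses and dst[:2] >= (6, 0):
        steps.append("Install psycopg2 for Hue")
    if "flume_kafka" in uses and dst in {(5, 8, 0), (5, 8, 1)}:
        steps.append("Perform the extra Flume Kafka client steps")
    if "sentry" in uses and src[:2] <= (5, 12):
        steps.append("Consider increasing the Sentry Java heap")
    # ASSUMPTION: a CDH 6.0.x target does not need the index step.
    to_516_plus = dst[0] == 5 and dst >= (5, 16, 0)
    to_61_plus = dst >= (6, 1, 0)
    if "sentry_oracle" in uses and src >= (5, 13, 0) and (to_516_plus or to_61_plus):
        steps.append("Add the AUTHZ_PATH.AUTHZ_OBJ_ID index if it is missing")
    return steps


if __name__ == "__main__":
    for step in conditional_steps("5.13.0", "6.1.0",
                                  {"sentry", "sentry_oracle", "hue_postgresql"}):
        print(step)
```

The rules are evaluated in the order in which the corresponding steps appear in this topic, so the output can be read alongside the checklist. Steps that do not depend on versions, such as the HDFS and Kudu health checks, still apply to every upgrade.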