Upgrading the Cluster

The version of CDH or Cloudera Runtime that you can upgrade to depends on the version of Cloudera Manager that is managing the cluster. You may need to upgrade Cloudera Manager before upgrading your clusters. Upgrades are not supported when using Cloudera Manager 7.0.3.

Minimum Required Role: Cluster Administrator (also provided by Full Administrator). This feature is not available when using Cloudera Manager to manage Data Hub clusters.

After you complete the steps to prepare your CDH upgrade and back up CDH components, continue with the following upgrade steps:

Review Notes and Warnings

Note the following before upgrading your clusters:

Back Up Cloudera Manager

Before you upgrade a cluster, back up Cloudera Manager. Even if you backed up Cloudera Manager before upgrading it, back up your newly upgraded Cloudera Manager deployment now. See Backing Up Cloudera Manager.

Enter Maintenance Mode

To avoid unnecessary alerts during the upgrade process, enter maintenance mode on your cluster before you start the upgrade. Entering maintenance mode stops email alerts and SNMP traps from being sent, but does not stop checks and configuration validations. Be sure to exit maintenance mode when you have finished the upgrade to re-enable Cloudera Manager alerts.

On the Home > Status tab, click the actions menu next to the cluster name and select Enter Maintenance Mode.
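
If you prefer to script this step, the Cloudera Manager REST API exposes an equivalent cluster command. The following is a minimal sketch, assuming API version v19, admin credentials, and a cluster named Cluster 1; adjust the host, credentials, API version, and cluster name for your deployment:

  # Hypothetical example: put the cluster into maintenance mode through the CM API.
  # "Cluster 1" is URL-encoded as Cluster%201.
  curl -u admin:admin -X POST \
    "https://cm-host.example.com:7183/api/v19/clusters/Cluster%201/commands/enterMaintenanceMode"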

Complete Pre-Upgrade steps for upgrades to CDP Private Cloud Base

Ensure that you have completed the following steps when upgrading from CDH 5.x to CDP Private Cloud Base 7.1.

  • Cloudera Search – See Transitioning Cloudera Search Configuration Before Upgrading to Cloudera Runtime.
  • Flume – Flume is not supported in CDP Private Cloud Base. You must remove the Flume service before upgrading to CDP Private Cloud Base.
  • HBase – See Checking Apache HBase.
  • Hive – See Migrating Hive 1-2 to Hive 3.
  • Kafka – In CDH 5.x, Kafka was delivered as a separate parcel and could be installed along with CDH 5.x using Cloudera Manager. In Runtime 7.0.3 and later, Kafka is part of the Cloudera Runtime distribution and is deployed as part of the Cloudera Runtime parcels. To upgrade Kafka successfully, you must set the protocol version to match the version currently used by the brokers and clients (see the example snippet after this list).
    1. Explicitly set the Kafka protocol version to match what's being used currently among the brokers and clients. Update server.properties on all brokers as follows:
      1. Log in to the Cloudera Manager Admin Console.
      2. Choose the Kafka service.
      3. Click Configuration.
      4. Use the Search field to find the Kafka Broker Advanced Configuration Snippet (Safety Valve) for kafka.properties configuration property.
      5. Add the following properties to the snippet:
        • inter.broker.protocol.version = current_Kafka_version
        • log.message.format.version = current_Kafka_version
        Replace current_Kafka_version with the version of Apache Kafka currently being used. See the Product Compatibility Matrix for CDK Powered By Apache Kafka to find out which upstream version is used by which version of CDK. Make sure you enter full Apache Kafka version numbers with three values, such as 0.10.0. Otherwise, you will see an error message similar to the following:
        2018-06-14 14:25:47,818 FATAL kafka.Kafka$:
        java.lang.IllegalArgumentException: Version `0.10` is not a valid version
                at kafka.api.ApiVersion$$anonfun$apply$1.apply(ApiVersion.scala:72)
                at kafka.api.ApiVersion$$anonfun$apply$1.apply(ApiVersion.scala:72)
                at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
    2. Save your changes. The information is automatically copied to each broker.
  • MapReduce – See Transitioning from MapReduce 1 to MapReduce 2.
  • Navigator – See Transitioning Navigator content to Atlas.
  • Replication Schedules – See CDH cluster upgrade requirements for Replication Manager.
  • Sentry – The Sentry service has been replaced with Apache Ranger in Cloudera Runtime 7.1. You must perform several steps before upgrading your cluster. See Transitioning the Sentry service to Apache Ranger.
  • Virtual Private Clusters:

    If your deployment has defined a Compute cluster and an associated Data Context, you will need to delete the Compute cluster and Data Context before upgrading the base cluster, and then recreate them after the upgrade.

  • YARN – Decommission and recommission the YARN NodeManagers, but do not start the NodeManagers. Decommissioning is required so that the NodeManagers stop accepting new containers, kill any running containers, and then shut down.
    1. Ensure that new applications, such as MapReduce or Spark applications, will not be submitted to the cluster until the upgrade is complete.
    2. In the Cloudera Manager Admin Console, navigate to the YARN service for the cluster you are upgrading.
    3. On the Instances tab, select all the NodeManager roles. This can be done by filtering for the roles under Role Type.
    4. Click Actions for Selected (number) > Decommission.

      If the cluster runs CDH 5.9 or higher and is managed by Cloudera Manager 5.9 or higher, and you configured graceful decommission, the countdown for the timeout starts.

      A Graceful Decommission provides a timeout before starting the decommission process. The timeout creates a window of time to drain already running workloads from the system and allow them to run to completion. Search for the Node Manager Graceful Decommission Timeout field on the Configuration tab for the YARN service, and set the property to a value greater than 0 to create a timeout.

    5. Wait for the decommissioning to complete. The NodeManager State is Stopped and the Commission State is Decommissioned when decommissioning completes for each NodeManager.
    6. With all the NodeManagers still selected, click Actions for Selected (number) > Recommission.
  • YARN Fair Scheduler – See Fair Scheduler to Capacity Scheduler transition.
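
For the Kafka step above, the safety valve content is simply the two properties with a full three-part version. The following is a minimal sketch, assuming the brokers currently run Apache Kafka 0.10.0; substitute the version actually running on your brokers:

  # Kafka Broker Advanced Configuration Snippet (Safety Valve) for kafka.properties
  # Hypothetical example values; use the full three-part version currently used by your brokers.
  inter.broker.protocol.version=0.10.0
  log.message.format.version=0.10.0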

Complete Pre-Upgrade steps for CDH 5 to CDH 6 upgrades

Ensure that you have completed the following steps when upgrading from CDH 5.x to CDH 6.x.

  • YARN

    Decommission and recommission the YARN NodeManagers but do not start the NodeManagers.

    Decommissioning is required so that the NodeManagers stop accepting new containers, kill any running containers, and then shut down.

    1. Ensure that new applications, such as MapReduce or Spark applications, will not be submitted to the cluster until the upgrade is complete.
    2. In the Cloudera Manager Admin Console, navigate to the YARN service for the cluster you are upgrading.
    3. On the Instances tab, select all the NodeManager roles. This can be done by filtering for the roles under Role Type.
    4. Click Actions for Selected (number) > Decommission.

      If the cluster runs CDH 5.9 or higher and is managed by Cloudera Manager 5.9 or higher, and you configured graceful decommission, the countdown for the timeout starts.

      A Graceful Decommission provides a timeout before starting the decommission process. The timeout creates a window of time to drain already running workloads from the system and allow them to run to completion. Search for the Node Manager Graceful Decommission Timeout field on the Configuration tab for the YARN service, and set the property to a value greater than 0 to create a timeout.

    5. Wait for the decommissioning to complete. The NodeManager State is Stopped and the Commission State is Decommissioned when decommissioning completes for each NodeManager.
    6. With all the NodeManagers still selected, click Actions for Selected (number) > Recommission.
  • Hive

    There are changes to query syntax, DDL syntax, and the Hive API. You might need to edit the HiveQL code in your application workloads before upgrading.

    See Incompatible Changes for Apache Hive/Hive on Spark/HCatalog.

  • Pig

    DataFu is no longer supported. Your Pig scripts will require modification for use with CDH 6.x.

    See Incompatible Changes for Apache Pig.

  • Sentry

    If your cluster uses Sentry policy file authorization, you must transition the policy files to the database-backed Sentry service before you upgrade to CDH 6 or CDP Private Cloud Base 7.1.

    See Transitioning from Sentry Policy Files to the Sentry Service.
  • Cloudera Search

    If your cluster uses Cloudera Search, you must migrate the configuration to Apache Solr 7.

    See Transitioning Cloudera Search Configuration Before Upgrading to CDH 6.

  • Spark

    If your cluster uses Spark or Spark Standalone, there are several steps you must perform to ensure that the correct version is installed.

    See Migrating Apache Spark Before Upgrading to CDH 6.

  • Kafka
    In CDH 5.x, Kafka was delivered as a separate parcel and could be installed along with CDH 5.x using Cloudera Manager. Starting with CDH 6.0, Kafka is part of the CDH distribution and is deployed as part of the CDH 6.x parcel.
    1. Explicitly set the Kafka protocol version to match what's being used currently among the brokers and clients. Update server.properties on all brokers as follows:
      1. Log in to the Cloudera Manager Admin Console.
      2. Choose the Kafka service.
      3. Click Configuration.
      4. Use the Search field to find the Kafka Broker Advanced Configuration Snippet (Safety Valve) for kafka.properties configuration property.
      5. Add the following properties to the snippet:
        • inter.broker.protocol.version = current_Kafka_version
        • log.message.format.version = current_Kafka_version
        Replace current_Kafka_version with the version of Apache Kafka currently being used. See the Product Compatibility Matrix for CDK Powered By Apache Kafka to find out which upstream version is used by which version of CDK. Make sure you enter full Apache Kafka version numbers with three values, such as 0.10.0. Otherwise, you will see an error message similar to the following:
        2018-06-14 14:25:47,818 FATAL kafka.Kafka$:
        java.lang.IllegalArgumentException: Version `0.10` is not a valid version
                at kafka.api.ApiVersion$$anonfun$apply$1.apply(ApiVersion.scala:72)
                at kafka.api.ApiVersion$$anonfun$apply$1.apply(ApiVersion.scala:72)
                at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
    2. Save your changes. The information is automatically copied to each broker.
  • HBase

    See Checking Apache HBase.

  • Hue

    See Installing Dependencies for Hue.

  • Key Trustee KMS

    See Pre-Upgrade Transition Steps for Upgrading Key Trustee KMS to CDH 6.

  • HSM KMS

    See Pre-Upgrade Transition Steps for Upgrading HSM KMS to CDH 6.

Establish Access to the Software

When you upgrade CDH using packages, you can choose to access the Cloudera public repositories directly, or you can download those repositories and set up a local package repository to access them from within your network. If your cluster hosts do not have connectivity to the Internet, you must set up a local repository.

  1. Run the following commands on all cluster hosts to backup the repository directories and remove older files:
    RHEL / CentOS
    sudo cp -rpf /etc/yum.repos.d $HOME/yum.repos.d-`date +%F`-CM-CDH
    sudo rm /etc/yum.repos.d/cloudera*cdh.repo*
    SLES
    sudo cp -rpf /etc/zypp/repos.d $HOME/repos.d-`date +%F`-CM-CDH
    sudo rm /etc/zypp/repos.d/cloudera*cdh.repo*
    Debian / Ubuntu
    sudo cp -rpf /etc/apt/sources.list.d $HOME/sources.list.d-`date +%F`-CM-CDH
    sudo rm /etc/apt/sources.list.d/cloudera*cdh.list*
  2. On all cluster hosts, do one of the following, depending on whether or not you are using a local package repository:
    • Using a local package repository. (Required when cluster hosts do not have access to the internet.)

      1. Configure a local package repository hosted on your network.
      2. In the Package Repository URL field below, replace the entire URL with the URL for your local package repository. A username and password are not required to access local repositories.
      3. Click Apply.
    • Using the Cloudera public repository

      1. In the Package Repository URL field below, substitute your USERNAME and PASSWORD where indicated in the URL.
      2. Click Apply.

      Package Repository URL:

  3. Create the package repository file for the CDH version you are upgrading to and your operating system.

    If you are upgrading to CDH 6.x:

    RHEL / CentOS and SLES

    Create a file named /etc/yum.repos.d/cloudera-cdh.repo (RHEL / CentOS) or /etc/zypp/repos.d/cloudera-cdh.repo (SLES) with the following content:

    [cloudera-cdh]
    # Packages for Cloudera CDH
    name=Cloudera CDH
    baseurl=https://username:password@archive.cloudera.com/p/cdh6/CDH version/operating systemOS version/yum/
    gpgkey=https://username:password@archive.cloudera.com/p/cdh6/CDH version/operating systemOS version/yum/RPM-GPG-KEY-cloudera
    gpgcheck=1

    Debian / Ubuntu

    Debian is not supported for CDH 6.x. For Ubuntu, create a file named /etc/apt/sources.list.d/cloudera-cdh.list with the following content:

    # Packages for Cloudera CDH
    deb https://username:password@archive.cloudera.com/p/cdh6/CDH version/ubuntuOS version/apt/ bionic-cdhCDH version contrib
    deb-src https://username:password@archive.cloudera.com/p/cdh6/CDH version/ubuntuOS version/apt/ bionic-cdhCDH version contrib

    If you are upgrading to CDH 5.x:

    RHEL / CentOS

    Create a file named /etc/yum.repos.d/cloudera-cdh.repo with the following content:

    [cdh]
    # Packages for CDH
    name=CDH
    baseurl=https://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.15
    gpgkey=https://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/RPM-GPG-KEY-cloudera
    gpgcheck=1
    SLES

    Create a file named /etc/zypp/repos.d/cloudera-cdh.repo with the following content:

    [cdh]
    # Packages for CDH
    name=CDH
    baseurl=https://archive.cloudera.com/cdh5/sles/12/x86_64/cdh/5.15
    gpgkey=https://archive.cloudera.com/cdh5/sles/12/x86_64/cm/RPM-GPG-KEY-cloudera
    gpgcheck=1
    Debian / Ubuntu

    Create a file named /etc/apt/sources.list.d/cloudera-cdh.list with the following content:

    # Packages for CDH
    deb https://archive.cloudera.com/cdh5/debian/jessie/amd64/cdh/ jessie-cdh5.15 contrib
    deb-src https://archive.cloudera.com/cdh5/debian/jessie/amd64/cdh/ jessie-cdh5.15 contrib
  4. Copy the file to all cluster hosts.
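
After copying the repository file, you can confirm that each host sees the new repository before proceeding. A minimal check for RHEL / CentOS follows; use the equivalent zypper or apt commands on other operating systems:

  sudo yum clean all
  # The new repository (cloudera-cdh or cdh) should appear in the list:
  sudo yum repolist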

Run Hue Document Cleanup

If your cluster uses Hue, perform the following steps (not required for maintenance releases). These steps clean up the database tables used by Hue and can help improve performance after an upgrade.
  1. Back up your database before starting the cleanup activity.
  2. Check the saved documents such as Queries and Workflows for a few users to prevent data loss.
  3. Connect to the Hue database. See Hue Custom Databases in the Hue component guide for information about connecting to your Hue database.
  4. Check the size of the desktop_document, desktop_document2, oozie_job, beeswax_session, beeswax_savedquery and beeswax_queryhistory tables to have a reference point. If any of these have more than 100k rows, run the cleanup.
    select count(*) from desktop_document;
    select count(*) from desktop_document2;
    select count(*) from beeswax_session;
    select count(*) from beeswax_savedquery;
    select count(*) from beeswax_queryhistory;
    select count(*) from oozie_job;
  5. SSH in to an active Hue instance.
  6. Change to the Hue home directory:
    cd /opt/cloudera/parcels/CDH/lib/hue
  7. Run the following command as the root user:
    DESKTOP_DEBUG=True ./build/env/bin/hue desktop_document_cleanup --keep-days x

    The --keep-days property is used to specify the number of days for which Hue will retain the data in the backend database.

    For example:
    DESKTOP_DEBUG=True ./build/env/bin/hue desktop_document_cleanup --keep-days 90

    In this case, Hue will retain data for the last 90 days.

    The logs are displayed on the console because DESKTOP_DEBUG is set to True. Alternatively, you can view the logs from the following location:

    /var/log/hue/desktop_document_cleanup.log

    The first run can typically take around 1 minute per 1000 entries in each table, but can take much longer depending on the size of the tables.

  8. Check the size of the desktop_document, desktop_document2, oozie_job, beeswax_session, beeswax_savedquery and beeswax_queryhistory tables and confirm they are now smaller.
    select count(*) from desktop_document;
    select count(*) from desktop_document2;
    select count(*) from beeswax_session;
    select count(*) from beeswax_savedquery;
    select count(*) from beeswax_queryhistory;
    select count(*) from oozie_job;
  9. If any of the tables are still above 100k rows in size, run the command again, specifying a smaller number of days this time, for example, 60 or 30.

Check Oracle Database Initialization

If your cluster uses Oracle for any databases, before upgrading from CDH 5, check the value of the COMPATIBLE initialization parameter in the Oracle Database using the following SQL query:
SELECT name, value FROM v$parameter WHERE name = 'compatible'
The default value is 12.2.0. If the parameter has a different value, you can set it to the default as shown in the Oracle Database Upgrade Guide.
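
If the parameter needs to be changed, the usual approach is to set it in the server parameter file and restart the database. The following is a minimal sketch, assuming your database uses an SPFILE and you want the 12.2.0 default; consult the Oracle Database Upgrade Guide and your DBA before changing it:

  -- Hypothetical example: raise COMPATIBLE to 12.2.0 (takes effect after a database restart).
  ALTER SYSTEM SET COMPATIBLE = '12.2.0' SCOPE=SPFILE;
  -- Then restart the database, for example with SHUTDOWN IMMEDIATE followed by STARTUP.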

Stop the Cluster

Stop the cluster before proceeding to upgrade CDH using packages:
  1. Open the Cloudera Manager Admin Console.
  2. Click the drop-down list next to the cluster name and select Stop.

Install CDH Packages

  1. Log in to each host in the cluster using ssh.
  2. Run the following command:
    RHEL / CentOS
    If you are upgrading to CDH 5.x:
    sudo yum clean all
    sudo yum install avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-httpfs hadoop-kms hbase hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie parquet pig pig-udf-datafu search sentry solr solr-mapreduce spark-python sqoop sqoop2 whirr zookeeper
    If you are upgrading to CDH 6.x:
    sudo yum clean all
    sudo yum remove hadoop-0.20\* hue-\* crunch llama mahout sqoop2 whirr sqoop2-client
    sudo yum install avro-tools bigtop-jsvc bigtop-utils flume-ng hadoop-hdfs-fuse hadoop-hdfs-nfs3 hadoop-httpfs hadoop-kms hbase hbase-solr hive-hbase hive-webhcat hue impala impala-shell kafka kite keytrustee-keyprovider kudu oozie parquet parquet-format pig search sentry sentry-hdfs-plugin solr solr-crunch solr-mapreduce spark-core spark-python sqoop zookeeper
    SLES
    If you are upgrading to CDH 5.x:
    sudo zypper clean --all
    sudo zypper install avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-httpfs hadoop-kms hbase hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie parquet pig pig-udf-datafu search sentry solr solr-mapreduce spark-python sqoop sqoop2 whirr zookeeper
    If you are upgrading to CDH 6.x:
    sudo zypper clean --all
    sudo zypper remove hadoop-0.20\* hue-\* crunch llama mahout sqoop2 whirr sqoop2-client
    sudo zypper install avro-tools bigtop-jsvc bigtop-utils flume-ng hadoop-hdfs-fuse hadoop-hdfs-nfs3 hadoop-httpfs hadoop-kms hbase hbase-solr hive-hbase hive-webhcat hue impala impala-shell kafka kite keytrustee-keyprovider kudu oozie parquet parquet-format pig search sentry sentry-hdfs-plugin solr solr-crunch solr-mapreduce spark-core spark-python sqoop zookeeper
    Debian / Ubuntu
    If you are upgrading to CDH 5.x:
    sudo apt-get update
    sudo apt-get install avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-httpfs hadoop-kms hbase hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie parquet pig pig-udf-datafu search sentry solr solr-mapreduce spark-python sqoop sqoop2 whirr zookeeper
    If you are upgrading to CDH 6.x:
    sudo apt-get update
    sudo apt-get remove hadoop-0.20\* crunch llama mahout sqoop2 whirr sqoop2-client
    sudo apt-get update
    sudo apt-get install avro-tools bigtop-jsvc bigtop-utils flume-ng hadoop-hdfs-fuse hadoop-hdfs-nfs3 hadoop-httpfs hadoop-kms hbase hbase-solr hive-hbase hive-webhcat hue impala impala-shell kafka kite keytrustee-keyprovider kudu oozie parquet parquet-format pig search sentry sentry-hdfs-plugin solr solr-crunch solr-mapreduce spark-core spark-python sqoop zookeeper
  3. Restart the Cloudera Manager Agent.
    RHEL 7, SLES 12, Debian 8, Ubuntu 16.04 and higher
    sudo systemctl restart cloudera-scm-agent
    If the agent starts without errors, no response displays.
    RHEL 5 or 6, SLES 11, Debian 6 or 7, Ubuntu 12.04 or 14.04
    sudo service cloudera-scm-agent restart
    You should see the following:
    Starting cloudera-scm-agent: [ OK ]

Access Parcels

Parcels contain the software used in your CDP Private Cloud Base clusters. If Cloudera Manager has access to the public Internet, Cloudera Manager automatically provides access to the latest version of the Cloudera Runtime 7 Parcels directly from the Cloudera download site.

If Cloudera Manager does not have access to the internet, you must download the Parcels and set up a local Parcel repository. See Configuring a Local Parcel Repository. Enter the URL of your repository using the steps below.

If you want to upgrade to a different version of Cloudera Runtime 7, select the cluster version at the top of this page, and follow the steps below to add the following Parcel URL:
https://archive.cloudera.com/p/cdh7/7.1.2.0/parcels/
To add a new Parcel URL:
  1. Log in to the Cloudera Manager Admin Console.
  2. Click Parcels from the left menu.
  3. Click Parcel Repositories & Network Settings.
  4. In the Remote Parcel Repository URLs section, click the "+" icon and add the URL for your Parcel repository.
  5. Click Save & Verify Configuration. A message with the status of the verification appears above the Remote Parcel Repository URLs section. If the URL is not valid, check the URL and enter the correct URL.
  6. After the URL is verified, click Close.
  7. Click the Cloudera Manager logo to return to the home page.
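
If you manage repositories programmatically, the same setting can be changed through the Cloudera Manager REST API, which stores the list in the REMOTE_PARCEL_REPO_URLS configuration property. This is a hedged sketch, assuming API v19 and admin credentials; the value is a comma-separated list, so include any existing URLs you want to keep (the first URL below is a placeholder for your existing entries):

  # Hypothetical example: review and then replace the remote parcel repository list via the CM API.
  curl -u admin:admin "https://cm-host.example.com:7183/api/v19/cm/config?view=full" | grep -A 2 REMOTE_PARCEL_REPO_URLS

  curl -u admin:admin -X PUT -H "Content-Type: application/json" \
    -d '{"items":[{"name":"REMOTE_PARCEL_REPO_URLS","value":"https://existing.repo.example.com/parcels/,https://archive.cloudera.com/p/cdh7/7.1.2.0/parcels/"}]}' \
    "https://cm-host.example.com:7183/api/v19/cm/config"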

Download and Distribute Parcels

  1. Log in to the Cloudera Manager Admin Console.
  2. Click Hosts > Parcels. The Parcels page displays.
  3. Update the Parcel Repository for CDH using the remote parcel repository URL that matches the version you are upgrading to:
    CDH 5.x:
    https://archive.cloudera.com/cdh5/parcels/5.15/
    CDH 6.x (use the URL with username and password if your CDH version requires authentication):
    https://username:password@archive.cloudera.com/p/cdh6/version/parcels/
    https://archive.cloudera.com/p/cdh6/version/parcels/
    1. Click the Configuration button.
    2. In the Remote Parcel Repository URLs section, click the + icon to add the parcel URL above. Click Save Changes. See Parcel Configuration Settings for more information.
    3. Locate the row in the table that contains the new CDH parcel and click the Download button. If the parcel does not appear on the Parcels page, ensure that the Parcel URL you entered is correct.
    4. After the parcel is downloaded, click the Distribute button.
  4. If your cluster has GPLEXTRAS installed, update the version of the GPLEXTRAS parcel to match the CDH version using the remote parcel repository URL that matches the version you are upgrading to:
    CDH 5.x:
    https://archive.cloudera.com/gplextras5/parcels/5.15/
    CDH 6.x:
    https://archive.cloudera.com/p/gplextras6/version/parcels/
    1. Click the Configuration button.
    2. In the Remote Parcel Repository URLs section, click the + icon to add the parcel URL above. Click Save Changes.
    3. Locate the row in the table that contains the new GPLEXTRAS parcel and click the Download button. If the parcel does not appear on the Parcels page, ensure that the Parcel URL you entered is correct.
    4. After the parcel is downloaded, click the Distribute button.
  5. If your cluster has Spark 2.0, Spark 2.1, or Spark 2.2 installed, and you want to upgrade to CDH 5.13 or higher, you must download and install Spark 2.1 release 2, Spark 2.2 release 2, or a higher version.

    To install these versions of Spark, do the following before running the CDH Upgrade Wizard:
    1. Install the Custom Service Descriptor (CSD) file for the version of Spark that you are installing.
    2. Download, distribute, and activate the Parcel for the version of Spark that you are installing. See Parcel Configuration Settings.
  6. If your cluster has Kudu 1.4.0 or lower installed and you want to upgrade to CDH 5.13 or higher, deactivate the existing Kudu parcel. Starting with Kudu 1.5.0 / CDH 5.13, Kudu is part of the CDH parcel and does not need to be installed separately.
  7. After all the parcels are distributed, click the Upgrade button next to the chosen CDH version. The chosen CDH version should be selected automatically.

Configure Streams Messaging Manager

If your cluster uses Streams Messaging Manager, you need to update database related configuration properties and configure the streamsmsgmgr user’s home directory. In addition, if you are using MySQL to store Streams Messaging Manager metadata, you also need to download the JDBC Driver for MySQL (Connector/J) to Streams Messaging Manager hosts.

  1. Stop the Streams Messaging Manager Service:
    1. In Cloudera Manager, select the Streams Messaging Manager service.
    2. Click Actions > Stop.
    3. Click Stop on the next screen to confirm.

      When you see a Finished status, the service has stopped.

    4. Click Close.
  2. Configure database related properties:
    1. In Cloudera Manager, select the Streams Messaging Manager service.
    2. Go to Configuration.
    3. Find and configure the following properties:
      • Streams Messaging Manager Database User Password
      • Streams Messaging Manager Database Type
      • Streams Messaging Manager Database Name
      • Streams Messaging Manager Database User
      • Streams Messaging Manager Database Host
      • Streams Messaging Manager Database Port
    4. Click Save Changes.
  3. Change the streamsmsgmgr user’s home directory:
    1. Log in to the Streams Messaging Manager host.
      ssh [MY_STREAMS_MESSAGING_MANAGER_HOST]
    2. Change the streamsmsgmgr user’s home directory to /var/lib/streams_messaging_manager.
      RHEL-compatible:
      usermod -d /var/lib/streams_messaging_manager -m streamsmsgmgr
  4. Download the JDBC Driver for MySQL (Connector/J) to the Streams Messaging Manager host and make it available in the required locations:
    1. Download the JDBC Driver for MySQL (Connector/J) from the MySQL Product Archives.
      Cloudera recommends that you use version 5.1.46. Examples in the following steps assume that you downloaded version 5.1.46. Make sure that you download or copy the JDBC Driver for MySQL (Connector/J) archive to the host that Streams Messaging Manager is deployed on.
      • If your cluster has internet access, download the archive directly to the host.
        wget https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-java-5.1.46.tar.gz
      • If internet access is not available, download it on a machine that has access and then copy it over to your host.
    2. Extract the archive.
      Use the tar command or any other archive manager to extract the archive.
      tar -xzvf [ARCHIVE_PATH]
      Replace [ARCHIVE_PATH] with the path to the archive you have downloaded. For example, /root/mysql-connector-java-5.1.46.tar.gz.
    3. Copy the mysql-connector-java-5.1.46-bin.jar JAR file from the extracted archive to the parcel directory.
      cp [MYSQL_CONNECTOR_JAR] /opt/cloudera/parcels/CDH-[VERSION_NUMBER]/jars

      Replace [MYSQL_CONNECTOR_JAR] with the path to the connector JAR file. You can find the JAR file within the directory you extracted in the previous step. For example, /root/mysql-connector-java-5.1.46/mysql-connector-java-5.1.46-bin.jar. Replace [VERSION_NUMBER] with the version number of the parcel you are upgrading to.

    4. Create symlinks to make the connector available in the required locations by running the following commands.
      cd /opt/cloudera/parcels/CDH-[VERSION_NUMBER]/lib/streams_messaging_manager/bootstrap/lib
      ln -s ../../../../jars/mysql-connector-java-5.1.46-bin.jar 
      cd /opt/cloudera/parcels/CDH-[VERSION_NUMBER]/lib/streams_messaging_manager/libs
      ln -s ../../../jars/mysql-connector-java-5.1.46-bin.jar 
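
To confirm that the symlinks resolve to the connector JAR, a quick check follows; the paths assume the same parcel layout and connector version as above, and ls -lL follows the links and reports an error if a link is broken:

  ls -lL /opt/cloudera/parcels/CDH-[VERSION_NUMBER]/lib/streams_messaging_manager/bootstrap/lib/mysql-connector-java-5.1.46-bin.jar
  ls -lL /opt/cloudera/parcels/CDH-[VERSION_NUMBER]/lib/streams_messaging_manager/libs/mysql-connector-java-5.1.46-bin.jar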

Configure Schema Registry

If your cluster uses Schema Registry, you need to update database related configuration properties. In addition, if you are using MySQL to store Schema Registry metadata, you also need to download the JDBC Driver for MySQL (Connector/J) to Schema Registry hosts.

  1. Configure database related properties:
    1. In Cloudera Manager, select the Schema Registry service.
    2. Go to Configuration.
    3. Find and configure the following properties:
      • Schema Registry Database User Password
      • Schema Registry Database Type
      • Schema Registry Database Name
      • Schema Registry Database User
      • Schema Registry Database Host
      • Schema Registry Database Port
    4. Click Save Changes.
  2. Download the JDBC Driver for MySQL (Connector/J) to the Schema Registry host and make it available in the required locations:
    1. Log in to the Schema Registry host.
      ssh [MY_SCHEMA_REGISTRY_HOST]
    2. Download the JDBC Driver for MySQL (Connector/J) from the MySQL Product Archives.
      Cloudera recommends that you use version 5.1.46. Examples in the following steps assume that you downloaded version 5.1.46. Make sure that you download or copy the JDBC Driver for MySQL (Connector/J) archive to the host that Schema Registry is deployed on.
      • If your cluster has internet access, download the archive directly to the host.
        wget https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-java-5.1.46.tar.gz
      • If internet access is not available, download it on a machine that has access and then copy it over to your host.
    3. Extract the archive.
      Use the tar command or any other archive manager to extract the archive.
      tar -xzvf [ARCHIVE_PATH]
      Replace [ARCHIVE_PATH] with the path to the archive you have downloaded. For example, /root/mysql-connector-java-5.1.46.tar.gz.
    4. Copy the mysql-connector-java-5.1.46-bin.jar JAR file from the extracted archive to the parcel directory.
      cp [MYSQL_CONNECTOR_JAR] /opt/cloudera/parcels/CDH-[VERSION_NUMBER]/jars

      Replace [MYSQL_CONNECTOR_JAR] with the path to the connector JAR file. You can find the JAR file within the directory you extracted in the previous step. For example /root/mysql-connector-java-5.1.46/mysql-connector-java-5.1.46-bin.jar. Replace [VERSION_NUMBER] with the version number of the parcel you are upgrading to.

    5. Create symlinks to make the connector available in the required locations by running the following commands.
      cd /opt/cloudera/parcels/CDH-[VERSION_NUMBER]/lib/schemaregistry/bootstrap/lib
      ln -s ../../../../jars/mysql-connector-java-5.1.46-bin.jar 
      cd /opt/cloudera/parcels/CDH-[VERSION_NUMBER]/lib/schemaregistry/libs
      ln -s ../../../jars/mysql-connector-java-5.1.46-bin.jar 
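
As with Streams Messaging Manager, you can verify that the Schema Registry symlinks resolve, under the same assumptions about parcel layout and connector version:

  ls -lL /opt/cloudera/parcels/CDH-[VERSION_NUMBER]/lib/schemaregistry/bootstrap/lib/mysql-connector-java-5.1.46-bin.jar
  ls -lL /opt/cloudera/parcels/CDH-[VERSION_NUMBER]/lib/schemaregistry/libs/mysql-connector-java-5.1.46-bin.jar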

Run the Upgrade Cluster Wizard

  1. Log in to the Cloudera Manager Admin Console.
  2. Click the Actions menu and select Upgrade Cluster.

    The Getting Started screen of the Upgrade Wizard displays.

  3. Click the Upgrade to Version: drop-down and select the version of CDH or Cloudera Runtime for your upgrade.

    The wizard now runs several checks to make sure that your cluster is ready for upgrade. You must resolve any reported issues before continuing.

  4. Click the Download and Distribute Parcel button.
    The parcel downloads from the Remote Repository URL you specified. After it is downloaded, Cloudera Manager distributes the parcel to all of the cluster hosts and unpacks it. Depending on network bandwidth, this process may take some time.

    The Install Services section displays any additional services that you need to install to upgrade your cluster.

    If you are upgrading a cluster that has the Hive service, you will be prompted to add the Tez, ZooKeeper, Hive on Tez, and YARN Queue Manager services.
  5. The Sentry service is replaced by Apache Ranger in CDP Private Cloud Base. If the cluster has the Sentry service installed, you can migrate to Apache Ranger.

    The Apache Ranger service depends on the ZooKeeper and Solr services. If your cluster does not already include these services, the upgrade wizard displays buttons for installing them.

    1. Follow the steps for Transitioning the Sentry service to Apache Ranger before continuing.
    2. If the cluster does not already have the ZooKeeper service, click the Add ZooKeeper Service button.

      The Assign Roles page displays with the role assignment for the ZooKeeper service. You can keep the assigned host or assign the role to a different host.

    3. Click Continue.

      The Review Changes screen displays where you can change the default configurations.

    4. Click Continue.

      The upgrade wizard resumes.

    5. If the cluster does not already have the Solr service, click the Add Solr Service button.

      The Assign Roles page displays with the role assignment for the Solr service. You can keep the assigned host or assign the role to a different host.

    6. Click Continue.

      The Review Changes screen displays where you can change the default configurations.

    7. Click Continue.

      The upgrade wizard resumes.

    8. Click the Add Ranger Service button.

      The Assign Roles page displays with the role assignment for the Ranger service.

    9. Assign the following Ranger roles to cluster hosts:
      • Ranger Admin – you must assign this role to the host you specified when you set up the Ranger database.
      • Ranger Usersync
      • Ranger Tagsync
    10. The Ranger Review Changes screen displays. Review the configurations and make any necessary changes. You must provide values for the following:
      • Ranger Admin User Initial Password – choose a password.
      • Ranger Usersync User Initial Password – choose a password.
      • Ranger Tagsync User Initial Password – choose a password.
      • Ranger KMS Keyadmin user initial Password – choose a password.
      • Ranger Database Type – choose MySQL, PostgreSQL, or Oracle.
      • Ranger Database Host – enter the hostname where the Ranger database is running.
      • Ranger Database User Password – enter the password you created for the rangeradmin user when you created the Ranger database and user.
      • Ranger Admin Max Heapsize – set the default value instead of minimum value by clicking the curved blue arrow.
      • Ranger Tagsync Max Heapsize – set the default value instead of minimum value by clicking the curved blue arrow.
      • Ranger Usersync Max Heapsize – set the default value instead of minimum value by clicking the curved blue arrow.
    11. Click Continue.
  6. If your cluster does not have the YARN Queue Manager service installed, a button appears to add the YARN Queue Manager service because it is required for the Capacity Scheduler, which is the supported scheduler.

    For more information about how to migrate from Fair Scheduler to Capacity Scheduler, see Fair Scheduler to Capacity Scheduler transition.

  7. Enable Atlas install.

    If the CDH cluster being upgraded was running Navigator, the upgrade wizard shows a note recommending that you enable Atlas in the new cluster. Check the Install Atlas option.



  8. Install Atlas dependencies.

    The wizard steps through the installation for Atlas' dependencies, assuming these services haven't already been included in the installation:

    • ZooKeeper. Assign one or more hosts for the ZooKeeper role.
    • HDFS. Already included in the installation.
    • Kafka. Select the optional dependency of HDFS. Atlas requires configuring the Broker service only, not MirrorMaker, Connect, or Gateway.
    • HBase. Atlas requires configuring HBase Master and RegionServers only, not REST or Thrift Server. Assign a Master role on at least one host. Assign RegionServers to all hosts.
    • Solr. Assign a host for the Solr Server role. Set the Java Heap Size of Solr Server in Bytes property to 12 GB (to support the migration operation).

    For recommendations on where in the cluster to install the service roles, see Runtime Cluster Hosts and Role Assignments.

  9. Click Add Atlas Service. The wizard steps through choosing a host and setting migration details.
    • Set the host for the Atlas server roles and click Continue.
    • The Atlas Migrate Navigator Data screen displays.
      This screen contains migration commands that are customized to your environment. When you fill in the output file paths, the command text changes to incorporate your settings.
      1. Set migration data-staging locations.

        The migration process creates two data files on the local file system on the host where Atlas is installed. Make sure there is enough disk space to hold these files; see Estimating the time and resources needed for transition.

      2. Copy the extraction command text to an editor.


      3. Copy the transformation command text to an editor.


      4. Confirm the output file location. This is the location where Atlas will look for the content to import. Make sure it matches the location you plan to use for the output of the transformation command.
      5. Click Continue.
    • The Atlas Enable Migration Mode screen displays. Review the Atlas Safety Valve content and click Continue.

      After the migration is complete, you will manually remove these settings to start Atlas in normal operation.

    • The Atlas Review Changes screen displays. Review the configurations and make any necessary changes. You must provide a value for the following:
      • Admin Password – choose a password for the preconfigured admin user.
      • Atlas Max Heapsize – set the max heapsize to the default value by clicking the curved blue arrow. If you plan to migrate content from Cloudera Navigator to Atlas, consider setting the heapsize to 16 GB.


    • Click Continue.

    To complete the Navigator-to-Atlas migration outside of the CDP Runtime upgrade, see Transitioning Navigator data using customized scripts.

  10. The Other Tasks section lists other tasks or reminders to note before continuing. Select the option to confirm that you understand before continuing.
  11. The Inspector Checks section displays several inspectors that you must run before continuing. If these inspectors report errors, you must resolve those errors before continuing.
    • Click the Show Inspector Results button to see details of the inspection.
    • Click the Run Again button to verify that you have resolved the issue.
    • If you are confident that the errors are not critical, select Skip this step. I understand the risks.
    The Inspector Checks section includes the following inspectors:
    • Host Inspector
    • Service Inspector

    Run these inspectors and correct any reported errors before continuing.

  12. The Database Backup section asks you to verify that you have completed the necessary backups. Select Yes, I have performed these steps.
  13. Click Continue. (The Continue button remains greyed out until all upgrade steps are complete and all warnings have been acknowledged.)
  14. Click Continue again to shut down the cluster and begin the upgrade.

    The Upgrade Cluster Command screen opens and displays the progress of the upgrade.

  15. When the Upgrade steps are complete, click Continue.

    The Summary page opens and displays any additional steps you need to complete the upgrade.

  16. Click Continue.
  1. If you are using packages, or did not choose Upgrade from the Parcels page, you can get to the Upgrade CDH page from the Home > Status tab: click the actions menu next to the cluster name and select Upgrade Cluster.
    Select the previously downloaded and distributed CDH version. If no qualifying CDH parcels are listed, or you want to upgrade to a different version of CDH:
    1. Click the Remote Parcel Repository URLs link and add the appropriate parcel URL. See Parcel Configuration Settings for more information.
    2. Click the Cloudera Manager logo to return to the Home page.
    3. From the Home > Status tab, click next to the cluster name and select Upgrade Cluster.

    If you were previously using packages and would like to switch to using parcels, select Use Parcels.

  2. Cloudera Manager 5.14 and lower:
    1. In the Choose CDH Version (Parcels) section, select the CDH version that you want to upgrade to.
    2. Click Continue.

      A page displays the version you are upgrading to and asks you to confirm that you have completed some additional steps.

    3. Click Yes, I have performed these steps.
    4. Click Continue.
    5. Cloudera Manager verifies that the agents are responsive and that the correct software is installed. When you see the No Errors Found message, click Continue.

      The selected parcels are downloaded, distributed, and unpacked.

    6. Click Continue.

      The Host Inspector runs. Examine the output and correct any reported errors.

    Cloudera Manager 5.15 and higher:
    1. In the Upgrade to CDH Version drop-down list, select the version of CDH you want to upgrade to.

      The Upgrade Wizard performs some checks on configurations, health, and compatibility and reports the results. Fix any reported issues before continuing.

    2. Click Run Host Inspector.

      The Host Inspector runs. Click Show Inspector Results to view the Host Inspector report (opens in a new browser tab). Fix any reported issues before continuing.

    3. Click Run Service Inspector. Click Show Inspector Results to view the output of the Service Inspector command (opens in a new browser tab). Fix any reported issues before continuing.
    4. Read the notices for steps you must complete before upgrading, select Yes, I have performed these steps. ... after completing the steps, and click Continue.

      The selected parcels are downloaded, distributed, and unpacked. The Continue button turns blue when this process finishes.

  3. If you have a parcel that works with the existing CDH version, the Upgrade Wizard may display a message that this parcel conflicts with the new CDH version.
    1. Configure and download the newer version of this parcel before proceeding.
      1. Open the Cloudera Manager Admin Console from another browser tab, go to the parcels page, and configure the remote parcel repository for the newer version of this parcel.
      2. Download and distribute the newer version of this parcel.
    2. Click the Run All Checks Again button.
    3. Select the option to resolve the conflicts automatically.
    4. Cloudera Manager deactivates the old version of the parcel, activates the new version and verifies that all hosts have the correct software installed.
  4. Click Continue.

    The Choose Upgrade Procedure screen displays. Select the upgrade procedure from the following options:

    • Rolling Restart

      Cloudera Manager upgrades services and performs a rolling restart. The Rolling Restart dialog box displays the impact of the restart on various services. Services that do not support rolling restart undergo a normal restart, and are not available during the restart process.

      Configure the following parameters for the rolling restart (optional):

      Roles to include

      Select which roles to restart as part of the rolling restart.

      Batch Size

      Number of roles to include in a batch. Cloudera Manager restarts the worker roles rack-by-rack, in alphabetical order, and within each rack, hosts are restarted in alphabetical order. If you use the default replication factor of 3, Hadoop tries to keep the replicas on at least 2 different racks. So if you have multiple racks, you can use a higher batch size than the default 1. However, using a batch size that is too high means that fewer worker roles are active at any time during the upgrade, which can cause temporary performance degradation. If you are using a single rack, restart one worker node at a time to ensure data availability during upgrade.

      Advanced Options > Sleep between batches

      Amount of time Cloudera Manager waits before starting the next batch. Applies only to services with worker roles.

      Advanced Options > Failed threshold

      The number of batch failures that cause the entire rolling restart to fail. For example, if you have a very large cluster, you can use this option to allow some failures when you are sure that the cluster will still be functional while some worker roles are down.

      Click the Rolling Restart button when you are ready to restart the cluster.

    • Full Cluster Restart

      Cloudera Manager performs all service upgrades and restarts the cluster.

    • Manual Upgrade

      Cloudera Manager configures the cluster to the specified CDH version but performs no upgrades or service restarts. Manually upgrading is difficult and for advanced users only. Manual upgrades allow you to selectively stop and restart services to prevent or mitigate downtime for services or clusters where rolling restarts are not available.

      To perform a manual upgrade: See Upgrading CDH or CDP Private Cloud Base Manually after an Upgrade Failure for the required steps.

  5. Click Continue.

    The Upgrade Cluster Command screen displays the result of the commands run by the wizard as it shuts down all services, activates the new parcels, upgrades services, deploys client configuration files, and restarts services, performing a rolling restart of the services that support it.

    If any of the steps fail, correct any reported errors and click the Resume button. Cloudera Manager will skip restarting roles that have already successfully restarted. Alternatively, return to the Home > Status tab and then perform the steps in Upgrading CDH or CDP Private Cloud Base Manually after an Upgrade Failure.

  6. Click Continue.
    If your cluster was previously installed or upgraded using packages, the wizard may indicate that some services cannot start because their parcels are not available. To download the required parcels:
    1. In another browser tab, open the Cloudera Manager Admin Console.
    2. Select Hosts > Parcels.
    3. Locate the row containing the missing parcel and click the button to Download, Distribute, and then Activate the parcel.
    4. Return to the upgrade wizard and click the Resume button.

      The Upgrade Wizard continues upgrading the cluster.

  7. Click Finish to return to the Home page.

Finalize the HDFS Upgrade

Follow the steps in this section if you are upgrading:
  • CDH 5.0 or 5.1 to 5.2 or higher
  • CDH 5.2 or 5.3 to 5.4 or higher

To determine if you can finalize the upgrade, run important workloads and ensure that they are successful. After you have finalized the upgrade, you cannot roll back to a previous version of HDFS without using backups. Verifying that you are ready to finalize the upgrade can take a long time.

Make sure you have enough free disk space, keeping in mind that the following behavior continues until the upgrade is finalized:
  • Deleting files does not free up disk space.
  • Using the balancer causes all moved replicas to be duplicated.
  • All on-disk data representing the NameNode's metadata is retained, which could more than double the amount of space required on the NameNode and JournalNode disks.
If you have enabled high availability for HDFS and have performed a rolling upgrade:
  1. Go to the HDFS service.
  2. Select Actions > Finalize Rolling Upgrade and click Finalize Rolling Upgrade to confirm.

If you have not performed a rolling upgrade:

  1. Go to the HDFS service.
  2. Click the Instances tab.
  3. Click the link for the NameNode instance. If you have enabled high availability for HDFS, click the link labeled NameNode (Active).

    The NameNode instance page displays.

  4. Select Actions > Finalize Metadata Upgrade and click Finalize Metadata Upgrade to confirm.
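
The Cloudera Manager actions above are the supported path, but the underlying HDFS commands can be useful for verification or scripting. A minimal sketch, run as the hdfs user (or with a valid hdfs Kerberos ticket) on a host with the HDFS client configuration:

  # Check whether a rolling upgrade is still pending finalization:
  sudo -u hdfs hdfs dfsadmin -rollingUpgrade query

  # Finalize when you are satisfied with the upgraded cluster:
  sudo -u hdfs hdfs dfsadmin -rollingUpgrade finalize   # after a rolling upgrade
  sudo -u hdfs hdfs dfsadmin -finalizeUpgrade           # after a non-rolling upgrade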

For Sentry with an Oracle Database, Add the AUTHZ_PATH.AUTHZ_OBJ_ID Index

If your cluster uses Sentry and an Oracle database, you must manually add the index on the AUTHZ_PATH.AUTHZ_OBJ_ID column if it does not already exist. Adding the index manually decreases the time Sentry takes to get a full snapshot for HDFS sync. Use the following command to add the index:

CREATE INDEX "AUTHZ_PATH_FK_IDX" ON "AUTHZ_PATH" ("AUTHZ_OBJ_ID");

Complete Post-Upgrade steps for CDH 5 to CDH 6 upgrades

Several components require additional steps after you complete the CDH upgrade:

  • Impala – See Impala Post-Upgrade Changes
  • Cloudera Search

    After upgrading to CDH 6, you must re-index your collections. See Re-indexing Solr collections after upgrading the cluster.

  • Spark – See Apache Spark Post Upgrade Transition Steps.
  • MapReduce 1 to MapReduce 2 – See Transitioning from MapReduce 1 to MapReduce 2
  • Kudu – See Upgrade Notes for Kudu 1.10 / CDH 6.3
  • Kafka
    1. Remove the following properties from the Kafka Broker Advanced Configuration Snippet (Safety Valve) configuration property.
      • inter.broker.protocol.version
      • log.message.format.version
    2. Save your changes.
    3. Restart the cluster:
      1. On the Home > Status tab, click the actions menu to the right of the cluster name and select Restart.
      2. Click Restart on the next screen to confirm. If you have enabled high availability for HDFS, you can choose Rolling Restart instead to minimize cluster downtime. The Command Details window shows the progress of stopping and then starting the services.

        When All services successfully started appears, the task is complete and you can close the Command Details window.

  • HBase – When upgrading to CDH 5.16.1, the HBase/Thrift configuration breaks and must be fixed.
    1. Ensure that you have an HBase Thrift Server instance.
      If you do not have an HBase Thrift Server, do the following:
      1. Select the HBase service and click the Instances tab.
      2. Click the Add Role Instances tab.
      3. Follow the wizard to add an HBase Thrift Server Role Instance.
    2. Select the Hue service and click the Configuration tab.
    3. Search for hbase.
    4. Ensure that HBase Service and HBase Thrift Server are set to a value other than none.
    5. If you use Kerberos, do the following:
      1. Select the HBase service and click the Configuration tab.
      2. Search for hbase thrift authentication.
      3. Set HBase Thrift Authentication to one of the following options:
        • auth-conf: authentication, integrity and confidentiality checking
        • auth-int: authentication and integrity checking
        • auth: authentication only
      4. If you use Impersonation, do the following:
        1. Search for hbase thrift.
        2. Ensure that both Enable HBase Thrift Http Server and Enable HBase Thrift Proxy User are checked.
      5. Verify that HBase allows proxy users:
        1. Navigate to the directory /var/run/cloudera-scm-agent/process/<id>-hbase-HBASETHRIFTSERVER.
        2. Check the core-site.xml file and verify that HBase is authorized to impersonate someone:
          <property>
          <name>hadoop.proxyuser.hbase.hosts</name>
          <value>*</value>
          </property>
            
          <property>
          <name>hadoop.proxyuser.hbase.groups</name>
          <value>*</value>
          </property>
          
    6. Select the Hue service and click the Configuration tab.
    7. Search for hue_safety_valve.ini.
    8. Find Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini and add the following snippet:
      [hbase]
      hbase_conf_dir={{HBASE_CONF_DIR}}
    9. If you are using CDH 5.15.0 or higher, add the following line to the [hbase] section above:
      thrift_transport=buffered
    10. Restart the HBase and Hue services by clicking the Stale Service Restart icon next to each service. This invokes the cluster restart wizard.
  • YARN
    • Considering logical processors in the calculation: The yarn.nodemanager.resource.count-logical-processors-as-cores property was not present in CDH 5. In CDH 6, it is set to false by default, meaning that YARN does not consider logical processors in the calculation, which can result in a 2x performance hit if the Linux Container Executor and CGroups are enabled. To solve this issue, set yarn.nodemanager.resource.count-logical-processors-as-cores=true and restart the NodeManagers (see the example snippet below).
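
One way to apply this setting is through the NodeManager Advanced Configuration Snippet (Safety Valve) for yarn-site.xml in Cloudera Manager. The following is a minimal sketch of the snippet content; the property name comes from the step above, while the safety-valve approach is an assumption about how you manage YARN configuration:

  <!-- Hypothetical safety-valve snippet: count logical processors as cores. -->
  <property>
  <name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
  <value>true</value>
  </property>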

Complete Post-Upgrade steps for upgrades to CDP Private Cloud Base

Several components require additional steps after you complete the upgrade to CDP Private Cloud Base:
  • Apache Hive – See Hive Post-Upgrade Tasks.
  • Kafka
    1. Remove the following properties from the Kafka Broker Advanced Configuration Snippet (Safety Valve) configuration property.
      • inter.broker.protocol.version
      • log.message.format.version
    2. Save your changes.
    3. Perform a rolling restart:
      1. Select the Kafka service.
      2. Click Actions > Rolling Restart.
      3. In the pop-up dialog box, select the options you want and click Rolling Restart.
      4. Click Close once the command has finished.
  • Kudu – See Upgrade Notes for Kudu 1.12 / CDP 7.1.
  • YARN
    • Scheduler: If you are using Fair Scheduler, you must migrate to Capacity Scheduler during the upgrade process, and once the upgrade is finished you need to manually fine-tune it. For more information, see Manual configuration of scheduler properties.
    • Considering logical processors in the calculation: The yarn.nodemanager.resource.count-logical-processors-as-cores property was not present in CDH 5. In Cloudera Runtime 7.1.1, it is set to false by default, meaning that YARN does not consider logical processors in the calculation, which can result in a 2x performance hit if the Linux Container Executor and CGroups are enabled. To solve this issue, set yarn.nodemanager.resource.count-logical-processors-as-cores=true and restart the NodeManagers.
    • NodeManager recovery: By default, the yarn.nodemanager.recovery.enabled property is set to true after the upgrade. If you disabled the NodeManager recovery feature in your source cluster and want to keep that setting, you must manually disable it again in the upgraded cluster. Note that Cloudera recommends keeping this feature enabled.
    • Log aggregation: In order to see the history of applications that were launched before upgrade, do the following:
      1. In Cloudera Manager, navigate to YARN > Configuration > Category: Log aggregation.
      2. Set the following configurations:
        yarn.log-aggregation.TFile.remote-app-log-dir-suffix=logs
        
        yarn.log-aggregation.IFile.remote-app-log-dir-suffix=logs-ifile
    • Maximum capacity: Set yarn.scheduler.capacity.<queuepath>.user-limit-factor to a value greater than 1. This configuration allows queue usage to grow beyond the queue's configured capacity, up to its configured maximum capacity.
    • Ranger Plugins
      The following Ranger plugins are enabled after an upgrade; plugins for any other services configured in the cluster are not enabled by default, and you must enable them manually for those services to use Ranger:
      • Atlas
      • HDFS
      • Hive
      • Hive on Tez
      • Impala
      • Kafka
  • ZooKeeper

    Ensure that QuorumSSL (Secure ZooKeeper) is enabled only if QuorumSASL (server-to-server SASL authentication) is also enabled. Note that QuorumSSL is enabled by default if AutoTLS is enabled. If QuorumSSL is enabled without QuorumSASL, the ZooKeeper cluster can be slow to start due to known ZooKeeper limitations.

  • Solr – See Re-indexing Solr collections after upgrading the cluster.
  • Sentry – See Sentry to Ranger Permissions.
  • Impala – See Apache Impala changes in CDP.

Exit Maintenance Mode

If you entered maintenance mode during this upgrade, exit maintenance mode.

On the Home > Status tab, click the actions menu next to the cluster name and select Exit Maintenance Mode.
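
If you scripted entering maintenance mode with the API sketch earlier, the matching exit call uses the same assumptions about API version, credentials, and cluster name:

  # Hypothetical example: take the cluster out of maintenance mode through the CM API.
  curl -u admin:admin -X POST \
    "https://cm-host.example.com:7183/api/v19/clusters/Cluster%201/commands/exitMaintenanceMode"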