Step 5: Complete Pre-Upgrade steps for upgrades to Cloudera Private Cloud Base
Steps to complete before upgrading CDH to CDP.
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
Ensure that you have completed the following steps when upgrading from CDH 5.x to Cloudera Private Cloud Base 7.1.
- Flume – Flume is not supported in Cloudera Private Cloud Base. You must remove the Flume service before upgrading to Cloudera Private Cloud Base.
- HBase – See Checking Apache HBase.
- Hive – See Migrating Hive 1-2 to Hive 3.
- Kafka – In CDH 5.x, Kafka was delivered as a separate parcel and could be installed along with CDH 5.x using Cloudera Manager. In Runtime 7.0.3 and later, Kafka is part of the Cloudera Runtime distribution and is deployed as part of the Cloudera Runtime parcels. To successfully upgrade Kafka, you need to set the protocol version to match what is currently being used among the brokers and clients. Following a successful upgrade, you will need to reset this configuration change.
- Explicitly set the Kafka protocol version to match what's being used
currently among the brokers and clients. Update kafka.properties
on all brokers as follows:
- Log in to the Cloudera Manager Admin Console.
- Choose the Kafka service.
- Click Configuration.
- Use the Search field to find the Kafka Broker Advanced Configuration Snippet (Safety Valve) for kafka.properties configuration property.
- Add the following properties to the snippet:
inter.broker.protocol.version = [***CURRENT KAFKA VERSION***]
log.message.format.version = [***CURRENT KAFKA VERSION***]
Make sure you enter full Kafka version numbers with three values, such as 0.10.0. Otherwise, the brokers fail to start with an error similar to the following:
2018-06-14 14:25:47,818 FATAL kafka.Kafka$: java.lang.IllegalArgumentException: Version `0.10` is not a valid version at kafka.api.ApiVersion$$anonfun$apply$1.apply(ApiVersion.scala:72) at kafka.api.ApiVersion$$anonfun$apply$1.apply(ApiVersion.scala:72) at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
- Save your changes. The information is automatically copied to each broker.
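For example, a minimal kafka.properties snippet, assuming for illustration that the brokers and clients currently run Kafka 0.10.2 (substitute the version actually deployed in your cluster), would contain:
inter.broker.protocol.version = 0.10.2
log.message.format.version = 0.10.2
Note that the version is written with all three components, as required by the broker.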
- Navigator – See Transitioning Navigator content to Atlas.
- Replication Schedules – See CDH cluster upgrade requirements for Replication Manager.
- Sentry – The Sentry service has been replaced with Apache Ranger in Cloudera Runtime 7.1. You must perform several steps before upgrading your cluster. See Transitioning the Sentry service to Apache Ranger.
- Virtual Private Clusters – If your deployment has defined a Compute cluster and an associated Data Context, you must delete the Compute cluster and Data Context before upgrading the base cluster, and then recreate the Compute cluster and Data Context after the upgrade.
- YARN – Decommission and recommission the YARN NodeManagers, but do not start the NodeManagers. A decommission is required so that the NodeManagers stop accepting new containers, kill any running containers, and then shut down.
- Ensure that new applications, such as MapReduce or Spark applications, will not be submitted to the cluster until the upgrade is complete.
- In the Cloudera Manager Admin Console, navigate to the YARN service for the cluster you are upgrading.
- On the Instances tab, select all the NodeManager roles. This can be done by filtering for the roles under Role Type.
- Click Actions for Selected > Decommission.
If the cluster runs CDH 5.9 or higher and is managed by Cloudera Manager 5.9 or higher, and you configured graceful decommission, the countdown for the timeout starts.
A Graceful Decommission provides a timeout before starting the decommission process. The timeout creates a window of time to drain already running workloads from the system and allow them to run to completion. Search for the Node Manager Graceful Decommission Timeout field on the Configuration tab for the YARN service, and set the property to a value greater than 0 to create a timeout.
- Wait for the decommissioning to complete. The NodeManager State is Stopped and the Commission State is Decommissioned when decommissioning completes for each NodeManager.
- With all the NodeManagers still selected, click Actions for Selected > Recommission.
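If you prefer to double-check the decommission from the command line, a minimal sketch (assuming shell access to a cluster gateway host with the YARN client configured) is to list the NodeManagers and their states with the yarn node command:
yarn node -list -all
yarn node -list -states DECOMMISSIONED
The first command lists NodeManagers in all states; the second lists only those that have reached the DECOMMISSIONED state.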
- HDFS – Review the current JVM heap size for the DataNodes on your cluster and ensure that the heap size is configured at the rate of 1 GB for every million blocks. Use the Java Heap Size of DataNode in Bytes property to configure the value; a worked sizing example follows the chart steps below.
In addition, you can track the JVM heap usage through Cloudera Manager charts, as specified in the following steps:
- Open the Cloudera Manager Admin Console.
- Go to the HDFS service.
- Click the Charts Library tab.
- Select DataNodes from the list on the left.
- Click the Memory tab.
- Look at the chart titled DataNode JVM Heap Used Distribution. The maximum heap usage is the value in the last bucket of that histogram.
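As a worked example of the 1 GB per million blocks guideline (the block count here is illustrative only): a DataNode holding about 3.5 million blocks needs at least 3.5 GB of heap, so rounding up to 4 GB leaves some headroom. The value to enter for the Java Heap Size of DataNode in Bytes property can then be computed in a shell:
# Illustrative sizing: ~3,500,000 blocks -> at least 3.5 GB heap, rounded up to 4 GB
echo $((4 * 1024 * 1024 * 1024))   # 4294967296 bytes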