Step 5: Complete Pre-Upgrade steps for upgrades to Cloudera Base on premises
Steps to complete before upgrading CDH to CDP.
Loading Filters ... 5.16 5.15 5.14 5.13 7.6.7 7.4.4 5.16 5.15 5.14 5.13 7.1.7.2000 7.1.7
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
Ensure that you have completed the following steps when upgrading from CDH 5.x to Cloudera Base on premises 7.1.
- Cloudera Search – See Transitioning Cloudera Search configuration before upgrading to Cloudera Runtime.
- Flume – Flume is not supported in Cloudera Base on premises. You must remove the Flume service before upgrading to Cloudera Base on premises.
- HBase – See Checking Apache HBase.
- Hive – See Migrating Hive 1-2 to Hive 3
- Kafka – In CDH 5.x, Kafka was delivered as a
        separate parcel and could be installed along with CDH 5.x using Cloudera
        Manager. In Runtime 7.0.3 and later, Kafka is part of the Cloudera
        Runtime distribution and is deployed as part of the Cloudera Runtime
        parcels. To successfully upgrade Kafka you need to set the protocol
        version to match what's being used currently among the brokers and
        clients. - Explicitly set the Kafka protocol version to match what's being used
              currently among the brokers and clients. Update kafka.properties
              on all brokers as follows:- Log in to the Cloudera Manager Admin Console
- Choose the Kafka service.
- Click Configuration.
- Use the Search field to find the Kafka Broker Advanced Configuration Snippet (Safety Valve) for kafka.properties configuration property.
- Add the following properties to the snippet:- inter.broker.protocol.version = [***CURRENT KAFKA VERSION***]
- log.message.format.version = [***CURRENT KAFKA VERSION***]
 2018-06-14 14:25:47,818 FATAL kafka.Kafka$: java.lang.IllegalArgumentException: Version `0.10` is not a valid version at kafka.api.ApiVersion$$anonfun$apply$1.apply(ApiVersion.scala:72) at kafka.api.ApiVersion$$anonfun$apply$1.apply(ApiVersion.scala:72) at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
 
- Save your changes. The information is automatically copied to each broker.
 
- Explicitly set the Kafka protocol version to match what's being used
              currently among the brokers and clients. Update kafka.properties
              on all brokers as follows:
- MapReduce – See Transitioning from MapReduce 1 to MapReduce 2.
- Navigator –- See Transitioning Navigator content to Atlas
- Replication Schedules – See CDH cluster upgrade requirements for Replication Manager.
- Sentry The Sentry service has been replace with Apache Ranger in Cloudera Runtime 7.1. You must perform several steps before upgrading your cluster. See Transitioning the Sentry service to Apache Ranger.
- Virtual Private Clusters:If your deployment has defined a Compute cluster and an associated Data Context, you will need to delete the Compute cluster and Data context before upgrading the base cluster and then recreate the Compute cluster and Data context after the upgrade. 
- YARN : Decommission and recommission the YARN NodeManagers but
        do not start the NodeManagers. A decommission is required so that the
        NodeManagers stop accepting new containers, kill any running containers,
        and then shutdown.- Ensure that new applications, such as MapReduce or Spark applications, will not be submitted to the cluster until the upgrade is complete.
- In the Cloudera Manager Admin Console, navigate to the YARN service for the cluster you are upgrading.
- On the Instances tab, select all the NodeManager roles. This can be done by filtering for the roles under Role Type.
- Click .If the cluster runs CDH 5.9 or higher and is managed by Cloudera Manager 5.9 or higher, and you configured graceful decommission, the countdown for the timeout starts. A Graceful Decommission provides a timeout before starting the decommission process. The timeout creates a window of time to drain already running workloads from the system and allow them to run to completion. Search for the Node Manager Graceful Decommission Timeout field on the Configuration tab for the YARN service, and set the property to a value greater than 0 to create a timeout. 
- Wait for the decommissioning to complete. The NodeManager State is Stopped and the Commission State is Decommissioned when decommissioning completes for each NodeManager.
- With all the NodeManagers still selected, click .
 
- HDFS: Review the current JVM heap size for the DataNodes on your cluster and ensure
        that the heap size is configured at the rate of 1 GB for every million blocks. Use the
          Java Heap Size of DataNode in Bytes property to configure the
        value. 
        In addition, you can track the JVM heap usage through Cloudera Manager charts, as specified in the following steps:- Open the Cloudera Manager Admin Console.
- Go to the HDFS service.
- Click the Charts Library tab.
- Select DataNodes from the list on the left.
- Click the Memory tab.
- Look at the chart titled DataNode JVM Heap Used Distribution. The maximum heap usage usage is the value in the last bucket of that histogram.
 
