Step 5: Complete Pre-Upgrade steps for upgrades to Cloudera Private Cloud Base
Steps to complete before upgrading CDH to CDP.
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
Ensure that you have completed the following steps when upgrading from CDH 5.x to Cloudera Private Cloud Base 7.1.
- Flume – Flume is not supported in Cloudera Private Cloud Base. You must remove the Flume service before upgrading to Cloudera Private Cloud Base.
- HBase – See Checking Apache HBase.
- Hive – See Migrating Hive 1-2 to Hive 3.
- Kafka – In CDH 5.x, Kafka was delivered as a separate parcel and could be installed along with CDH 5.x using Cloudera Manager. In Runtime 7.0.3 and later, Kafka is part of the Cloudera Runtime distribution and is deployed as part of the Cloudera Runtime parcels. To successfully upgrade Kafka, you need to set the protocol version to match what is currently being used among the brokers and clients. Following a successful upgrade, you will need to reset this configuration change.
- Explicitly set the Kafka protocol version to match what's being used
currently among the brokers and clients. Update kafka.properties
on all brokers as follows:
- Log in to the Cloudera Manager Admin Console.
- Choose the Kafka service.
- Click Configuration.
- Use the Search field to find the Kafka Broker Advanced Configuration Snippet (Safety Valve) for kafka.properties configuration property.
- Add the following properties to the snippet:
inter.broker.protocol.version = [***CURRENT KAFKA VERSION***]
log.message.format.version = [***CURRENT KAFKA VERSION***]
Make sure you enter full Kafka version numbers with three values, such as 0.10.0. Otherwise, the brokers fail to start with an error similar to the following:
2018-06-14 14:25:47,818 FATAL kafka.Kafka$: java.lang.IllegalArgumentException: Version `0.10` is not a valid version at kafka.api.ApiVersion$$anonfun$apply$1.apply(ApiVersion.scala:72) at kafka.api.ApiVersion$$anonfun$apply$1.apply(ApiVersion.scala:72) at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
- Save your changes. The information is automatically copied to each broker.
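For example, a minimal kafka.properties snippet, assuming for illustration that the brokers and clients currently run Kafka 0.10.2 (substitute the version actually deployed in your cluster), would contain:
inter.broker.protocol.version = 0.10.2
log.message.format.version = 0.10.2
Note that the version is written with all three components, as required by the broker.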
- Navigator – See Transitioning Navigator content to Atlas.
- Replication Schedules – See CDH cluster upgrade requirements for Replication Manager.
- Sentry – The Sentry service has been replaced with Apache Ranger in Cloudera Runtime 7.1. You must perform several steps before upgrading your cluster. See Transitioning the Sentry service to Apache Ranger.
- Virtual Private Clusters – If your deployment has defined a Compute cluster and an associated Data Context, you must delete the Compute cluster and Data Context before upgrading the base cluster, and then recreate the Compute cluster and Data Context after the upgrade.
- YARN – Decommission and recommission the YARN NodeManagers, but do not start the NodeManagers. A decommission is required so that the NodeManagers stop accepting new containers, kill any running containers, and then shut down.
- Ensure that new applications, such as MapReduce or Spark applications, will not be submitted to the cluster until the upgrade is complete.
- In the Cloudera Manager Admin Console, navigate to the YARN service for the cluster you are upgrading.
- On the Instances tab, select all the NodeManager roles. This can be done by filtering for the roles under Role Type.
- Click Actions for Selected > Decommission.
If the cluster runs CDH 5.9 or higher and is managed by Cloudera Manager 5.9 or higher, and you configured graceful decommission, the countdown for the timeout starts.
A Graceful Decommission provides a timeout before starting the decommission process. The timeout creates a window of time to drain already running workloads from the system and allow them to run to completion. Search for the Node Manager Graceful Decommission Timeout field on the Configuration tab for the YARN service, and set the property to a value greater than 0 to create a timeout.
- Wait for the decommissioning to complete. The NodeManager State is Stopped and the Commission State is Decommissioned when decommissioning completes for each NodeManager.
- With all the NodeManagers still selected, click Actions for Selected > Recommission.
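If you prefer to double-check the decommission from the command line, a minimal sketch (assuming shell access to a cluster gateway host with the YARN client configured) is to list the NodeManagers and their states with the yarn node command:
yarn node -list -all
yarn node -list -states DECOMMISSIONED
The first command lists NodeManagers in all states; the second lists only those that have reached the DECOMMISSIONED state.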
- HDFS – Review the current JVM heap size for the DataNodes on your cluster and ensure that the heap size is configured at the rate of 1 GB for every million blocks. Use the Java Heap Size of DataNode in Bytes property to configure the value; a worked sizing example follows the chart steps below.
In addition, you can track the JVM heap usage through Cloudera Manager charts, as specified in the following steps:
- Open the Cloudera Manager Admin Console.
- Go to the HDFS service.
- Click the Charts Library tab.
- Select DataNodes from the list on the left.
- Click the Memory tab.
- Look at the chart titled DataNode JVM Heap Used Distribution. The maximum heap usage is the value in the last bucket of that histogram.
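As a worked example of the 1 GB per million blocks guideline (the block count here is illustrative only): a DataNode holding about 3.5 million blocks needs at least 3.5 GB of heap, so rounding up to 4 GB leaves some headroom. The value to enter for the Java Heap Size of DataNode in Bytes property can then be computed in a shell:
# Illustrative sizing: ~3,500,000 blocks -> at least 3.5 GB heap, rounded up to 4 GB
echo $((4 * 1024 * 1024 * 1024))   # 4294967296 bytes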