Migrating Apache Spark Before Upgrading to CDH 6

If you are upgrading from CDH 5 to CDH 6 and have a Spark service installed, you might need to complete one or more of the following pre-upgrade steps:

Remove Spark (Standalone) Service

If you have a Spark (Standalone) service in your cluster, remove it before starting the CDH upgrade:
  1. Log in to the Cloudera Manager Admin Console.
  2. Click the drop-down arrow next to the Spark (Standalone) service and select Stop.
  3. Click the drop-down arrow next to the Spark (Standalone) service and select Delete.
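
If you prefer to script these steps, the following minimal sketch shows the two requests involved, assuming the Cloudera Manager REST API is reachable and exposes the service stop command and service delete endpoints. The host name, API version, cluster name, and service name are hypothetical placeholders. The sketch only builds the requests and does not send them.

```python
from urllib.parse import quote


def removal_requests(base_url, api_version, cluster, service):
    """Return the (HTTP method, URL) pairs for stopping, then deleting, a service.

    Assumption: the Cloudera Manager REST API paths
    /clusters/{cluster}/services/{service}/commands/stop (POST) and
    /clusters/{cluster}/services/{service} (DELETE) are available
    in the API version you pass in.
    """
    root = (
        f"{base_url.rstrip('/')}/api/{api_version}"
        f"/clusters/{quote(cluster, safe='')}"
        f"/services/{quote(service, safe='')}"
    )
    return [
        ("POST", f"{root}/commands/stop"),  # stop the service first
        ("DELETE", root),                   # then delete it
    ]


# Hypothetical values; replace with those of your deployment.
for method, url in removal_requests(
    "https://cm.example.com:7183", "v19", "Cluster 1", "spark_standalone"
):
    print(method, url)
```

The stop request starts a command that runs in the background, so a real script would need to wait for that command to finish before sending the delete request.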

Set Alternatives Priorities for Built-in CDH 5 Spark (1.6) and CDS 2

If you are using both the built-in Spark 1.6 service in CDH 5 and a CDS 2 parcel, and both services have Gateway roles on the same hosts, increase the alternatives priority of the service that you want to use as the default service after the upgrade.

To set the alternatives priority:
  1. Log in to the Cloudera Manager Admin Console.
  2. Select the Cluster where the Spark services are running.
  3. Select the Spark service that you want to use as the default service after upgrading.
  4. Click the Configuration tab.
  5. Search for the Alternatives Priority property.
  6. Set the value higher than the Alternatives Priority for any other Spark service.
  7. Click Save Changes.
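
The rule in step 6 can be sketched as a small calculation. This is an illustration only: the service names and priority values below are hypothetical, and you should read the actual values from the Alternatives Priority property of each Spark service.

```python
def priority_for_default(priorities, default_service):
    """Return a priority for default_service that is higher than every other service's.

    priorities maps a service name to its current Alternatives Priority.
    The current value is kept if it is already the highest.
    """
    current = priorities[default_service]
    others = [p for name, p in priorities.items() if name != default_service]
    if not others:
        return current
    return max(current, max(others) + 1)


# Hypothetical current values for the two Spark services.
current_priorities = {"spark": 51, "spark2": 52}
print(priority_for_default(current_priorities, "spark"))
```

Any value higher than the other services' priorities works; adding 1 to the highest existing value is simply the smallest change that satisfies the rule.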

Remove CDS 2 Version Higher than the CDH 6 Spark 2 Version

If you are using a CDS 2 minor version higher than the version of Spark 2 included in the CDH 6 release you are upgrading to (2.2 in CDH 6.0.0), you must remove your Spark 2 services from Cloudera Manager. When comparing versions, ignore maintenance versions. For example, if the CDH 6 release you are upgrading to includes Spark 2.2 and you are using any maintenance version of CDS 2.2, you do not need to remove your Spark 2 services from Cloudera Manager. The upgrade automatically converts them to use the built-in CDH 6 Spark version and disables the CDS parcel.
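
The comparison described above can be sketched as follows. This is a minimal illustration that assumes version strings begin with numeric major and minor components separated by dots; the function names and the sample version strings are hypothetical.

```python
from typing import Tuple


def minor_version(version: str) -> Tuple[int, int]:
    """Return (major, minor), ignoring the maintenance version and any suffix."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def must_remove_cds(cds_version: str, cdh6_spark_version: str) -> bool:
    """True if the CDS 2 minor version is higher than the Spark 2 version in CDH 6."""
    return minor_version(cds_version) > minor_version(cdh6_spark_version)


# Hypothetical CDS 2 versions compared against Spark 2.2 in CDH 6.0.0.
for cds in ("2.2.0.cloudera2", "2.3.0.cloudera4"):
    print(cds, must_remove_cds(cds, "2.2"))
```

Only the first two components are compared, so every maintenance version of the same minor release gives the same result.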

Deleting a Spark service in Cloudera Manager does not delete the associated event logs from HDFS. The CDH 6 upgrade wizard installs Spark 2.2.

To remove the Spark 2 service:
  1. Log in to the Cloudera Manager Admin Console.
  2. Select the Cluster where the Spark 2 service is running.
  3. Click the drop-down arrow next to the Spark 2 service and select Stop.
  4. Click the drop-down arrow next to the Spark 2 service and select Delete.
  5. After upgrading to CDH 6, add the Spark service. For instructions, see Adding a Service.