Upgrading Apache Spark 3.3.2 (bundled) on 7.2.17 to Spark 3.4.1 on 7.3.1

The following steps help you upgrade from Apache Spark 3.3.2 (bundled) on Cloudera Public Cloud 7.2.17 to Spark 3.4.1 on 7.3.1.

Source cluster version:             7.2.17
Source cluster Spark 2 version:     none
Source cluster Spark 3 version:     3.3.2 (bundled)
Target cluster version:             7.3.1
Target cluster Spark 3 version:     3.4.1
Spark 2 used with connectors [1]:   no

In-place cluster upgrade

  1. Upgrade the cluster OS from CentOS 7 to Red Hat Enterprise Linux (RHEL) 8.
  2. Upgrade the Data Lake cluster to 7.3.1:
    1. Check the support matrix for Data Hub upgrades.
    2. Stop all Data Hubs attached to the environment.
    3. From the Management Console, click Data Lakes > Environment Name, scroll to the bottom of the Data Lake details page, and click the Upgrade tab.
    4. Click the Target Cloudera Runtime Version drop-down menu to see any available upgrades.
    5. If you want to skip the automatic backup that is taken before the upgrade, uncheck the Automatic backup box.
    6. Click Validate and Prepare to check for any configuration issues and begin the Cloudera Runtime parcel download and distribution.
    7. Click Upgrade to initiate the upgrade.
    8. Click the Event History tab to monitor the upgrade process and verify that it completes successfully.
    For more information, see Data Lake upgrade.
  3. Upgrade the Data Hub cluster to 7.3.1:
    1. Check the support matrix for Data Hub upgrades.
    2. Start the cluster.
    3. Check the current version of Cloudera Runtime.
    4. If your cluster uses Streams Replication Manager, export or migrate aggregated metrics.
    5. If you use autoscaling, disable autoscaling on the cluster.
    6. Upgrade the cluster.
    7. Monitor the upgrade progress using the Data Hub Event History tab.
    8. When the upgrade is complete, verify the new version.
    9. If you disabled autoscaling on the cluster, you can re-enable it after upgrade.
    For more information, see Upgrading Data Hubs.
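
The ordering constraints in the steps above (OS first, then the Data Lake, then the Data Hub with autoscaling disabled) can be captured as a small pre-flight check in a runbook script. This is a minimal sketch only: `ClusterState` and `data_hub_upgrade_blockers` are hypothetical names, and the state is assumed to be filled in by hand or by your own tooling rather than read from any Cloudera API.

```python
from dataclasses import dataclass
from typing import List

TARGET_RUNTIME = "7.3.1"  # target cluster version used throughout this page


@dataclass
class ClusterState:
    """Hypothetical snapshot of the environment before the Data Hub upgrade."""
    os_upgraded: bool          # step 1 (OS upgrade) has been completed
    data_lake_runtime: str     # runtime currently reported for the Data Lake
    data_hub_running: bool     # the Data Hub cluster has been started
    autoscaling_enabled: bool  # autoscaling is still active on the Data Hub


def data_hub_upgrade_blockers(state: ClusterState) -> List[str]:
    """Return the steps that still have to happen before upgrading the Data Hub."""
    blockers = []
    if not state.os_upgraded:
        blockers.append("Upgrade the cluster OS first.")
    if state.data_lake_runtime != TARGET_RUNTIME:
        blockers.append("Upgrade the Data Lake to " + TARGET_RUNTIME + " first.")
    if not state.data_hub_running:
        blockers.append("Start the Data Hub cluster.")
    if state.autoscaling_enabled:
        blockers.append("Disable autoscaling on the Data Hub cluster.")
    return blockers
```

An empty list means none of the modeled preconditions is outstanding. The check does not replace the support matrix or the Validate and Prepare step.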

Application migration tasks (Spark 3.x to 3.4.1)

Follow the Spark application migration documentation to migrate your Apache Spark applications from version 3.3.2 to 3.4.1:
  1. Refactor your Spark application code.
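
While refactoring, it can help to confirm which Spark version an application actually runs against. The following is a minimal sketch, assuming PySpark is available where the application runs. `SparkSession.version` is the standard PySpark attribute; the helper names `major_minor`, `runs_on_target`, and `check_session` are hypothetical. The comparison looks only at the first two components, on the assumption that vendor builds may append extra components to the version string.

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a Spark version string such as '3.4.1'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def runs_on_target(version: str, target: str = "3.4.1") -> bool:
    """True if `version` has the same major and minor release as `target`."""
    return major_minor(version) == major_minor(target)


def check_session() -> bool:
    """Report the version of the active Spark session and compare it to the target."""
    # Imported lazily so the helpers above work without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    print("Running on Spark " + spark.version)
    return runs_on_target(spark.version)
```

Calling `check_session()` at application start makes it obvious in the logs whether a job is still being submitted to the old runtime.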

Final steps

After the upgrade and application migration are complete:
  1. Check the status of your Data Lakes, Data Hubs, and clusters.
  2. Perform benchmark testing on your applications. See Spark Application Migration.
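
Benchmark testing typically means timing the same job on the source and target versions and comparing the results. The following is a minimal sketch in plain Python, assuming each job run can be wrapped in a callable. The names `time_runs` and `regressed` and the 10% tolerance are hypothetical choices, not values from the Cloudera documentation.

```python
import statistics
import time
from typing import Callable, List


def time_runs(job: Callable[[], object], runs: int = 3) -> List[float]:
    """Run `job` several times and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        job()
        durations.append(time.perf_counter() - start)
    return durations


def regressed(baseline: List[float], candidate: List[float], tolerance: float = 0.10) -> bool:
    """True if the candidate median is more than `tolerance` slower than the baseline median."""
    return statistics.median(candidate) > statistics.median(baseline) * (1 + tolerance)
```

For a Spark job, the callable should end in an action (such as a write or `count()`), because transformations alone are evaluated lazily and would not be timed.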

[1] Connectors: Oozie, Solr, Phoenix, Hive Warehouse Connector, Spark Schema Registry