Upgrading Apache Spark 2.4.7 on 7.1.7 SP3 to Spark 3 on 7.3.1
The following steps help you upgrade from Apache Spark 2.4.7 on Cloudera on premises 7.1.7 SP3 to Spark 3.4.1 on 7.3.1.
| Source cluster version | Source cluster Spark 2 version | Source cluster Spark 3 version | Target cluster version | Target cluster Spark 3 version | Spark 2 used with connectors [1] |
|---|---|---|---|---|---|
| 7.1.7 SP3 | 2.4.7 | none | 7.3.1 | 3.4.1 | no |

[1] Connectors: Oozie, Solr, Phoenix, Hive Warehouse Connector, Spark Schema Registry
Pre-application migration tasks
Install the CDS parcel and Spark 3 services, as described in the CDS parcel documentation. A short overview of the process is as follows:
- Check that all the software prerequisites are satisfied.
 - In the Admin Console, add the CDS parcel repository to the Remote Parcel Repository URLs in Parcel Settings.
 - Download the CDS parcel, distribute it to the hosts in your cluster, and activate it.
 - Add the Spark 3 service to your cluster.
 - Return to the Home page.
 - Click the stale configuration icon to launch the Stale Configuration wizard and restart the necessary services.
 
Application migration tasks (Spark 2 to 3)
Follow the Spark application migration documentation to migrate your Apache Spark applications from version 2.4.7 to 3.2.3.
- Check the supported Java versions.
 - Check the supported Scala version.
 - Check the supported Python versions.
 - Account for changed or versioned Spark commands in your code (spark-submit, pyspark, etc.).
 - Check supported versions for Spark connectors.
 - Check the logging library used in your code.
 - Check the compatibility of 3rd-party libraries used in your code.
 - Check Spark behavior changes and refactor your code.
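
As one way to approach the command check in the list above, the following minimal sketch scans the text of a launcher script and reports lines that call an unversioned Spark command. The function name `find_unversioned_commands` and the set of command names (the two examples above plus `spark-shell`) are assumptions for illustration. Which commands actually need to change depends on your cluster, so confirm against the migration documentation. The pattern is meant for shell scripts; in Python source it would also flag `import pyspark`.

```python
import re

# Assumed set of unversioned command names to look for; adjust to your environment.
# The lookarounds stop the pattern from matching inside longer names such as pyspark3.
UNVERSIONED = re.compile(r"(?<![\w-])(spark-submit|spark-shell|pyspark)(?![\w-])")


def find_unversioned_commands(script_text):
    """Return (line_number, command) pairs for each unversioned Spark command found."""
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        for match in UNVERSIONED.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = "spark-submit --master yarn app.py\nspark3-submit --master yarn app.py\n"
    for lineno, command in find_unversioned_commands(sample):
        print(f"line {lineno}: {command}")
```

The sketch only reports candidates and does not rewrite anything, so each hit can be reviewed before the script is changed.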
 
Post-application migration tasks
- Stop the Livy (Livy for Spark 2) and Spark 2 (SPARK_ON_YARN) services.
 - Delete the Spark 2 and Livy for Spark 2 services.
 - Move Spark 2 event logs to the Spark 3 event logs directory.
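
The event log move in the last step could be scripted roughly as follows. This is a sketch under stated assumptions: both HDFS paths are assumed defaults and must be checked against the `spark.eventLog.dir` setting of each service, and the helper names `build_move_command` and `move_event_logs` are hypothetical. The script defaults to a dry run that only prints the `hdfs dfs -mv` command.

```python
import subprocess

# Assumed directories; verify spark.eventLog.dir for the Spark 2 and Spark 3
# services on your cluster before running anything.
SPARK2_EVENT_LOG_DIR = "/user/spark/applicationHistory"
SPARK3_EVENT_LOG_DIR = "/user/spark/spark3ApplicationHistory"


def build_move_command(src_dir, dst_dir):
    """Build the `hdfs dfs -mv` argument list that moves every entry in src_dir to dst_dir."""
    # The glob is passed literally; the Hadoop FS shell expands it itself.
    return ["hdfs", "dfs", "-mv", src_dir.rstrip("/") + "/*", dst_dir]


def move_event_logs(src_dir=SPARK2_EVENT_LOG_DIR, dst_dir=SPARK3_EVENT_LOG_DIR, dry_run=True):
    """Print the move command (dry run) or execute it, and return the argument list."""
    cmd = build_move_command(src_dir, dst_dir)
    if dry_run:
        print(" ".join(cmd))
        return cmd
    subprocess.run(cmd, check=True)  # raises CalledProcessError if the move fails
    return cmd


if __name__ == "__main__":
    move_event_logs(dry_run=True)
```

Running the real move requires a user with write access to both directories, for example the `spark` service user.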
 
In-place cluster upgrade
Application migration tasks (Spark 3.x to 3.4.1)
Follow the Spark application migration documentation to migrate your Apache Spark applications from version 3.2.3 to 3.4.1.
- Refactor your Spark application code.
 
Final steps
After the upgrade and application migration are complete:
- Check the status of your clusters.
 - Perform benchmark testing on your applications. See Spark Application Migration.
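
For the benchmark step, a simple starting point is to time the same job several times and compare the summary with the numbers recorded before the upgrade. The sketch below is generic and makes no claim about how Spark Application Migration recommends benchmarking. The helper names `time_command` and `summarize` are hypothetical, and the command passed in is whatever launches your application.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)  # stop immediately if a run fails
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to run count, median, minimum, and maximum."""
    return {
        "runs": len(durations),
        "median_s": statistics.median(durations),
        "min_s": min(durations),
        "max_s": max(durations),
    }


if __name__ == "__main__":
    # Placeholder command; replace with the command that submits your application.
    print(summarize(time_command([sys.executable, "-c", "pass"], runs=3)))
```

Wall-clock time includes scheduling and queueing delays on the cluster, so several runs and the median give a steadier comparison than a single measurement.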
 
