Upgrading Apache Spark 2.4.8 (with CDS 3.3.2) on 7.1.9 SP1 to Spark 3 on 7.3.1
The following steps help you upgrade from Apache Spark 2.4.8 on Cloudera Private Cloud 7.1.9 SP1 to Spark 3.4.1 on 7.3.1.
| Source cluster version | Source cluster Spark 2 version | Source cluster Spark 3 version | Target cluster version | Target cluster Spark 3 version | Spark 2 used with connectors [1] |
|---|---|---|---|---|---|
| 7.1.9 SP1 | 2.4.8 | 3.3.2 (CDS) | 7.3.1 | 3.4.1 | no |

[1] Oozie, Solr, Phoenix, Hive Warehouse Connector, Spark Schema Registry
Application migration tasks from Spark 2 to Spark 3
Follow the Spark application migration documentation to migrate your Apache Spark applications from version 2.4.8 to 3.3.2.
        
- Check the supported Java versions.
 - Check the supported Scala version.
 - Check the supported Python versions.
 - Account for changed or versioned Spark commands in your code (spark-submit, pyspark, etc.).
 - Check supported versions for Spark connectors.
 - Check the logging library used in your code.
 - Check the compatibility of 3rd-party libraries used in your code.
 - Check Spark behavior changes and refactor your code.
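
The command check in the list above can be partly automated. The following is a minimal sketch, not an official tool: it scans the text of a launcher script for Spark command names so that each call can be reviewed by hand. The function name `find_spark_commands` is hypothetical, and the "3"-suffixed command names (`spark3-submit`, `spark3-shell`, `pyspark3`) are an assumption about how a parallel Spark 3 installation exposes its commands; adjust the pattern to the names used on your cluster.

```python
import re

# Matches unversioned and "3"-suffixed Spark command names.
# The suffixed forms are an assumption; verify them on your cluster.
# The lookarounds skip longer identifiers such as "my-spark-submit"
# and module paths such as "pyspark.sql".
SPARK_COMMAND = re.compile(
    r"(?<![\w-])(spark3?-submit|spark3?-shell|pyspark3?)(?![\w.-])"
)


def find_spark_commands(script_text):
    """Return (line_number, command) pairs for each Spark command found."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        for match in SPARK_COMMAND.finditer(line):
            hits.append((number, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = "#!/bin/bash\nspark-submit --master yarn app.py\n"
    for number, command in find_spark_commands(sample):
        print(f"line {number}: {command}")
```

The scan only locates candidate lines. Whether a given call needs to change depends on the command names available on the source and target clusters, so treat every hit as something to review rather than as an error.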
 
Post-application migration tasks
- Stop the Livy (Livy for Spark 2) and Spark 2 (SPARK_ON_YARN) services.
 - Delete the Spark 2 and Livy for Spark 2 services.
 - Move Spark 2 event logs to the Spark 3 event logs directory.
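
The event log move in the last step above could be scripted along the following lines. This is a sketch under stated assumptions: it wraps the standard `hdfs dfs -mv` command, and the two directory paths in the usage example are placeholders, not values taken from this guide. Check the `spark.eventLog.dir` setting of each service for the real locations. The helper names `build_move_command` and `move_event_logs` are hypothetical.

```python
import subprocess


def build_move_command(source_dir, target_dir):
    """Build an `hdfs dfs -mv` call that moves every entry in source_dir."""
    # The HDFS shell expands the trailing glob itself, so no local shell is needed.
    return [
        "hdfs", "dfs", "-mv",
        source_dir.rstrip("/") + "/*",
        target_dir.rstrip("/") + "/",
    ]


def move_event_logs(source_dir, target_dir, dry_run=True):
    """Print the command when dry_run is True, otherwise execute it."""
    command = build_move_command(source_dir, target_dir)
    if dry_run:
        print(" ".join(command))
    else:
        # check=True raises CalledProcessError if the move fails.
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # Placeholder paths (assumptions); replace with your event log directories.
    move_event_logs(
        "/user/spark/applicationHistory",
        "/user/spark/spark3ApplicationHistory",
        dry_run=True,
    )
```

Defaulting to a dry run lets you inspect the exact command before anything is moved. Run the real move as a user with write access to both directories.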
 
In-place cluster upgrade
Spark application migration (from Spark 3.x to Spark 3.4.1)
Follow the Spark application migration documentation to migrate your Apache Spark applications from version 3.3.2 to 3.4.1.
        
- Refactor your Spark application code.
 
Final steps
After the upgrade and application migration are complete:
- Check the status of your clusters.
 - Perform benchmark testing on your applications. See Spark Application Migration.
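
For the benchmark step above, one simple approach is to time the same job several times before and after the upgrade and compare the median durations. The sketch below assumes you can wrap a job run in a Python callable; `time_runs` and `relative_change` are hypothetical helper names, and the sample numbers in the usage example are illustrative, not measurements.

```python
import statistics
import time


def time_runs(job, runs=5):
    """Run `job` `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        job()
        durations.append(time.perf_counter() - start)
    return durations


def relative_change(baseline, candidate):
    """Fractional change in median duration; a positive value means slower."""
    base = statistics.median(baseline)
    return (statistics.median(candidate) - base) / base


if __name__ == "__main__":
    # Illustrative durations in seconds (not real measurements).
    before_upgrade = [10.0, 12.0, 11.0]
    after_upgrade = [9.0, 9.5, 10.0]
    print(f"median change: {relative_change(before_upgrade, after_upgrade):+.1%}")
```

The median is used instead of the mean so that a single slow run, for example one delayed by cluster contention, does not dominate the comparison.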
 
