Updating Spark 3.3 applications for Spark 3.5

You may need to update your Apache Spark 3.3 applications to run on Spark 3.5. The Apache Spark documentation provides a migration guide.

A comprehensive guide to the behavior changes between versions is available in the Apache Spark Migration Guides.

Always refer to these migration guides to ensure that your Spark applications remain compatible with newer Spark versions.
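
Before relying on version-specific behavior, a runtime guard can confirm which Spark version an application is actually running against. The sketch below is a minimal, hypothetical helper in Python; the function names and the 3.5.4 minimum are illustrative assumptions, not part of any CDS API.

```python
# Hypothetical helper: fail fast if the runtime Spark version is older than
# the version the application was migrated to target.

def version_tuple(version: str) -> tuple:
    """Convert a version string such as '3.5.4' into a comparable tuple (3, 5, 4)."""
    return tuple(int(part) for part in version.split(".")[:3])

def check_min_version(current: str, minimum: str = "3.5.4") -> bool:
    """Return True if the running Spark version meets the migration target."""
    return version_tuple(current) >= version_tuple(minimum)

# In a real job you would pass spark.version from an active SparkSession:
# if not check_min_version(spark.version):
#     raise RuntimeError(f"Expected Spark >= 3.5.4, got {spark.version}")
```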

Migrating from CDS 3.3

To migrate your Spark applications from CDS 3.3 to CDS 3.5, you need to install (download and distribute) the CDS 3.5 parcel in your cluster. Refer to Installing CDS 3.5 for more information.

  1. Stop your existing Spark 3 jobs.
  2. Download and activate the CDS 3.5 parcel.
  3. Reconfigure your applications for Spark 3.5.4.
    Refer to the Apache Spark guides for detailed information on changes between Spark 3.3.2 and 3.5.4.
    1. If your applications are written in Python, you likely only need to check how Apache Spark changes impact your use case.
    2. If your applications are written in Java or Scala, check how Apache Spark changes impact your use case, then refactor your code, updating all version references to use CDS 3.5.4 dependencies.
    3. Recompile your Java and Scala applications to use the new Spark version.
  4. Start your migrated Spark 3 jobs, and test performance and functionality.
  5. After confirming your applications are working, you can delete the CDS 3.3 parcel from your cluster.
    See Managing parcels for more information.
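
For Java and Scala applications, the version-reference change in step 3 typically happens in the build definition. The fragment below is a hypothetical sbt sketch, not an authoritative configuration: the exact CDS artifact version string and repository setup vary by release, so check the CDS 3.5 documentation for the values your parcel actually ships.

```scala
// build.sbt sketch: bump the Spark dependency version for the migration.
val sparkVersion = "3.5.4" // was "3.3.2" before the migration

libraryDependencies ++= Seq(
  // "provided" because the cluster's CDS parcel supplies Spark at runtime
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided"
)
```

After updating the version, recompile and repackage the application (step 3.3) before restarting your jobs.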