Running applications with CDS 3.1 for GPUs

With CDS 3.1 for GPUs, you can run Apache Spark 3 applications locally or distributed across a cluster, either by using an interactive shell or by submitting an application. Running Spark applications interactively is commonly performed during the data-exploration phase and for ad hoc analysis.

The Spark 3 job commands

With Spark 3, you use slightly different command names than with Spark 2, so that you can run both versions of Spark side-by-side without conflicts:

  • spark3-submit instead of spark-submit.
  • spark3-shell instead of spark-shell.
  • pyspark3 instead of pyspark.
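For example, a batch application can be submitted with spark3-submit. The sketch below runs the SparkPi example that ships with Spark; the jar path shown is an assumption based on a typical parcel layout, so adjust it to match your installation.

```shell
# Submit the bundled SparkPi example on YARN with the Spark 3 command name.
# The examples jar path below is a typical parcel location (an assumption);
# verify the actual path on your cluster before running.
spark3-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  /opt/cloudera/parcels/SPARK3/lib/spark3/examples/jars/spark-examples_*.jar \
  100
```

Because the Spark 2 command names remain untouched, the same host can still run `spark-submit` for Spark 2 applications alongside this.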

For development and test purposes, you can also configure an alias on each host so that invoking the Spark 2 command name runs the corresponding Spark 3 executable.
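One way to set this up is with shell aliases in a profile script, as sketched below. The profile path is an example; any file sourced by your users' shells works. Keep in mind this is intended for development and test hosts only.

```shell
# Development/test convenience only: make the Spark 2 command names
# invoke the Spark 3 executables. Place in a shell profile, for example
# /etc/profile.d/spark3-aliases.sh (path is an assumption).
alias spark-submit=spark3-submit
alias spark-shell=spark3-shell
alias pyspark=pyspark3
```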

Accessing the Spark 3 History Server

The Spark 3 history server is available on port 18089, rather than port 18088 as with the Spark 2 history server.
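As a quick check that the Spark 3 history server is reachable on its dedicated port, you can query its REST API, which Spark exposes under /api/v1. The hostname below is a placeholder for your actual history server host.

```shell
# List completed applications known to the Spark 3 history server.
# Replace historyserver.example.com with your history server host.
curl http://historyserver.example.com:18089/api/v1/applications
```

The same endpoint on port 18088 would reach the Spark 2 history server instead, so checking both ports confirms the two services are running side by side.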