Submitting a Spark job using the CLI

The following example demonstrates how to submit a JAR or Python file to run as a Spark job in Cloudera Data Engineering (CDE) using the command line interface (CLI).

The cde spark submit command is a quick and efficient way to test a Spark job: it spares you from creating and uploading resources and job definitions before running the job, and from cleaning them up afterwards.

This command is recommended only for JAR or Python files that need to be run just once, because the file is removed from Spark at the end of the run. To manage jobs that need to be run more than once, or that run on a schedule, use cde job run instead of this command.
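For comparison, a reusable job is typically set up by creating a resource, uploading the application file to it, defining the job, and then running it. The following sequence is an illustrative sketch only; the resource and job names (my-resource, my-job) are placeholders, and the exact flags may differ depending on your CDE CLI version:

cde resource create --name my-resource
cde resource upload --name my-resource --local-path my-spark-app-0.1.0.jar
cde job create --name my-job --type spark --mount-1-resource my-resource --application-file my-spark-app-0.1.0.jar
cde job run --name my-job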

To submit a JAR or Python file to run on CDE Spark, use the CLI command:
cde spark submit <JAR/Python file> [args...] [Spark flags...] [--job-name <job name>] [--hide-logs]

To see the full command syntax and supported options, run cde spark submit --help.

For example:

To submit a job with a local JAR file:
cde spark submit my-spark-app-0.1.0.jar 100 1000 --class com.company.app.spark.Main
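To submit a job with a local Python file, the invocation follows the same syntax. This is a sketch only; my_script.py and its two arguments are placeholders, and the --job-name option shown in the syntax above is used to give the run a recognizable name:

cde spark submit my_script.py 100 1000 --job-name my-python-run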

The CLI displays the job run ID followed by the driver logs, unless you specify the --hide-logs option. The command returns an exit code of 0 on success or 1 on failure.
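Because of this exit code behavior, the command can be used directly in a shell conditional, for example when scripting a smoke test. A minimal sketch, reusing the JAR and class from the example above:

if cde spark submit my-spark-app-0.1.0.jar 100 1000 --class com.company.app.spark.Main; then
  echo "Spark job succeeded"
else
  echo "Spark job failed"
fi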