Submitting a Spark job using the CLI
The following example demonstrates how to submit a JAR or Python file to run on CDE Spark in Cloudera Data Engineering (CDE) using the command line interface (CLI).
Using the cde spark submit command is a quick and efficient way of testing a Spark job, as it spares you the task of creating and uploading resources and job definitions before running the job, and of cleaning up afterward. This command is recommended only for JAR or Python files that need to be run just once, because the file is removed from Spark at the end of the run. To manage jobs that need to be run more than once, or that have schedules, use cde job run instead of this command.
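For jobs you plan to rerun, the resource-based workflow mentioned above involves creating a resource, uploading the file, and defining a job. A hedged sketch follows; the resource name, job name, and file path are placeholders, and each command is echoed so the sketch is safe to run as-is (drop echo to execute against a configured CDE CLI):

```shell
# Hedged sketch of the resource-based workflow for a reusable CDE job.
# Resource/job names and the file path are placeholders. Each command is
# echoed so the sketch is safe to run as-is; drop 'echo' to execute.
echo cde resource create --name my-resource
echo cde resource upload --name my-resource --local-path my-spark-app-0.1.0.jar
echo cde job create --name my-job --type spark \
  --application-file my-spark-app-0.1.0.jar --mount-1-resource my-resource
echo cde job run --name my-job
```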
cde spark submit <JAR/Python file> [args...] [Spark flags...] [--job-name <job name>] [--hide-logs]
You can use [--job-name <job name>] to specify the same CDE job name for consecutive cde spark submit commands. To see the full command syntax and supported options, run cde spark submit --help.
For example:
To submit a job with a local JAR file:
cde spark submit my-spark-app-0.1.0.jar 100 1000 --class com.company.app.spark.Main
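Submitting a local Python file takes the same shape; here is a hypothetical example (the file name, arguments, Spark flags, and job name are placeholders, not from the original). The command is built as a string so it can be inspected before running:

```shell
# Hypothetical PySpark submission; the file name, arguments, and settings
# are placeholders. Standard Spark flags such as --executor-memory are
# passed through to Spark, and --job-name tags the run with a reusable
# CDE job name. Built as a string for inspection; run it with: eval "$cmd"
cmd="cde spark submit my_etl_job.py input.csv output.csv \
  --executor-memory 4g --job-name nightly-etl"
echo "$cmd"
```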
The CLI displays the job run ID followed by the driver logs, unless you specified the --hide-logs option. The command returns an exit code of 0 for success or 1 for failure.
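Because the exit code distinguishes success from failure, you can gate follow-up steps on it in a wrapper script. A minimal sketch, with a hypothetical helper name (run_and_check is not part of the CDE CLI); in practice you would pass it the cde spark submit command shown above:

```shell
#!/bin/sh
# run_and_check runs the given command and branches on its exit code,
# mirroring the 0-on-success / 1-on-failure contract described above.
# The helper name is hypothetical; the submit command below is commented
# out so the sketch stays runnable without a CDE cluster.
run_and_check() {
  if "$@"; then
    echo "job run succeeded"
  else
    echo "job run failed" >&2
    return 1
  fi
}

# Example (requires a configured CDE CLI):
# run_and_check cde spark submit my-spark-app-0.1.0.jar \
#   --class com.company.app.spark.Main
```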