Creating and updating Apache Spark jobs using the CLI

The following example demonstrates how to create a Spark application in Cloudera Data Engineering using the command line interface (CLI).

Make sure that you have downloaded the CLI client. For more information, see Using the Cloudera Data Engineering command line interface. While creating a job if you want to use the [--data-connector] flag, you must obtain the name of the data connector from the Cloudera Data Engineering UI by navigating to Administration > click Service Details icon of the Cloudera Data Engineering Service > Data Connectors tab.

Run the cde job create command as follows:
```
cde job create --application-file <path_to_application_jar> --mount-1-resource <your_CDE_resource> --class <application_class> [--default-variable name=value] [--data-connector name] --name <job_name> --num-executors <num_executors> --type spark
```
To see the full command syntax and supported options, run cde job create --help.
The application file can either be a local JAR file inside the Spark container or can be mounted from a Cloudera Data Engineering Resource. If it is from a Cloudera Data Engineering Resource, the --application-file points to the file from within the Cloudera Data Engineering Resource. If you want to configure the job to pick up the application file from a Cloudera Data Engineering Repository, provide the name of the repository with the --mount-1-resource flag and provide the --application-file path as you would do with a Cloudera Data Engineering Resource.
With [--default-variable] flags you can replace strings in job values. Currently the supported fields are:
- Spark application name
- Spark arguments
- Spark configurations
For a variable flag name=value any substring {{{name}}} in the value of the supported field gets replaced with value. These can be overriden by the [--variable] flag during the job run.
Using the [--data-connector] flag, you can specify the name of the data connector. Currently, only the Ozone type data connector is supported and it must be created before the job run.
Run cde job describe to verify that the job was created:
```
cde job describe --name <job_name>
```
If you want to update the job configuration, use the cde job update command.
For example, to change the number of executors:
```
cde job update --name test_job --num-executors 15
```
To see the full command syntax and supported options, run cde job update --help.
To verify the updated configuration, run cde job describe again:
```
cde job describe --name <job_name>
```

Creating and updating Apache Spark jobs using the CLI

We want your opinion

How can we improve this page?