Creating jobs in Cloudera Data Engineering

A job in Cloudera Data Engineering (CDE) consists of defined configurations and resources (including application code). Jobs can be run on demand or scheduled.

Jobs are associated with virtual clusters. Before you can create a job, you must create a virtual cluster that can run it. For more information, see Creating virtual clusters.

  1. Navigate to the Cloudera Data Engineering Overview page by clicking the Data Engineering tile in the Cloudera Data Platform (CDP) management console.
  2. In the Environments column, select the environment containing the virtual cluster where you want to create the job.
  3. In the Virtual Clusters column on the right, click the View Jobs icon on the virtual cluster where you want to create the job.
  4. In the left-hand menu, click Jobs.
  5. Click the Create Job button.
  6. Provide the Job Details:
    1. Specify the Name.
    2. Click Choose file to upload your Spark application code as a JAR or Python file. If you upload a JAR file, you are prompted to specify the Main Class.
    3. Specify arguments if required. You can click the Add Argument button to add multiple command arguments as necessary.
    4. Enter Configurations if needed. You can click the Add Configuration button to add multiple configuration parameters as necessary.
    5. Click Advanced Configurations to display additional customizations, such as driver and executor cores and memory.
    6. Select whether to run the job immediately upon creation or to run on a schedule.
      You can schedule the job to run periodically using a cron expression.
  7. Click Create. If you selected Run Now, the job runs immediately.
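
Before filling in the Job Details form, it can help to check the values you plan to enter. The following is a minimal sketch in plain Python, not part of CDE or any Cloudera API: the JobDetails class and the validate_job_details function are hypothetical names introduced only for illustration. The sketch mirrors the rules described in the steps above (a name, a JAR or Python file, a Main Class when the file is a JAR) and additionally assumes that a schedule is written as a five-field cron expression.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class JobDetails:
    """Hypothetical container mirroring the Job Details form (not a CDE API)."""
    name: str
    application_file: str                  # the JAR or Python file to upload
    main_class: Optional[str] = None       # prompted for only when a JAR is uploaded
    arguments: List[str] = field(default_factory=list)
    configurations: Dict[str, str] = field(default_factory=dict)
    cron_schedule: Optional[str] = None    # None means the job is run on demand


def validate_job_details(job: JobDetails) -> List[str]:
    """Return a list of problems; an empty list means the details look complete."""
    problems = []
    if not job.name.strip():
        problems.append("Name is required.")

    file_name = job.application_file.lower()
    if file_name.endswith(".jar"):
        # A JAR upload needs a Main Class, as described in the Job Details step.
        if not job.main_class:
            problems.append("Main Class is required for a JAR file.")
    elif not file_name.endswith(".py"):
        problems.append("Application file must be a JAR or Python file.")

    # Assumption: schedules use a five-field cron expression such as "0 2 * * *".
    if job.cron_schedule is not None and len(job.cron_schedule.split()) != 5:
        problems.append("Cron expression must have five fields.")
    return problems


# Example: a scheduled JAR job that is missing its Main Class.
example = JobDetails(
    name="daily-etl",
    application_file="etl.jar",
    configurations={"spark.executor.memory": "4g"},
    cron_schedule="0 2 * * *",
)
print(validate_job_details(example))
```

The check only counts the cron fields; it does not confirm that each field holds a valid value, so treat an empty result as a sanity check rather than a guarantee that the job will be accepted.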