Scheduling jobs in Cloudera Data Engineering

Jobs in Cloudera Data Engineering (CDE) can be run on demand, or scheduled to run on an ongoing basis. The following instructions demonstrate how to create or modify a schedule for an existing job.

  1. In the Cloudera Data Platform (CDP) console, click the Data Engineering tile and click Overview.
  2. In the CDE Services column, select the environment containing the virtual cluster where you want to schedule the job.
  3. In the Virtual Clusters column on the right, click the View Jobs icon on the virtual cluster containing the job you want to schedule.
  4. Click Configure.
  5. Click the Advanced Configurations link at the bottom of the page to view additional configuration parameters.
  6. Click the Actions menu next to the application, and then click Configuration.
  7. Select the Schedule toggle, and then set the Start time, End time, and Cron expression.
    The start and end times define the time frame during which the schedule is active. The Cron expression uses the cron scheduling syntax to specify when the application runs within that time frame. For example, the expression 0 */6 * * * runs the application every six hours, on the hour. For more information and examples of the cron syntax, see the Cron entry on Wikipedia.
  8. If you want to start the job immediately, select the Start job check box.
  9. Click Update to save your changes.
  10. Select optional scheduling configurations:
    1. Select Enable Catchup to kick off job runs for any data interval that has not been run since the last data interval. If this option is not selected, only runs that start after the job was created are included.
    2. Select Depends on Previous to ensure that each job run is preceded by a successful job run.
  11. Click Schedule.
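
To see how the Start time, End time, and Cron expression settings interact, the following Python sketch lists the minutes at which a five-field cron expression would fire inside a schedule window. It is an illustration only and is not part of CDE: the function names (parse_field, matches, scheduled_runs, runs_to_trigger) are hypothetical, only the *, list, range, and step forms are handled, the end time is assumed to be inclusive, and the catchup behavior is a simplified model of the Enable Catchup description above.

```python
from datetime import datetime, timedelta

# Allowed values for the five cron fields, in order:
# minute, hour, day of month, month, day of week (0 = Sunday).
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def parse_field(field, lo, hi):
    """Expand one cron field (*, a, a-b, a,b, with optional /step) into a set."""
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            first, last = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            first, last = int(a), int(b)
        else:
            first = int(rng)
            # "a/step" means "from a to the end of the range, every step".
            last = hi if "/" in part else first
        values.update(range(first, last + 1, step))
    return values


def matches(expr, when):
    """Return True if the cron expression fires at the minute of `when`."""
    fields = expr.split()
    minute, hour, dom, month, dow = (
        parse_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    cron_dow = (when.weekday() + 1) % 7  # Python: Monday = 0; cron: Sunday = 0
    dom_ok = when.day in dom
    dow_ok = cron_dow in dow
    # Classic cron rule: when both day fields are restricted, either may match.
    if fields[2] != "*" and fields[4] != "*":
        day_ok = dom_ok or dow_ok
    else:
        day_ok = dom_ok and dow_ok
    return (
        when.minute in minute
        and when.hour in hour
        and when.month in month
        and day_ok
    )


def scheduled_runs(expr, start, end):
    """List every minute in [start, end] at which the expression fires.

    Assumption: the end time is treated as inclusive.
    """
    t = start.replace(second=0, microsecond=0)
    if t < start:
        t += timedelta(minutes=1)
    runs = []
    while t <= end:
        if matches(expr, t):
            runs.append(t)
        t += timedelta(minutes=1)
    return runs


def runs_to_trigger(expr, start, end, created, catchup):
    """Simplified model of Enable Catchup (an assumption, not CDE's code).

    With catchup, every scheduled run in the window is included; without it,
    only runs that start after the job was created are included.
    """
    runs = scheduled_runs(expr, start, end)
    return runs if catchup else [r for r in runs if r > created]


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 0, 0)
    end = datetime(2024, 1, 1, 23, 59)
    for run in scheduled_runs("0 */6 * * *", start, end):
        print(run)
```

With the window shown in the example, the expression 0 */6 * * * yields four runs on 2024-01-01, at 00:00, 06:00, 12:00, and 18:00. Under the simplified catchup model, a job created at 07:00 that day with catchup disabled would pick up only the 12:00 and 18:00 runs.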