Creating an ad-hoc job in Cloudera Data Engineering
Ad-hoc runs mimic the behavior of a traditional spark-submit, or a single execution of an Airflow DAG, where the job runs once. An ad-hoc run does not create a permanent job definition, but you can still use it for log analysis and future reference.
Before you begin
- Ensure that you have a Virtual Cluster that is ready to use.
For Spark jobs
- In the Cloudera Data Platform (CDP) console, click the Data Engineering tile. The Home page displays.
- In the Jobs section under Spark, click Ad-hoc Run.
- Select a Virtual Cluster.
- Enter a Job Name.
- Upload an Application File or enter the Application File’s External URL (see the example application file after these steps).
- Enter a Main Class.
- Enter Arguments and Configurations.
- Select a Python Environment.
- Click Create and Run.
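If your application is written in Python, the Application File can be a self-contained PySpark script, and the Main Class field applies only to Java and Scala applications. The sketch below is a hypothetical example of such a file; the file name, application name, and sample count are placeholders, not part of the product:

```python
# pi.py -- hypothetical Application File for an ad-hoc Spark run.
# Estimates pi with a simple Monte Carlo simulation.
import random

from pyspark.sql import SparkSession


def inside(_):
    # Sample a random point in the unit square and test whether
    # it falls inside the quarter circle of radius 1.
    x, y = random.random(), random.random()
    return x * x + y * y <= 1.0


if __name__ == "__main__":
    spark = SparkSession.builder.appName("adhoc-pi-example").getOrCreate()
    n = 100_000  # number of samples; pass a different value via Arguments if desired
    count = spark.sparkContext.parallelize(range(n)).filter(inside).count()
    print(f"Pi is roughly {4.0 * count / n}")
    spark.stop()
```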
For Airflow jobs
- In the Cloudera Data Platform (CDP) console, click the Data Engineering tile. The Home page displays.
- In the Jobs section under Airflow, click Ad-hoc Run.
- Select a Virtual Cluster.
- Enter a Job Name.
- Upload a DAG file (see the example DAG after these steps).
- Click Create and Run.
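The DAG file is a standard Airflow Python file. The sketch below is a hypothetical minimal DAG, assuming Airflow 2.x; the dag_id, task_id, and shell command are placeholders:

```python
# adhoc_example_dag.py -- hypothetical DAG file for an ad-hoc Airflow run.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="adhoc_example_dag",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,  # ad-hoc: triggered once, not on a schedule
    catchup=False,
) as dag:
    # A single task that prints a message to the task log.
    hello = BashOperator(
        task_id="say_hello",
        bash_command="echo 'Hello from an ad-hoc run'",
    )
```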