Creating an ad-hoc job in Cloudera Data Engineering
Ad-hoc runs mimic the behavior of a traditional spark-submit, or a single execution of an Airflow DAG, where the job runs once. An ad-hoc run does not create a permanent job definition, but its logs are retained for analysis and future reference.
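An ad-hoc run executes an ordinary Spark application file that you upload in the steps below. A minimal PySpark sketch of such a file follows; the file name and the trivial computation are illustrative assumptions, not requirements of Cloudera Data Engineering:

```python
# app.py: a minimal PySpark application suitable for an ad-hoc run.
# The file name and the trivial computation are illustrative assumptions,
# not requirements of Cloudera Data Engineering.
from pyspark.sql import SparkSession

def main():
    spark = SparkSession.builder.appName("adhoc-example").getOrCreate()
    # A small computation so the run writes verifiable output to the logs.
    row_count = spark.range(100).count()
    print(f"Row count: {row_count}")
    spark.stop()

if __name__ == "__main__":
    main()
```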
Before you begin
- Ensure that you have a Virtual Cluster that is ready to use.
Procedure
- In the Cloudera Data Platform (CDP) console, click the Data Engineering tile. The Home page displays.
- In the Jobs section under Spark, click Ad-hoc Run.
- Select a Virtual Cluster.
- Enter a Job Name.
- Upload an Application File or enter the Application File’s External URL.
- Enter a Main Class if the application is a Java or Scala jar. Python applications do not require one.
- Enter Arguments and Configurations. These correspond to the application arguments and the --conf key/value settings of a spark-submit command; see the first sketch after this list.
- Select a Python Environment.
You can also upload additional files and customize compute options such as the number of executors, the driver and executor cores, and the driver and executor memory; the first sketch after this list shows how these settings surface inside the application.
- Upload files and resources.
- Configure Compute Options.
- Select a Log Level; the second sketch after this list shows the runtime equivalent.
- Click Create and Run.
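The Arguments field passes positional arguments to the application, the Configurations field supplies Spark configuration key/value pairs, and the compute options correspond to standard Spark properties such as spark.executor.memory. A minimal sketch of how an uploaded PySpark application might consume them; the argument handling and the custom configuration key are illustrative assumptions:

```python
# Sketch: reading the Arguments and Configurations supplied in the
# ad-hoc run form. The input path and custom key are illustrative.
import sys

from pyspark.sql import SparkSession

def main():
    # Arguments arrive exactly as with spark-submit: via sys.argv.
    input_path = sys.argv[1] if len(sys.argv) > 1 else "/tmp/input"
    spark = SparkSession.builder.getOrCreate()

    # Configurations entered in the form are visible through spark.conf.
    # Standard Spark properties reflect the compute options for the run.
    executor_memory = spark.conf.get("spark.executor.memory", "not set")
    print(f"Executor memory: {executor_memory}")

    # A custom configuration key (illustrative; any key you enter works).
    batch_id = spark.conf.get("spark.myapp.batchId", "none")
    print(f"Reading {input_path} for batch {batch_id}")
    spark.stop()

if __name__ == "__main__":
    main()
```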
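The Log Level option controls how verbose the run's logs are. Inside the application, a similar effect can be achieved at runtime with Spark's own API; a one-line sketch:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
# Runtime equivalent of choosing a log level for the run; valid values
# include ALL, DEBUG, INFO, WARN, ERROR, FATAL, OFF, and TRACE.
spark.sparkContext.setLogLevel("WARN")
```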