Updating a pipeline using the CDE CLI
You can update the following properties of an Airflow pipeline using the CDE CLI: the DAG file, the Airflow job configurations, and the Airflow file mounts.
Updating a DAG file using the CDE CLI
You can update a Directed Acyclic Graph (DAG) file using the CDE CLI when an existing DAG needs to be overridden. First upload the new DAG file to the job's resource, overriding the previous version, then update the job so that it uses the new file.
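A minimal sketch of this sequence is shown below. It assumes a job named my-pipeline whose DAG file is stored in a resource named my-pipeline-resource; both names are placeholders, and the exact flags accepted by cde job update can vary by CDE CLI version, so check cde job update --help if a flag is not recognized.

# Upload the new DAG file to the existing resource, replacing the previous version
cde resource upload --name my-pipeline-resource --local-path my_pipeline_dag.py

# Point the job at the uploaded DAG file in that resource
cde job update --name my-pipeline --dag-file my_pipeline_dag.py --mount-1-resource my-pipeline-resource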
Updating the Airflow job configurations using the CDE CLI
If the Airflow job was created with the --config option, you can update the Airflow job configuration with the command shown below. For more information, see Creating a pipeline using the CDE CLI.
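For example, assuming a job named my-pipeline and a configuration key that was originally set with --config (both placeholders), an update might look like the following; verify the exact behavior for your CDE CLI version.

# Overwrite an existing configuration value on the Airflow job
cde job update --name my-pipeline --config my.key=new-value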
Updating the Airflow file mounts using the CDE CLI [Technical Preview]
You can update or delete an existing Airflow file mount, or add new file mounts to your pipeline, with the commands shown below.
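As a rough sketch only: the flag names below (--airflow-file-mount-1-resource and --airflow-file-mount-1-prefix) follow the pattern used when creating an Airflow pipeline with custom files, but because this feature is a Technical Preview they are assumptions here; confirm the supported flags, including any flag for removing a mount, with cde job update --help. The job, resource, and prefix names are placeholders.

# Add a new file mount, or re-point an existing mount index to a different resource (assumed flag names)
cde job update --name my-pipeline --airflow-file-mount-1-resource my-file-resource --airflow-file-mount-1-prefix my-prefix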
Parent topic:
Managing an Airflow Pipeline using the CDE CLI