July 20, 2022

This release (1.16) of the Cloudera Data Engineering (CDE) service on CDP Public Cloud introduces the new features and improvements that are described in this topic.

Airflow pipeline UI editor (GA)

  • The Airflow pipeline UI editor is now GA and enabled by default in new Virtual Clusters, with support for all major browsers (Firefox, Chrome, and Safari).

Upgrade to Airflow 2.2.5

CDE 1.16 now runs with Airflow 2.2.5.

  • Several fixes to improve performance and stability have been bundled with the upgrade.
  • New Virtual Clusters will automatically use the new Airflow version.
  • This version deprecates use of the timezone package. Update your DAGs to use the pendulum package instead. If your Airflow DAGs need to be timezone-aware, they should rely on the pendulum timezone library for start and end dates, as described here. Otherwise, the backup and restore process cannot restore these DAGs. For more information, see CDE known issues.

Spark 3 support for raw Scala code

Previously, this feature was limited to Spark 2; it now extends to Spark 3 based Virtual Clusters. You can run raw Scala code directly through the API and CLI in batch mode without compiling it first, similar to what spark-shell supports.

Support for Azure private storage

CDE now supports Azure private storage, including both private ABFS and ADLS Gen2 containers.

Editing VC configurations post creation

You can now dynamically modify Virtual Cluster settings, such as quotas (CPU/memory), after the Virtual Cluster has been created.

Loading example jobs and sample data using new VCs

CDE provides an option to add in-product example jobs and sample data to new Virtual Clusters, which makes onboarding and learning smoother for new customers.

Kubernetes update

CDE now supports K8s 1.22.

Support for creation of a Default Virtual Cluster

CDE now supports default Virtual Clusters. A default Virtual Cluster lets you start creating jobs right away, without first having to create a CDE Virtual Cluster yourself, which makes onboarding smoother. You can turn this option off if you do not want a default Virtual Cluster.

For more information, see Enabling Cloudera Data Engineering service.

[Technical Preview] In-place upgrades

CDE supports in-place upgrades from CDE 1.14 on both AWS and Azure. An administrator can trigger the upgrade from the CDE UI.

  • Users must manually pause, back up, and restore each Virtual Cluster to account for upgrade failures.
  • Upgrades of CDE core components include: EKS, AKS Services, and Application Services.
  • Upgrades of dependencies include: Helm, K8s versions, and YuniKorn.