July 20, 2022
This release (1.16) of the Cloudera Data Engineering (CDE) service on CDP Public Cloud introduces the new features and improvements that are described in this topic.
Airflow pipeline UI editor (GA)
- Airflow Pipeline UI editor is now GA as a default feature in new Virtual Clusters with support for all major browsers (Firefox, Chrome, and Safari).
Upgrade to Airflow 2.2.5
CDE 1.16 now runs with Airflow 2.2.5.
- Several fixes to improve performance and stability have been bundled with the upgrade.
- New Virtual Clusters will automatically use the new Airflow version.
- This version deprecated the timezone package usage. The DAGs need to be updated to use the pendulum package instead. If your airflow DAGs need to be timezone aware then they should rely on the pendulum timezone library for start and end dates as described here. Otherwise, the backup and restore process will not be able to restore these DAGs. For more information, see CDE known issues.
Spark 3 support for raw scala code
Spark 3 support for raw scala code.
Previously this feature was limited to Spark 2, it is now extended to Spark 3 based Virtual Clusters. This allows you to directly run raw scala via API & CLI in batch-mode without having to compile, similar to what spark-shell supports.
Support for Azure private storage
CDE now supports Azure private storage. Both private ABFS and ADLS gen2 containers are now supported.
Editing VC configurations post creation
You can now modify the virtual settings such as cluster quotas (CPU/memory) dynamically.
Loading example jobs and sample data using new VCs
CDE provides an option to add in-product examples of data & jobs in new virtual clusters to facilitate smoother onboarding and learning for new customers.
CDE now supports K8s 1.22.
The CSP EOS for K8s 1.21 is as follows:
For Azure: July 2022
For AWS: February 2023
- Check for removals as per this upgrade:
Kubernetes API and Feature Removals In 1.22 and Removed APIs by release
Support for creation of a Default Virtual Cluster
CDE now provides support for default virtual clusters. This will help you get a jump start to create your jobs easily, without having to wait to create a CDE virtual cluster, making the onboarding smoother. You have the option to turn this selection off if you do not wish to use a default virtual cluster.
For more information, see Enabling Cloudera Data Engineering service.
[Technical Preview] In-place upgrades
CDE supports upgrades from two CDE versions prior for both AWS and Azure. For example, if the current CDE version is 1.18, then upgrades are supported from CDE 1.16. The upgrades can be triggered by an Admin from CDE UI.
- Users will need to manually pause/backup/restore each Virtual Cluster to account for upgrade failures.
- Upgrades of CDE core components include: EKS, AKS Services, and Application Services
- Upgrades of dependencies include: Helm, K8s versions, YuniKorn