November 20, 2025

This release (1.25.0) of the Cloudera Data Engineering service on Cloudera on cloud introduces the following changes.

Virtual Cluster-level suspend and resume [Technical Preview]

Starting with Cloudera Data Engineering 1.25.0, the Virtual Cluster-level suspend and resume feature lets you temporarily pause one or more Cloudera Data Engineering Virtual Clusters (VCs) during idle periods. This feature is supported on both AWS and Azure. Suspending a larger number of VCs can yield significant cost savings, primarily on compute resources.

To start using the feature, you need to request the DE_ENABLE_VC_SUSPEND_RESUME entitlement.

For more information, see Overview of suspending and resuming Cloudera Data Engineering virtual clusters.

Cloudera Data Engineering Spark support for Hive Warehouse Connector

Cloudera Data Engineering integrates with Hive Warehouse Connector (HWC) so that you can access and manage Hive-managed tables from Cloudera Data Engineering Spark. The HWC dependencies are built into Cloudera Data Engineering.

Through Cloudera Data Engineering Spark, you can use HWC to read data from Hive and to write data to Hive. HWC enables secure access and querying of Hive-managed tables directly from Apache Spark. To ensure secure data handling, HWC enables you to manage data access with fine-grained access control.

For more information, see Cloudera Data Engineering Spark support for Hive Warehouse Connector.

Amazon Linux distribution migration to AL2023

The OS image has been changed from Amazon Linux 2 (AL2) to Amazon Linux 2023 (AL2023) to help you increase security and improve application performance. AL2023 optimizes boot times, shortening the interval between instance initialization and the start of customer workload processing. For more information, see Comparing AL2 and AL2023.

Cloudera Data Engineering performance improvement

NodeLocalDNS is a Kubernetes-supported DaemonSet that runs a local DNS cache on every node in the cluster. The deployment of NodeLocalDNS in Cloudera Data Engineering improves application reliability, as service discovery depends on DNS. Currently, NodeLocalDNS is supported for Cloudera Data Engineering services on AWS. Key benefits of deploying NodeLocalDNS:

  • Reduced latency: queries are answered from a local cache, minimizing network hops
  • Lower CoreDNS load: repeated queries are served locally, reducing load on CoreDNS
  • Improved stability: cached entries remain available even if the CoreDNS pods restart
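
The caching behavior behind these benefits can be illustrated with a small sketch. This is not how NodeLocalDNS is implemented; it is a toy model that assumes a per-node cache with a fixed time-to-live in front of an upstream resolver. The class name LocalDnsCache, the ttl_seconds parameter, and the upstream_queries counter are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class LocalDnsCache:
    """Toy per-node DNS cache in front of an upstream resolver (illustration only)."""

    def __init__(
        self,
        upstream: Callable[[str], str],
        ttl_seconds: float = 30.0,  # assumed TTL, not a NodeLocalDNS default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._upstream = upstream
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps a hostname to (address, expiry time).
        self._entries: Dict[str, Tuple[str, float]] = {}
        # Counts how often the upstream resolver (for example, CoreDNS) is hit.
        self.upstream_queries = 0

    def resolve(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and cached[1] > now:
            # Cache hit: answered on the node, no network hop, no upstream load.
            return cached[0]
        # Cache miss or expired entry: ask the upstream resolver and remember the answer.
        self.upstream_queries += 1
        address = self._upstream(name)
        self._entries[name] = (address, now + self._ttl)
        return address
```

In this model, repeated lookups of the same name within the TTL never reach the upstream resolver, which mirrors the reduced latency and lower CoreDNS load described above. Entries that are still within their TTL also keep resolving while the upstream is briefly unavailable.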

Cloudera Data Engineering system resilience improvement

The high availability capabilities of the Admission Controller have been enhanced, making Cloudera Data Engineering more robust in unreliable networking environments.

Upgrade precheck updates

As part of the upgrade preparation workflow, a check has been introduced to ensure that the Virtual Clusters (VCs) use a Spark version that is compatible with the Data Lake version. If the Spark version used for the VCs is incompatible with the Data Lake version, the upgrade preparation fails. For example, if the Data Lake version is upgraded from 7.2.18 to 7.3.1, and any of the VCs use a Spark version lower than 3.5.4, the upgrade preparation fails.
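
The logic of such a check can be sketched as follows. This is a minimal illustration, not the actual precheck implementation: the function names, the MIN_SPARK_FOR_DATA_LAKE table, and the VC names are hypothetical, and only the pairing of Data Lake 7.3.1 with Spark 3.5.4 is taken from the example above.

```python
from typing import Dict, List, Tuple

# Assumption: minimum Spark version required per target Data Lake version.
# Only the 7.3.1 -> 3.5.4 pairing comes from the release note example.
MIN_SPARK_FOR_DATA_LAKE: Dict[str, str] = {"7.3.1": "3.5.4"}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '3.5.4' into (3, 5, 4) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def incompatible_vcs(target_data_lake: str, vc_spark_versions: Dict[str, str]) -> List[str]:
    """Return the names of VCs whose Spark version is below the required minimum."""
    minimum = MIN_SPARK_FOR_DATA_LAKE.get(target_data_lake)
    if minimum is None:
        # This sketch knows no rule for the target version, so it flags nothing.
        return []
    floor = parse_version(minimum)
    return sorted(
        name
        for name, spark in vc_spark_versions.items()
        if parse_version(spark) < floor
    )


if __name__ == "__main__":
    # Hypothetical VCs: vc-b would cause the upgrade preparation to fail.
    blocked = incompatible_vcs("7.3.1", {"vc-a": "3.5.4", "vc-b": "3.3.2"})
    print("Upgrade preparation fails" if blocked else "Precheck passed", blocked)
```

Comparing parsed integer tuples instead of raw strings avoids ordering mistakes such as treating "3.10.0" as lower than "3.5.4".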

Kubernetes version upgrade to 1.32

The Kubernetes version that Cloudera Data Engineering uses is upgraded to Kubernetes 1.32. For more information, see Compatibility for Cloudera Data Engineering and Runtime components.

Apache Airflow version upgrade to 2.11.0

The Airflow version that Cloudera Data Engineering uses is upgraded to Airflow 2.11.0 for improved security and stability. For more information, see:

Fixed issues

  • DEX-18145: Upgrade dbus-wxm-client version to 1.6.0-b7
  • DEX-17969: Airflow failed to connect to Impala in Cloudera Data Engineering
  • DEX-17565: Links to download cdeconnect and pyspark tars for Spark Connect return HTTP 404 errors
  • DEX-16414: Sessions GET endpoint not returning empty array
  • DEX-15461: Writing Spark DataFrame to Hive using HWC fails with java.util.NoSuchElementException: None.get
  • DEX-17973: Implement NodeLocalDNS to improve DNS performance on deployed clusters
  • DEX-17940: Scheduled Airflow DAG run syncing to Cloudera Data Engineering can miss runs
  • DEX-17865: TZData package missing in Chainguard image
  • DEX-17806: Handle unavailability of the admission controller
  • DEX-17760: Investigate the causes for intermittent RocksDBException in SHS pods
  • DEX-17284: Gracefully handle quota reached situation
  • DEX-17269: Upgrade tgtgenerator version to 1.0.3-b131 and testing basic jobs
  • DEX-17260: Restoring manually packed archive failed with "could not find file entry"
  • DEX-17037: Jobs with Python script referencing resources on Airflow editor page failed to save
  • DEX-16747: Cloudera Data Engineering 1.23.1-b114 - Driver container stderr and stdout logs are missing for some Spark jobs
  • DEX-16588: Cloudera Data Engineering in-place upgrade fails with "EKS API server endpoint"
  • DEX-16100: RunDAGMonitor may create duplicate Cloudera Data Engineering Jobs
  • DEX-16048: Speed up Airflow Cloudera Data Engineering Job update
  • DEX-15266: Get rid of Cluster#State check in the Cloudera Data Engineering code
  • DEX-14276: Restrict CAP_NET_RAW permissions
  • DEX-11181: Steps To Edit MountOptions Of Storage Class
  • DEX-7390: Add support for backup/restore of workload credentials