March 12, 2026

This release (1.25.2) of the Cloudera Data Engineering service on Cloudera on cloud introduces the following changes:

Job-level Instance Type Override [Technical Preview]

Spot-instance settings are defined at the Virtual Cluster (VC) level by default; to enhance flexibility, you can now override these defaults on a per-job basis. The Job-level Instance Type Override feature is available in the Unified Jobs UI, but not in the Legacy Jobs UI.

  • You can access the Unified Jobs UI by selecting Jobs from the left navigation menu of the Cloudera Data Engineering UI.
  • You can access the Legacy Jobs UI from the Virtual Cluster Details option by clicking the View Jobs button.

If you disable the Job-level Instance Type Override feature, the job's compute configuration reverts to the VC-level setting and becomes read-only at the job level.

If you enable the Job-level Instance Type Override feature, you can perform the following actions:

  • For Spark, you can override the default virtual cluster settings to choose specific node types for Spark drivers and executors.
  • For Airflow, you can override the default virtual cluster settings to choose specific node types for Airflow workers.

Virtual cluster-level suspend and resume now generally available

The Virtual cluster-level suspend and resume feature, previously available as a technical preview, has now reached General Availability (GA).

Enhancements introduced for virtual cluster-level suspend and resume:

  • The functionality to suspend and resume Cloudera Data Engineering Virtual Clusters (VCs) is now supported directly within the Cloudera Data Engineering UI. This capability was previously accessible only through the Cloudera Data Engineering API or the CDP CLI.
  • From Cloudera Data Engineering 1.25.2, you have the option to simultaneously suspend multiple Cloudera Data Engineering VCs, eliminating the previous restriction that required VCs to be suspended sequentially.

Switching from Azure AD (AAD) Pod Identity to Workload Identity

In Cloudera Data Engineering 1.25.2, Workload Identity replaces the Azure AD Pod Identity (aad-pod-identity) component that some Cloudera Data Engineering workloads used to pull logger credentials.

The Workload Identity component is more secure, provides faster startup times, scales better, and supports more granular permissions.

Workload Identity requires users to provision two new user-assigned Managed Identities for each new Cloudera Data Engineering service.

Key prerequisites include updating environment credentials with a custom role to manage Federated Identity Credentials (FIC) and ensuring all new Managed Identities have the Storage Blob Data Contributor role for the logs container.
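The identity provisioning and role-assignment prerequisites above can be sketched with the Azure CLI. This is a minimal illustration, not a Cloudera-prescribed procedure: the resource group, subscription ID, storage account, container name, and identity names are all placeholder assumptions.

```shell
# Hypothetical sketch: all names and IDs below are placeholders,
# not values prescribed by Cloudera documentation.
RG="my-cde-rg"
SUBSCRIPTION_ID="00000000-0000-0000-0000-000000000000"
STORAGE_ACCOUNT="mycdestorage"
LOGS_CONTAINER="logs"

# Provision the two user-assigned Managed Identities required for a
# new Cloudera Data Engineering service.
az identity create --resource-group "$RG" --name cde-workload-identity-1
az identity create --resource-group "$RG" --name cde-workload-identity-2

# Grant each identity the Storage Blob Data Contributor role,
# scoped to the logs container.
for MI in cde-workload-identity-1 cde-workload-identity-2; do
  PRINCIPAL_ID=$(az identity show --resource-group "$RG" --name "$MI" \
    --query principalId --output tsv)
  az role assignment create \
    --assignee-object-id "$PRINCIPAL_ID" \
    --assignee-principal-type ServicePrincipal \
    --role "Storage Blob Data Contributor" \
    --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RG/providers/Microsoft.Storage/storageAccounts/$STORAGE_ACCOUNT/blobServices/default/containers/$LOGS_CONTAINER"
done
```

The custom role that manages Federated Identity Credentials (FIC) on the environment credential must be configured separately; consult the linked documentation for the exact permissions it requires.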

In-place upgrade operations require patching the existing Cloudera Data Engineering service to update Managed Identities through the Cloudera Data Engineering UI or the patchCluster API.

Cloudera Data Engineering service-level backup and restore operations require overriding existing Managed Identities with new ones through CLI options.

For more information, see Switching from Azure AD Pod Identity (aad-pod-identity) to Workload Identity in Azure clusters.

Graviton-based database instances are now the default for Cloudera Data Engineering services on AWS

Starting with Cloudera Data Engineering 1.25.2, the database instances for Cloudera Data Engineering services on AWS are provisioned on Graviton instances by default to reduce cloud costs. If a Graviton instance is not available, the service falls back to an x86 instance. To explicitly use an x86 instance when creating a Cloudera Data Engineering service, set the following configuration through the Cloudera Data Engineering API: config.database.disable_arm64=true
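As a purely illustrative sketch, the override could be passed as a key-value configuration entry in the service-creation request body. Only the key config.database.disable_arm64=true comes from this release note; the surrounding request shape is an assumption and depends on the Cloudera Data Engineering API version in use.

```json
{
  "config.database.disable_arm64": "true"
}
```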

When upgrading an existing service to Cloudera Data Engineering 1.25.2, the associated database instances are automatically upgraded to Graviton.

Python Environment support introduced on the Cloudera Data Engineering UI for External IDE connectivity through Spark Connect-based sessions [Technical Preview]

On the Cloudera Data Engineering UI, you now have the option to select the Python Environment for External IDE connectivity through Spark Connect-based sessions.

For more information, see Configuring external IDE Spark Connect sessions.

Cloudera Data Engineering support for Atlas Lineage for Spark Iceberg tables

You can now generate Atlas Lineage for Spark Iceberg tables interactively using Cloudera Data Engineering sessions. This brings lineage tracking to your exploratory and interactive Spark workloads, allowing you to capture table creation and insertion events in Apache Atlas, just as you do with Cloudera Data Engineering jobs.

For a list of supported Spark SQL creation patterns and current limitations, see Cloudera Data Engineering support for Atlas Lineage for Spark Iceberg tables.

Enhanced scaling range for Cloudera Data Engineering services

The maximum limit of the Autoscaling Range parameter has been extended, allowing a Cloudera Data Engineering service to scale up to 250 nodes.

For more information, see Cloudera Data Engineering auto-scaling.

Kubernetes version upgrade to 1.33

The Kubernetes version that Cloudera Data Engineering uses is upgraded to Kubernetes 1.33.

For more information, see Compatibility for Cloudera Data Engineering and Runtime components.