Cloudera on Cloud: February 2025 Release Summary
The Release Summary of Cloudera on cloud summarizes major features introduced in Cloudera Management Console, Cloudera Data Hub, and data services.
Cloudera AI
Cloudera AI 2.0.47-b359 introduces the following changes:
New features/improvements
Cloudera AI Platform
- We have improved the synchronization efficiency and ease of use of the user management and team management auto-synchronization features. The major updates include:
  - Auto-synchronization is enabled by default: Auto-synchronization for users and teams is now enabled by default, with the synchronization interval set to 12 hours.
  - User management service: User management is now handled by a new service, reducing overhead on the web pod. The service also prevents multiple synchronization operations from running in parallel.
  - Logging: Detailed logging has been added for failure cases.
  - Synchronization trigger sequence: Team synchronization now internally triggers user synchronization to pull the most recent user details from the Cloudera control plane.
  These improvements are aimed at optimizing performance and streamlining the synchronization process for users and teams. (DSE-37941)
- We have added support for setting the maximum input/output operations per second (IOPS) and throughput for root volumes attached to worker nodes, using the UI while provisioning a workbench. Note that this is supported only on AWS. For more details on how to maximize the IOPS and throughput of the root volumes, see Provisioning Cloudera AI Workbenches. (DSE-42075)
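The synchronization behavior described above (only one synchronization at a time, with team synchronization triggering user synchronization first) can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not part of any Cloudera API; only the 12-hour default interval comes from the note.

```python
import threading


class SyncService:
    """Hypothetical stand-in for a user management service (not Cloudera code)."""

    def __init__(self, interval_hours=12):
        # 12 hours is the default interval stated in the release note.
        self.interval_hours = interval_hours
        self._lock = threading.Lock()
        self.events = []

    def _sync_users(self):
        # Placeholder for pulling the latest user details from the control plane.
        self.events.append("users")

    def sync_teams(self):
        # Non-blocking acquire: a second caller is rejected rather than
        # allowed to run a synchronization in parallel.
        if not self._lock.acquire(blocking=False):
            return False
        try:
            self._sync_users()  # team sync refreshes user details first
            self.events.append("teams")
            return True
        finally:
            self._lock.release()


svc = SyncService()
first_run = svc.sync_teams()
```

Rejecting an overlapping run, instead of queueing it, is one possible design; the note does not say which approach the service takes.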
Cloudera AI Registry
- You can now specify subnets for load balancers when creating the AI Registry. (DSE-42156)
- We have enhanced the security of the AI Registry’s search capability. (DSE-41740)
Cloudera AI Inference service
- We have improved the UI usability of the Hugging Face import feature by adding a tooltip example. (DSE-41926)
Fixed Issues
Cloudera AI Workbench
- We have increased the Grafana pod’s default memory and CPU to prevent out-of-memory (OOM) errors. (DSE-39525)
- We have increased the gRPC (Remote Procedure Call) Operator timeout to two minutes to prevent errors encountered with 150 concurrent sessions. (DSE-36922)
- We have removed nonessential calls to the usage API to resolve slowness during new workload creation under heavy load in a workbench. (DSE-42231)
Cloudera AI Platform
- We have optimized the Suspend timeout during periods of high network latency. (DSE-42055)
- Previously, restoring a workbench with a very large Elastic File System (EFS) drive failed due to a session timeout. This issue is now resolved. (DSE-42171)
Cloudera AI Registry
- We have fixed an issue that prevented model registration to the AI Registry from within a workbench. (DSE-42360)
- We have fixed a page token issue that prevented users from viewing AI Registry models on subsequent pages within the workbench. (DSE-42379)
- We have fixed an incorrect error message displayed in the UI when deleting AI Registry models from within a workbench. (DSE-42379)
- Error visibility has been improved during AI Registry backup. (DSE-42163)
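For readers unfamiliar with page tokens, the following minimal sketch shows how a client typically walks through every page of a listing. The fetch_page callable and the sample data are hypothetical; they do not represent the AI Registry API.

```python
def iter_all_models(fetch_page):
    """Yield every model by following page tokens until none is returned.

    fetch_page is a hypothetical callable: it takes a page token (None for
    the first page) and returns a (models, next_page_token) pair.
    """
    token = None
    while True:
        models, token = fetch_page(token)
        yield from models
        if not token:
            break


# In-memory stand-in for a paginated listing (illustrative data only).
_PAGES = {
    None: (["model-a", "model-b"], "t1"),
    "t1": (["model-c"], None),
}


def fake_fetch(token):
    return _PAGES[token]


all_models = list(iter_all_models(fake_fetch))
```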
Cloudera AI Inference service
- We have fixed an issue that prevented the TPOT (Time per Output Token) and TTFT (Time to First Token) charts from rendering for Hugging Face models. (DSE-42192)
ML Runtimes
- Previously, non-administrator users were unable to add new Runtimes to the Runtime Catalog. This issue is now resolved. (DSE-42298)
Cloudera AI 2.0.47-b345 introduces the following changes:
New Features / Improvements
Cloudera AI Workbench
- Support is now provided for API keys to invoke applications deployed using Cloudera AI Workbenches. This not only makes it easier to invoke those applications programmatically but also allows one application to easily invoke another application that it has access to.
- The MLflow upgrade for Cloudera AI Workbenches enables the use of the latest offerings and APIs from the MLflow community, such as evaluateLLM.
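As a rough illustration of invoking an application programmatically with an API key, the sketch below builds (but does not send) an HTTP request. It assumes the key is passed as a bearer token in the Authorization header; the URL and key are placeholders, and the exact scheme should be confirmed in the product documentation.

```python
import urllib.request


def build_app_request(app_url, api_key):
    """Build (but do not send) a request that carries the API key.

    Assumption: the key travels as a bearer token in the Authorization
    header. This note does not state the scheme.
    """
    return urllib.request.Request(
        app_url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


# Placeholder URL and key, for illustration only.
req = build_app_request("https://my-app.example.com/", "EXAMPLE_KEY")
```

Passing the resulting request to urllib.request.urlopen would perform the call.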
Cloudera AI Platform
- The autoscaling range of Suspend Workflow is now set to the value 0 to ensure that other Kubernetes deployments outside the scope of MLX can deploy their pods on worker nodes.
Cloudera AI Registry
- An enhanced error message is now displayed during model upload failure.
- UI for Registered Models displays the environment name of the registry along with an error message when any user is unable to access any Cloudera AI Registry.
- A checkbox is now added to enable Public Load Balancer for new Cloudera AI Registries on Azure.
Cloudera AI Inference service
- The Hugging Face model server backend has been upgraded, which expands the compatibility with a larger number of model families, such as Llama 3.3 and models derived from it.
- Llama 3.2 Vision Language Model NIM version has been updated to address compatibility with A10G (g5.) and L40S (g6e.) GPU instances on AWS.
- You can now upgrade Cloudera AI Inference service using the UI. Previously, the upgrade was supported only using the CDP CLI.
- You can now upgrade from Cloudera AI Inference service version 1.2.0-b80 to version 1.3.0-b113 or higher. Note that you cannot upgrade from 1.3.0-b111 to 1.3.0-b113 or higher. For more information on the 1.3.0-b111 upgrade issue and workaround, see the Known Issues section.
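The upgrade rule in the last item can be expressed as a small check. This is only a sketch of the two paths that the note states explicitly; it returns None for any path the note does not cover, and the function names are illustrative.

```python
import re


def parse_version(text):
    """Turn a string such as '1.3.0-b113' into the tuple (1, 3, 0, 113)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-b(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    return tuple(int(part) for part in match.groups())


SUPPORTED_SOURCE = parse_version("1.2.0-b80")
BLOCKED_SOURCE = parse_version("1.3.0-b111")
MIN_TARGET = parse_version("1.3.0-b113")


def upgrade_allowed(current, target):
    """Return True or False for paths the note covers, else None."""
    cur, tgt = parse_version(current), parse_version(target)
    if tgt < MIN_TARGET:
        return None  # the note only discusses targets of 1.3.0-b113 or higher
    if cur == BLOCKED_SOURCE:
        return False
    if cur == SUPPORTED_SOURCE:
        return True
    return None
```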
Fixed Issues
Cloudera AI Workbench
- Previously, due to an issue, users could stop sessions under projects that they were not authorized to access using the session’s UUID. This issue is now resolved. (DSE-39798)
- Previously, when a Kubernetes object was deleted, and the reconciler was overwhelmed by a large number of events, the Deleted status failed to propagate properly. This issue is now resolved. (DSE-41431)
- Previously, the stopped_at column was not correctly populated when applications were stopped. This issue is now resolved. (DSE-41636)
- Previously, engine pods were stuck in the Init:StartError state and you had to manually delete them. With this fix, pods stuck in Init:StartError are deleted by Garbage Collection after a certain grace period. (DSE-41430)
- Previously, Spark environment configurations were not inherited by models running Spark. With this fix, models use the appropriate Spark configurations to run Spark. (DSE-36343)
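The Init:StartError cleanup described above amounts to selecting pods that have been stuck longer than a grace period. The sketch below illustrates that selection only; the grace period value, the pod records, and the function name are hypothetical, since the note does not state them.

```python
from datetime import datetime, timedelta

STUCK_STATUS = "Init:StartError"
GRACE_PERIOD = timedelta(minutes=15)  # hypothetical value, not from the note


def select_stuck_pods(pods, now, grace=GRACE_PERIOD):
    """Return names of pods stuck in Init:StartError for at least `grace`."""
    return [
        pod["name"]
        for pod in pods
        if pod["status"] == STUCK_STATUS and now - pod["since"] >= grace
    ]


# Illustrative pod records; "since" is when the pod entered its status.
now = datetime(2025, 2, 1, 12, 0)
pods = [
    {"name": "engine-1", "status": "Init:StartError", "since": now - timedelta(minutes=30)},
    {"name": "engine-2", "status": "Init:StartError", "since": now - timedelta(minutes=5)},
    {"name": "engine-3", "status": "Running", "since": now - timedelta(hours=2)},
]
stuck = select_stuck_pods(pods, now)
```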
Cloudera AI Registry
- An issue with how the Hugging Face token was consumed during the import of a model has been addressed. (DSE-41714)
- The Cloudera AI Registry deletion flow is improved to handle race conditions when both creation and deletion are triggered within a short time frame. (DSE-41634)
Cloudera AI Inference service
- Previously, the GetEndpointLogs failed with an error. With this fix, endpoint logs for the model container do not exceed the gRPC messaging size. (DSE-41765)
- A new field called loadBalancerIPWhitelists is added to display a list of IPs whitelisted for the load balancer, and isPublic and ipAllowlist are deprecated. (DSE-39397)
- Infrastructure nodes are no longer shown as instances that can be used for deploying a new endpoint. (DSE-41726)
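The GetEndpointLogs fix above concerns keeping a log payload under the gRPC message size. One common way to do this is to keep only the newest lines that fit within a byte budget, as sketched below. This illustrates the general technique and is not the service's actual implementation. The 4 MiB figure is the gRPC library default for received messages; the limit configured in the service is not stated in the note.

```python
DEFAULT_GRPC_LIMIT = 4 * 1024 * 1024  # gRPC library default, in bytes


def tail_within_limit(lines, max_bytes=DEFAULT_GRPC_LIMIT):
    """Keep the newest lines whose UTF-8 size (plus newlines) fits max_bytes."""
    kept, used = [], 0
    for line in reversed(lines):
        size = len(line.encode("utf-8")) + 1  # +1 for the newline separator
        if used + size > max_bytes:
            break
        kept.append(line)
        used += size
    kept.reverse()  # restore chronological order
    return kept


trimmed = tail_within_limit(["aaaa", "bb", "c"], max_bytes=5)
```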
ML Runtimes
- Previously, due to an issue, to ensure the compatibility of AMPs with ML Runtimes 2025.01.1, users had to switch to JupyterLab PBJ Workbench in the AMPs’ .project-metadata.yaml file or use jobs instead of sessions for automated tasks. This issue is now resolved. (DSE-41263)
- Resolved issues related to using R interactively in PBJ Runtimes. (DSE-41771)
Cloudera Data Engineering
Cloudera Data Engineering 1.23.1 introduces the following changes:
In-place upgrade enhancements
Using AWS, you can upgrade from Cloudera Data Engineering version 1.20.3 to 1.23.1. Using Azure, the minimum version of Cloudera Data Engineering for the upgrade is version 1.22.0. For more information, see Cloudera Data Engineering upgrade version compatibility and In-place upgrade with Airflow Operators and Libraries.
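The version requirements above can be summarized as a small lookup. This sketch assumes that 1.20.3 is the minimum source version on AWS (the note states it only as the version you can upgrade from) and that any later version below 1.23.1 is also eligible; consult the upgrade compatibility documentation for the authoritative matrix.

```python
# Minimum source versions for the in-place upgrade to 1.23.1.
# Assumption: the AWS figure is treated as a minimum.
MIN_SOURCE = {"aws": (1, 20, 3), "azure": (1, 22, 0)}


def parse(version):
    """Convert '1.22.0' into the comparable tuple (1, 22, 0)."""
    return tuple(int(part) for part in version.split("."))


def can_upgrade_in_place(cloud, current, target="1.23.1"):
    """Return True if `current` meets the minimum and is older than `target`."""
    minimum = MIN_SOURCE[cloud.lower()]
    return minimum <= parse(current) < parse(target)
```

For example, can_upgrade_in_place("azure", "1.20.3") evaluates to False under these assumptions, because the Azure minimum is 1.22.0.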
AWS Graviton spot instances support
With AWS Graviton, you can use spot instances as well. For more information, see AWS Graviton instances in Cloudera Data Engineering.
Data Lake 7.3.1 support
Cloudera Data Engineering version 1.23.1, besides supporting Data Lake 7.2.18, also supports Data Lake 7.3.1 with Apache Spark 3.5. For more information, see:
- Upgrading to Cloudera Data Lake 7.3.1 with Cloudera Data Engineering
- Compatibility for Cloudera Data Engineering and Runtime components
Note
Data Lake 7.3.1 supports virtual cluster creation from Spark version 3.5 onwards. Earlier Spark versions are not supported with Data Lake 7.3.1.
Java upgrade to version 17
The Java version that Airflow uses is upgraded to Java 17. For more information, see Compatibility for Cloudera Data Engineering and Runtime components.
MySQL upgrade to version 8.0.39
The MySQL version that Cloudera Data Engineering 1.23.1 uses is upgraded to version 8.0.39.
Fixed issues
- DEX-15143: Service backup failing due to Cadence size limits on the metadata
- DEX-15229: UI glitch while accessing the Spark UI for a Cloudera Data Engineering Session
- DEX-15398: Fix Helm upgrade failures: retrigger on context deadline exceeded and ensure correct revision on retry
- DEX-15477: Rapid creation of successive job runs causes failures due to Jobs table locking
- DEX-15479: Cannot upgrade Cloudera Data Engineering to 1.23 due to broken Spark Job
- DEX-15498: Early unlock on the MutexMap causes incorrect locking behaviour
- DEX-15587: RefreshRuns skips polling runs for Cloudera Data Engineering Job Status update for a while
- DEX-15589: Livy marks the Batch failed when the monitoring thread is interrupted
- DEX-15713: Jobs failing with keytab access issue
Cloudera DataFlow
Cloudera DataFlow 2.9.0-h5-b2 introduces the following changes:
Fixed issues
Fixed paging of project listing
Cloudera Control Plane was affected by an issue where the Cloudera DataFlow Projects page displayed a maximum of 10 projects, even when more were available. This limitation also affected Flow Design when selecting a Target Project for new drafts.
- Environments running Cloudera DataFlow version 2.9.0-h4-b3 do not require this upgrade and will not have the upgrade banner.
- Environments running Cloudera DataFlow version 2.9.0-h3-b1 or lower need to upgrade to 2.9.0-h5-b2.
Cloudera Data Warehouse
Cloudera Data Warehouse 1.9.5-b10 introduces the following changes:
Azure AKS 1.31 upgrade
Cloudera supports the Azure Kubernetes Service (AKS) version 1.31. In 1.9.5-b10 (released February 3, 2025), when you activate an Environment, Cloudera Data Warehouse automatically provisions AKS 1.31. To upgrade to AKS 1.31 from an earlier version of Cloudera Data Warehouse, you must back up and restore Cloudera Data Warehouse.
Note
Using the Azure CLI or Azure portal to upgrade the AKS cluster is not supported. Doing so can cause the cluster to become unusable and can cause downtime. For more information about upgrading, see Upgrading an Azure Kubernetes Service (AKS) cluster.
AWS EKS 1.31 upgrade
Cloudera supports the AWS Elastic Kubernetes Service (EKS) version 1.31. In 1.9.5-b10 (released February 3, 2025), when you activate an Environment, Cloudera Data Warehouse automatically provisions EKS 1.31. To upgrade to EKS 1.31 from an earlier version of Cloudera Data Warehouse, you must back up and restore Cloudera Data Warehouse.
Note
Using the AWS tools to upgrade the EKS cluster is not supported. Doing so can cause the cluster to become unusable and can cause downtime. For more information about upgrading, see Upgrading an Amazon Kubernetes Service (EKS) cluster.
Fixed issues
DWX-20070: Incompatibility with custom subdomains using .dw format
Cloudera Data Warehouse environments could not be activated when specifying a custom subdomain in the older .dw format using the --custom-subdomain CLI flag. This resulted in certificate creation failures with the error:
error while obtaining certificate: context canceled Error Code: undefined
This issue impacted customers with the entitlement CDW_CUSTOM_CLUSTER_ID using the old .dw domain format.
The fix updates the certificate creation process to support custom subdomains in the old .dw format. Cloudera Data Warehouse environments can now be activated with custom subdomains in both the legacy and updated formats. However, Cloudera strongly recommends migrating to the new format, which was introduced in August 2021. For more information, see Cloudera Data Warehouse on cloud endpoints change.
Note
Support for the old subdomain format is going to be removed in a future release.
For more information about the Known issues, see the Cloudera Data Warehouse Release Notes.
Cloudera Management Console
The latest version of Cloudera Management Console introduces the following changes:
Slack integration for Notifications - Technical Preview
Slack is added as a communication channel for the Notification service, alongside in-app and email. After adding the Cloudera Notifications application in Slack, you can receive resource notifications as Slack messages.
For more information, see the Setting up Slack for resource notifications documentation.
Cloudera Operational Database
Cloudera Operational Database 1.49 introduces the following changes:
Configure Security-Enhanced Linux (SELinux) enforcement using the Cloudera Operational Database UI
Cloudera Operational Database UI supports configuring the SELinux enforcement while creating a new operational database.
In the Cloudera Operational Database UI, go to Create Database > Settings > Advanced > SELinux to configure the SELinux enforcement. You can configure the SELinux option as Permissive or Enforcing.
You must have the CDP_SECURITY_ENFORCING_SELINUX entitlement to use the SELinux support. Contact Cloudera support if you do not have this entitlement.
For more information, see Setting SELinux Mode.
Cloudera Operational Database has removed the COD_USE_I8G_INSTANCE_TYPE entitlement
Cloudera Operational Database has removed the COD_USE_I8G_INSTANCE_TYPE entitlement because it is no longer needed. The I8g instance types are now public, and you can use them while creating an AWS Graviton-based Cloudera Operational Database cluster.
For more information, see AWS Graviton instances in the Cloudera Operational Database.
Storage type removal from the Cloudera Operational Database
The Cloudera Operational Database has removed the storage type Cloud Storage with Caching and Data Tiering. This type resembled cloud storage with time-based priority caching, where data within a specified time range gets a higher priority, while older data is likely to be evicted.
You can now use the Cloud Storage with Caching storage type for data tiering functionality.
You must have the COD_DATATIERING entitlement to use this functionality.
For more information, see HBase Time-based Data Tiering using Persistent BucketCache.
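To make the idea of time-based priority caching concrete, the following sketch orders cached blocks for eviction so that data outside a recent time window is evicted first. It is a conceptual illustration only, with hypothetical names and data; it does not reflect how HBase BucketCache is implemented or configured.

```python
from datetime import datetime, timedelta


def eviction_order(blocks, now, hot_window):
    """Return block names in the order they would be evicted.

    Blocks older than hot_window ("cold") are evicted first, oldest first;
    blocks within the window ("hot") are kept for as long as possible.
    """
    def key(block):
        name, written_at = block
        is_hot = now - written_at <= hot_window
        return (is_hot, written_at)

    return [name for name, _ in sorted(blocks, key=key)]


# Illustrative blocks as (name, write time) pairs.
now = datetime(2025, 2, 1)
blocks = [
    ("b1", now - timedelta(days=10)),
    ("b2", now - timedelta(days=1)),
    ("b3", now - timedelta(days=30)),
]
order = eviction_order(blocks, now, hot_window=timedelta(days=7))
```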
Terraform
Version 0.8.1 of the Cloudera Terraform provider is released. For more information about the latest changes and improvements, see the changelog.