Cloudera on Cloud: January 2025 Release Summary

The Release Summary of Cloudera on cloud summarizes major features introduced in Cloudera Management Console, Cloudera Data Hub, and data services.

Cloudera AI

Cloudera AI 2.0.46-b302 introduces the following changes:

New features/improvements

Migrated Cloudera AI Workbench, AI Registry, and Cloudera AI Inference service images to Chainguard to address CVEs.
Added APIv2 support for Enhanced Group Sync.
Added support for creating AMPs (Cloudera Accelerators for Machine Learning Projects) using APIv2. Previously, this option was available only in the UI.
Added support for H100 GPU instances for Cloudera AI Inference service on Azure.
Added support for AKS workload identity.
Added support for AWS M7a, M7i, C7a, C7i, R7a, R7i instance families.
Added support for Cloudera AI Inference service on EU Control Plane.
Added support for EKS 1.30.
Added support for AKS 1.30.
Hugging Face support (Technical Preview): You can now import text-generating language models from Hugging Face and deploy them on Cloudera AI Inference service.
Added profiles for Hugging Face models and multi-modal models in the Model Hub catalog.
Updated existing model manifests in the catalog after upgrading the NIM version in Cloudera AI Inference service.
Enhanced error messages related to model import failure in the Model Hub UI.
Enhanced AI Registry to support multi-modal models.
Added runtime support for Llama 3.2 11B and 90B Vision Language Model NIMs to ensure that they can be deployed using AI Inference. Only model profiles optimized for the H100 GPU are supported for these two models in this release.
Llama 3 NIM is no longer supported because Llama 3.1 and Llama 3.2 are now available.
Added support for Diagnostic Bundles in Cloudera AI Inference service.
Upgraded text-generating and embedding NIMs.
Added Code Sample functionality for endpoints deployed using Cloudera AI Inference service.
Model endpoint replica events can now be viewed on the Model Endpoint details UI.
You can now add multiple Docker credentials using the UI or API, which Cloudera AI can use to fetch custom ML Runtimes from a secure repository. For more information, see Add Docker registry credentials and certificates.

Fixed Issues

Previously, some Cloudera AI Inference service clusters did not have the ‘creationDate’ field. This field is now added. (DSE-38817)
Previously, the deletion of backup for older workspaces was failing. This issue is now resolved. (DSE-41031)
Previously, deleting a workbench backup created by a deleted user displayed an error. This issue is now resolved. (DSE-41052)
Multiple UI improvements were made to the Create, Read, Update, and Delete operations of Cloudera AI Inference service and to deploying or editing a model endpoint.
The model_name field is now displayed instead of model_id in the Endpoint Details UI. (DSE-38937)
Previously, the NIM model profile environment variable was only assigned for LLMs. Now support for Model Profile override is added for Embedding and Reranker NIMs. (DSE-40508)
Previously, there was an issue with rendering of existing instance type in the “Edit Endpoint” UI. This issue is now resolved. (DSE-40636)
Validated all node group (instance type) selection from UI. (DSE-40754)
Previously, NGC manifest components were missing from the download. This issue is now resolved. (DSE-41055)
The Create ML Serving application now enables the public load balancer. (DSE-41305)
The Instance Type field in the Edit Model Endpoint UI is no longer mandatory. (DSE-41278)
Added a force delete option for deleting the Cloudera AI Inference service using the UI. (DSE-41035)
The Cloudera AI Inference service UI now displays optimization profile details. (DSE-40927)
You can now create, download, and delete log archives for Cloudera AI Inference service. (DSE-40921)
The Test Model UI now fails gracefully when the replica is scaled down to zero for a model deployed using Cloudera AI Inference service. (DSE-40957)
Previously, the Storage initializer had the wrong task values. This issue is now resolved. (DSE-41058)
The storage initializer can now handle more than two directories for NIM artifacts. (DSE-40986)
Removed Llama 3 runtimes. (DSE-40956)
Addressed an SQL injection issue in AI Registry that allowed authenticated but unauthorized users to perform Create, Read, Update, and Delete operations on AI Registry’s metadata tables. (DSE-41542)

Cloudera Data Catalog

Cloudera Data Catalog 2025-M1 introduces the following changes:

Containerized architecture for profilers
Cloudera Data Catalog introduces a new containerized architecture for Profilers for Compute Cluster enabled environments, providing a scalable environment:

Only as many Kubernetes pods as needed are launched, based on the size of the database to be profiled. You pay for cloud resources only while the profilers use them.
Deployment of the containerized profiler architecture is also more streamlined and quicker than that of the previous VM-based architecture.
Moreover, because the architecture is containerized, later upgrades can be carried out more easily, without the multiple dependencies of the VM-based architecture, which used multiple services.
Profilers now also support the following file formats:
- VM-based environments: CSV, ORC
- Compute Cluster enabled environments: CSV and Parquet
  - Hive Column Profilers and Cluster Sensitivity Profilers also support profiling Iceberg Tables, including with On-Demand Profilers.
Important
- Currently, Kubernetes based profilers are only supported in AWS environments.
- In Compute Cluster enabled environments, profilers only support tables which are stored on AWS S3 storage.

For more information, see Profiler architecture in Compute Cluster enabled environment.

Note
The VM-based architecture (using the Cloudera Data Hub Cluster) is deprecated as of this release but remains available until support for CDP Public Cloud 7.2.18 ends (Sept 2025). Therefore, Cloudera Data Hub based profilers will not be available in CDP Public Cloud versions after 7.2.18. Only Compute Cluster enabled environments will be able to run Cloudera Data Catalog profilers after version 7.2.18.

For more information, see Cloudera Support lifecycle policy.

Redesigned Dashboard menu
A new Dashboard is introduced to give an overview of your data lakes and profilers, including:
- Data lake type and status
- Profiler status
- Last 10 assets bookmarked by you
- Last run of profiler
- Number of assets profiled

Redesigned Search menu
The Search menu is reorganized so that information is easier to access. You can expand each entity result to see its qualified name, database, classification, and assigned terms. You can use these details to check whether your query returns the expected results.

Improved display of comments in Asset Details
Following this release, you can hover over the Comment field for individual schema entries in Asset Details to preview longer comments without opening them.

Common time format
Asset Details and other menus will use the same time format for a more readable overview: MM/DD/YYYY hh:mm A.
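
As an illustration of what the MM/DD/YYYY hh:mm A pattern produces, the following Python sketch formats a timestamp the same way. It assumes the tokens follow the common convention in which hh is a zero-padded 12-hour value and A is an AM/PM marker; the release notes do not define the tokens, and the function name is hypothetical.

```python
from datetime import datetime


def format_catalog_time(ts: datetime) -> str:
    """Render a timestamp as MM/DD/YYYY hh:mm A (assumed token meanings)."""
    # Compute the AM/PM marker by hand so the result does not depend on locale.
    meridiem = "AM" if ts.hour < 12 else "PM"
    # %m/%d/%Y -> MM/DD/YYYY, %I -> zero-padded 12-hour clock, %M -> minutes.
    return f"{ts.strftime('%m/%d/%Y %I:%M')} {meridiem}"


print(format_catalog_time(datetime(2025, 1, 31, 14, 5)))  # 01/31/2025 02:05 PM
```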

Removed features
The following features have been removed:

Tag Rules > Luhn check algorithm
Tag Rules > File-based Allow and Deny list
Tag Rules > Lookup files
Tagging multiple assets in the Search menu
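
For readers unfamiliar with the removed Luhn check: the Luhn algorithm is a checksum commonly used to validate digit strings such as payment card numbers. The following Python sketch shows the standard algorithm only. It is not Cloudera's implementation, and the function name is hypothetical.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digit string passes the standard Luhn checksum."""
    cleaned = number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk the digits from right to left, doubling every second digit.
    for position, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # Same as summing the two digits of the product.
        total += digit
    return total % 10 == 0


print(luhn_is_valid("79927398713"))  # True
print(luhn_is_valid("79927398710"))  # False
```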

Cloudera Data Hub

The latest version of Cloudera Data Hub introduces the following changes:

Added YARN recommendations to Cloudera Data Hub scaling activities CLI output
The list of Cloudera Data Hub scaling activities in the output of the cdp datahub list-scaling-activities command has been extended with YARN recommendations. For more information, see Managing autoscaling.

Cloudera Management Console

The latest version of Cloudera Management Console introduces the following changes:

Added support for upgrading CMK enabled Azure Single Servers to Flexible Server
A managed identity is required to upgrade a CMK enabled Azure Single Server to Azure Flexible Server. You can now add a managed identity to an environment whose Azure Single Server is already encrypted with a Customer Managed Key, which makes the upgrade to Azure Flexible Server possible for environments already configured with a CMK.
For more information, see Upgrading Azure Single Server to Flexible Server Prerequisites.

Imported Compute Engine images encrypted with CMEK
If you set a CMEK for your GCP environment, then the imported Compute Engine images will be encrypted with the CMEK instead of the default Google-managed key.
For more information, see Adding a customer managed encryption key for GCP.

Enabling Secure Boot option for GCP (Preview)
By default, VMs on GCP are created without the Secure Boot option enabled. You can request that Secure Boot be enabled for subsequently provisioned VMs.
For more information, see Enabling Secure Boot option for GCP.

Note
You need to contact Cloudera Support to have this feature enabled.

Cloudera Observability

The latest version of Cloudera Observability introduces the following changes:

Resource Efficiency Analysis: Added support for Spark. For more information, see Query and job resource optimization using resource efficiency analysis.
GPU support enhancements: Added GPU and GPU memory support to the Cloudera AI summary workspace and node-level allocation or consumption. For more information, see Resource utilization across workspaces and Resource utilization within the workspace.

Cloudera Operational Database

Cloudera Operational Database 1.48 introduces the following changes:

Cloudera Operational Database supports Security-Enhanced Linux (SELinux) enforcement
Cloudera Operational Database supports creating a database with SELinux enforcement using the CDP CLI.

Note
This feature is under technical preview. To use this feature, you must have the CDP_SECURITY_ENFORCING_SELINUX entitlement in your Cloudera environment. Contact Cloudera support if you do not have this entitlement.

SELinux allows you to set access control through policies. You can set the SELinux mode when creating a new operational database by using the seLinux parameter in the create-database command.
The supported SELinux modes are:

ENFORCING: Enables SELinux in enforced mode, actively applying security policies.
PERMISSIVE (default): Sets SELinux to permissive mode, logging any security violations without enforcing policies. The PERMISSIVE mode is applied by default if you do not define the seLinux parameter.

The following example shows the usage of the seLinux parameter.

opdb create-database --environment-name [******] --database-name [******] --security-request '{"seLinux": string}'
opdb create-database --environment-name cod-7218-micro1 --database-name testDB --security-request '{"seLinux": "ENFORCING"}'
opdb create-database --environment-name cod-7218-micro1 --database-name testDB --security-request '{"seLinux": "PERMISSIVE"}'

For more information, see CDP CLI documentation and Setting SELinux Mode.
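
When the command is generated from a script, the --security-request payload can be built and validated before the CLI is called. The following Python sketch assembles the argument list shown in the examples above. The helper name is hypothetical, the command tokens mirror the examples as written (prepend the CLI executable your installation uses), and the validation covers only the two modes listed in this section.

```python
import json
import shlex

# The two SELinux modes listed in this release summary.
SUPPORTED_MODES = ("ENFORCING", "PERMISSIVE")


def build_create_database_args(environment_name, database_name, selinux_mode=None):
    """Assemble create-database arguments (hypothetical helper, not part of the CLI)."""
    args = [
        "opdb", "create-database",
        "--environment-name", environment_name,
        "--database-name", database_name,
    ]
    if selinux_mode is not None:
        if selinux_mode not in SUPPORTED_MODES:
            raise ValueError(f"Unsupported SELinux mode: {selinux_mode!r}")
        # Serialize the payload as JSON, matching the documented example.
        args += ["--security-request", json.dumps({"seLinux": selinux_mode})]
    # If no mode is given, the parameter is omitted and PERMISSIVE applies by default.
    return args


args = build_create_database_args("cod-7218-micro1", "testDB", "ENFORCING")
print(shlex.join(args))  # Shell-quoted form of the command line
```

Omitting selinux_mode leaves the --security-request option out entirely, which relies on the documented PERMISSIVE default rather than setting it explicitly.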

Assign ODAdmin role at the database level
Cloudera Operational Database supports setting a user as an ODAdmin at the database level. In earlier versions of Cloudera Operational Database, you could set the ODAdmin role at the environment level only. For better usability and enhanced security, you can now set it at the database level as well.