What's New

Cloudera AI Workbench 2.0.58-b118, Cloudera AI Registry 1.13.0-b58, and Cloudera AI Inference service 1.16.0-b19 introduce new features and functional updates.

Cloudera AI Inference service

Serving Applications on Cloudera AI Inference service GA
Cloudera AI Inference service now provides a production-grade serving environment for hosting applications. Applications deployed on Cloudera AI Inference service can scale alongside Model Endpoints, providing a scalable solution for various components. For more information, see Serving Applications on Cloudera AI Inference service.
Redesigned Configurations tab on Model Endpoint Details
The Model Endpoint Details page now features a reorganized Configurations tab as the default view. Configuration settings, including Served Models, Access Control, Resource Profile, Environment Variables, vLLM Arguments, and Tags, are accessible in a left-side navigator, each in a dedicated read-only panel with inline edit access to the endpoint wizard. For more information, see Viewing details of a Model Endpoint using UI.
Archiver component migrated to Azure Workload Identity
The archiver component now uses Azure Workload Identity instead of the deprecated Azure Pod Identity to authenticate with Azure Blob Storage. This migration is fully automated during cluster provisioning and requires no user action. Each cluster now consumes one additional federated credential on the shared loggerIdentity managed identity, increasing the total used credentials from two to three. This additional credential counts toward the strict Azure limit of 20 federated credentials per managed identity.
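The credential arithmetic above can be sketched in Python as a rough capacity check. This is an illustration only: it assumes that every cluster sharing the managed identity consumes exactly three federated credentials and that nothing else uses the identity. The function and constant names are hypothetical.

```python
# Values taken from the release note; everything else is an illustrative assumption.
AZURE_FEDERATED_CREDENTIAL_LIMIT = 20  # Azure limit per managed identity
CREDENTIALS_PER_CLUSTER = 3            # two existing credentials plus the archiver's new one


def max_clusters_per_identity(limit: int = AZURE_FEDERATED_CREDENTIAL_LIMIT,
                              per_cluster: int = CREDENTIALS_PER_CLUSTER) -> int:
    """Return how many clusters fit on one managed identity under the stated assumptions."""
    if per_cluster <= 0:
        raise ValueError("per_cluster must be positive")
    # Integer division: a cluster that cannot get all of its credentials does not fit.
    return limit // per_cluster
```

Under these assumptions, calling `max_clusters_per_identity()` shows how the extra credential reduces the number of clusters that can share one identity compared with `max_clusters_per_identity(20, 2)`.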
Standardized size limit for YAML request payloads
For enhanced platform security, Cloudera AI now automatically enforces a 10 MB size limit on all incoming YAML request payloads. This protective threshold is ample for standard configurations, including highly complex deployEndpoint payloads with extensive enum definitions. Typical user workflows will not be impacted, and no administrator action is required.
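A client can check a payload against this limit before sending it. The following Python sketch is not part of any Cloudera client library. It assumes the limit is interpreted as 10 MiB (10 * 1024 * 1024 bytes) and applies to the UTF-8 encoded request body; both points are assumptions, and the function name is hypothetical.

```python
# Assumption: "10 MB" is read as 10 MiB; the note does not state the exact byte count.
MAX_YAML_PAYLOAD_BYTES = 10 * 1024 * 1024


def yaml_payload_within_limit(payload: str, limit: int = MAX_YAML_PAYLOAD_BYTES) -> bool:
    """Return True if the UTF-8 encoded payload is no larger than the limit."""
    return len(payload.encode("utf-8")) <= limit
```

Measuring the encoded bytes rather than the character count matters when the YAML contains non-ASCII text.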
Advanced filters and refreshed UI layout for Model Hub
The Model Hub now features advanced facet filters, allowing you to browse and filter models quickly by category and source provider, such as Hugging Face and NGC. Additionally, the interface is now redesigned with a modern layout, featuring cleaner model cards, an updated header, and optimized screen spacing for improved usability.
UI for Inference logging configuration
You can now enable, configure, and disable input/output (I/O) logging for AI Inference services directly from the Service Details page. To facilitate compliance and auditing, the interface allows you to define a Datalake CRN and custom storage path, and provides a resolved storage URL for easy reference. The default Datalake for the environment is used automatically if you do not specify one. For more information, see Configuring Cloudera AI Inference service I/O logging.
Kubernetes 1.32 Certification
Cloudera AI Inference service is now fully certified and supported on Kubernetes 1.32. Users are strongly advised not to upgrade their clusters to later versions of Kubernetes at this time, as doing so may cause service instability or compatibility issues. Support for Kubernetes 1.34 is planned for a future release.
Knox API key support
Cloudera AI Inference service now accepts the new Knox API keys, enabling long-lived connectivity to Cloudera AI Inference service Model Endpoints. For more information, see Configuring Knox API key support for Cloudera AI Inference service.
Note
The new Knox API key support is available only on Cloudera Runtime version 7.3.2.

Cloudera AI Workbench

Redesigned User Settings navigation
The AI Workbench User Settings page is now consolidated from six separate sub-tabs into a single, streamlined view featuring a vertical sidebar menu. This UI-only update provides faster, low-click access to essential configurations, such as API keys, environment variables, and team settings, directly from a unified page. Aligned with the updated Cloudera design system, this improvement is purely focused on usability and introduces no behavioral or API functionality changes.
Self-service Run As service account assignment for contributors
Project Contributors can now independently assign service accounts (machine users) to workloads using the "Run as" feature, eliminating the need for Site Administrator intervention. To maintain platform security, Contributors are strictly limited to selecting service accounts with an Operator role within the project or team. Service accounts with Admin privileges remain restricted to Site Admins, Project Admins, and Project Owners. This self-service capability applies to creating and updating jobs, applications, and models across both v1 and v2 APIs. For more information, see Creating a Workload as a Contributor.
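The Run As rule described above can be modeled as a small permission check. The Python sketch below is illustrative and is not the product's implementation: the role strings, the assumption that elevated roles may assign any service account, and the assumption that all other roles are denied are not confirmed by this note.

```python
# Roles that may assign service accounts with Admin privileges, per the release note.
ELEVATED_ROLES = {"Site Admin", "Project Admin", "Project Owner"}


def can_assign_run_as(user_role: str, service_account_role: str) -> bool:
    """Model the Run As rule (role strings and the fallback behavior are illustrative)."""
    if user_role in ELEVATED_ROLES:
        # Assumption: elevated roles may assign any service account.
        return True
    if user_role == "Contributor":
        # Contributors are limited to service accounts with an Operator role.
        return service_account_role == "Operator"
    # Assumption: every other role cannot use Run As.
    return False
```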
Automated application restart functionality
Failed applications are now automatically restarted up to three times, with a five-minute delay between attempts, to minimize downtime. This feature is enabled by default, but Site Administrators can disable it globally in the Site Administration settings. For more information, see Disabling global Application restarts.
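The restart policy can be pictured as a bounded retry loop. The Python sketch below only illustrates the stated limits (three restarts, five minutes apart) and does not describe how the platform implements them. The `start_app` and `sleep` hooks are hypothetical and are injected so the logic can be exercised without waiting.

```python
from typing import Callable

MAX_RESTARTS = 3             # from the release note
RESTART_DELAY_SECONDS = 300  # five minutes, from the release note


def run_with_restarts(start_app: Callable[[], bool],
                      sleep: Callable[[float], None],
                      enabled: bool = True) -> int:
    """Start an app and restart it on failure; return the number of restarts used.

    start_app returns True on success and False on failure (hypothetical hook).
    Returns -1 if the app is still failing once no more restarts are allowed.
    """
    if start_app():
        return 0
    if not enabled:
        # Models the global setting being disabled by a Site Administrator.
        return -1
    for restart in range(1, MAX_RESTARTS + 1):
        sleep(RESTART_DELAY_SECONDS)  # wait before each restart attempt
        if start_app():
            return restart
    return -1
```

In real use, `time.sleep` could be passed as the `sleep` argument.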
Refined Project list filters and scoping
The Projects list page now features a refined My Projects filter that displays only projects explicitly owned by the logged-in user. Shared work moves to a new My Team Projects filter for individual collaborators and team members. This behavioral change makes it easier to isolate personal projects without scrolling through shared repositories.

Cloudera AI Registry

Cloudera AI Registry in-place upgrades
Cloudera AI Registry can now be upgraded in place using a Helm upgrade on the existing release within the same namespace. This approach preserves all vital Kubernetes resources, including the namespace footprint, Persistent Volume Claims (PVCs), StatefulSets, Secrets, and ClusterResourcePlacement (CRP) configurations, ensuring a seamless transition to a higher version.
Job-based model imports for Cloudera AI Registry
Model imports within the Cloudera AI Registry now run as dedicated background jobs instead of synchronous tasks. Each import moves through a set of lifecycle states (In Queue, In Importing, Finished, and Failed), and users can view real-time progress logs and retrigger failed attempts directly from the UI. Additionally, new Swagger-based Model Registry API v2 endpoints allow administrators to programmatically query job details, manage scheduling limits, and customize execution timeouts.
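The import lifecycle can be pictured as a small state machine. The Python sketch below uses the four state names listed above. The allowed transitions are an assumption made for illustration, because this note names the states but not the exact edges, and the identifiers are hypothetical rather than taken from the Model Registry API.

```python
from enum import Enum


class ImportState(Enum):
    IN_QUEUE = "In Queue"
    IN_IMPORTING = "In Importing"
    FINISHED = "Finished"
    FAILED = "Failed"


# Assumed transitions; the release note does not spell these out.
ALLOWED_TRANSITIONS = {
    ImportState.IN_QUEUE: {ImportState.IN_IMPORTING, ImportState.FAILED},
    ImportState.IN_IMPORTING: {ImportState.FINISHED, ImportState.FAILED},
    ImportState.FAILED: {ImportState.IN_QUEUE},  # a retriggered import is queued again
    ImportState.FINISHED: set(),                 # terminal state
}


def can_transition(current: ImportState, target: ImportState) -> bool:
    """Return True if moving from current to target is allowed in this sketch."""
    return target in ALLOWED_TRANSITIONS[current]
```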

ML Runtimes

Ubuntu 24.04 LTS upgrade for Hadoop Runtime add-on images
Hadoop Runtime add-on images are now upgraded to Ubuntu 24.04 LTS. Hadoop Runtime add-ons are incompatible with ML Runtimes lower than 2025.01.
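The compatibility rule can be expressed as a simple version comparison. The Python sketch below assumes that ML Runtime versions are strings beginning with a `YYYY.MM` prefix, optionally followed by further components; that format and the function names are assumptions made for illustration.

```python
# Lowest ML Runtime version compatible with the upgraded add-on images, per the note.
MIN_ML_RUNTIME = (2025, 1)


def parse_runtime_version(version: str) -> tuple[int, int]:
    """Parse a 'YYYY.MM' prefix; trailing components such as '.1' are ignored."""
    year, month = version.split(".")[:2]
    return int(year), int(month)


def supports_hadoop_addon(runtime_version: str) -> bool:
    """Return True if the runtime is 2025.01 or newer."""
    # Tuples compare element by element, so (2024, 10) < (2025, 1).
    return parse_runtime_version(runtime_version) >= MIN_ML_RUNTIME
```

Comparing parsed integer tuples avoids the pitfalls of comparing version strings lexicographically.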
Hardened Chainguard Runtimes
This release introduces new Hardened Edition ML Runtimes based on Chainguard images. These runtimes are designed to meet strict security standards and provide enhanced protection for your workloads. They are released behind a paywall. Hardened Runtime workloads do not use the Java version provided by the Hadoop Runtime add-on. Instead, they rely on the Java 17 installation included in the Runtime image.

Cloudera AI Control Plane

EKS and AKS 1.34 support
Cloudera AI now supports Amazon EKS and Azure AKS versions up to 1.34. This ensures compatibility with the latest Kubernetes features, upstream security patches, and cloud provider optimizations.
AWS R8i memory-optimized instances support
Cloudera AI now supports AWS R8i next-generation memory-optimized instances in the AWS commercial cloud. Administrators can now select R8i instance types when provisioning the workbench.
Cloudera AI Workbench database upgraded to PostgreSQL 17
To support platform security and lifecycle maintenance, the underlying database for the Cloudera AI Workbench has been automatically upgraded to PostgreSQL 17. The migration is managed entirely within the workbench lifecycle and requires no user action.
Standardized CPU and GPU resource profile matching
To ensure optimal performance and platform stability, Cloudera AI now requires aligned CPU and GPU configurations for model training and model serving workloads. Resource profiles must provide a balanced ratio of compute resources to prevent processing bottlenecks and optimize infrastructure utilization across your deployment.
AWS G7 instance family support
Cloudera AI now supports the AWS G7 instance family for both Cloudera AI Workbench and Cloudera AI Inference service deployments, utilizing NVIDIA RTX 6000 server edition GPUs based on the Blackwell architecture (compute capability 12.0). These high-performance compute nodes are available in supported cloud regions and are automatically displayed in the instance catalog interface when they are provisioned.
Note
Certain machine learning frameworks, such as TensorFlow, are not yet fully compatible with the G7 hardware architecture. Review your specific library documentation for compatibility requirements before deploying workloads.