What's New
Major features and updates for the Cloudera Machine Learning data service.
October 10, 2024
Release notes and fixed issues for version 2.0.46-b210.
New Features / Improvements
- Model Hub is generally available (GA): Model Hub is a catalog of top-performing
LLM and generative AI models. You can now easily import the models listed in the
Model Hub into the Model Registry and then deploy them using the Cloudera AI Inference
service.
For more information, see Using Model Hub.
- Cloudera AI Inference Enhancements:
- Added support for NVIDIA NIM profiles that require L40S GPUs.
- Made the auto-scale configuration rendered in the UI during model endpoint creation more user-friendly. (DSE-38845)
- Optimized the Cloudera AI Inference UI service to be more responsive.
- User-actionable error messages are now rendered in the Cloudera AI Inference service UI.
For more information, see Using Cloudera AI Inference service.
Fixed Issues
- Addressed scaling issues with web services to support a high number of concurrent active users. (DSE-39597)
- CVE fixes - This release includes numerous security fixes for critical and high Common Vulnerabilities and Exposures (CVEs).
- Fixed CORS issue to ensure that DELETE/PATCH V1 API can be used from within a workspace. (DSE-39357)
- Made the NGC service key used to download NVIDIA’s optimized models more restrictive. (DSE-39475)
- Previously, users were unable to copy the model-id from AI Inference UI. This issue is now resolved. (DSE-38889)
- Authorization issues related to the listing of AI Inference applications have been addressed. (DSE-39386)
- Fixed an issue to ensure that instance type validation is correctly carried out during the creation of a new model endpoint. (DSE-39634)
- Added required validation rules for the creation of a new model endpoint. (DSE-38412)
- Addressed an issue where the model list appeared empty when navigating from Registered Models to model deployment. (DSE-39634)
October 8, 2024
Release notes and fixed issues for Cloudera AI Inference service version 1.2.0-b73.
New Features / Improvements
- Cloudera AI Inference: Cloudera AI Inference
is now a fully supported data service. Cloudera AI Inference service is a production-grade
serving environment for traditional machine learning models, generative AI models, and
Large Language Models (LLMs). It is designed to handle the challenges of production
deployments, such as high availability, fault tolerance, and scalability. The service is
now available to carry out inference on the following categories of models:
- Optimized open-source Large Language Models.
- Traditional machine learning models, such as classification and regression models. Models need to be imported into the Cloudera Machine Learning Model Registry to be served using the Cloudera AI Inference service.
For more information, see Using Cloudera AI Inference service.
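The following is a minimal sketch of what inference against a deployed LLM endpoint can look like from Python, assuming the endpoint exposes an OpenAI-compatible chat completions route. The URL shape, model ID, and CDP_TOKEN environment variable are placeholders, not the documented interface; see Using Cloudera AI Inference service for the exact routes and authentication flow.

```python
# Minimal sketch (placeholders, not the documented interface): querying a
# deployed LLM endpoint over an OpenAI-compatible chat completions route.
import os

import requests

ENDPOINT = "https://<inference-app-domain>/<endpoint-path>/v1/chat/completions"
TOKEN = os.environ["CDP_TOKEN"]  # a valid CDP access token (JWT)

response = requests.post(
    ENDPOINT,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "model": "<model-id>",  # the model ID shown in the AI Inference UI
        "messages": [{"role": "user", "content": "What does this service do?"}],
        "max_tokens": 64,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```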
September 26, 2024
Release notes and fixed issues for version 2.0.46-b200.
New Features / Improvements
- Model Hub (Technical Preview): Model Hub is a catalog of top-performing LLM and
generative AI models. You can now easily import the models listed in the Model Hub into
the Cloudera Machine Learning Model Registry and then deploy them using the Cloudera AI
Inference service. This streamlines the workflow of developers working on AI use cases by
simplifying the process of discovering, deploying, and testing models.
For more information, see Using Model Hub.
- Registered Models: Registered Models offers a single view for models stored in
Cloudera Machine Learning Model Registry instances across Cloudera Environments and
facilitates easy deployment to the Cloudera AI Inference service. When you import models
from Model Hub, the models are listed under Registered Models. This page lists all
imported models and associated metadata, such as the model’s associated environment,
visibility, owner name, and creation date. You can click on any model to view details
about that model and its versions, and deploy any specific version of the model to the
Cloudera AI Inference service.
For more information, see Using Registered Models.
- Cloudera AI Inference (Technical Preview): Cloudera AI Inference service is a
production-grade serving environment for traditional machine learning models, generative
AI models, and LLMs. It is designed to handle the challenges of production deployments,
such as high availability, fault tolerance, and scalability. The service is now available
for users to carry out inference on the following three categories of models:
- TRT-LLMs: LLMs that are optimized for the NVIDIA TensorRT (TRT) engine and available in the NVIDIA GPU Cloud catalog, also known as the NGC catalog.
- LLMs available through the Hugging Face Hub.
- Traditional machine learning models, such as classification and regression models. Models need to be imported into the Cloudera Machine Learning Model Registry to be served using the Cloudera AI Inference service.
- Cloudera Machine Learning Model Registry Standalone API: The Cloudera Machine Learning
Model Registry Standalone API is now fully supported. This new API is available from the
Cloudera Machine Learning Model Registry service to import, get, update, and delete models
without relying on the Cloudera Machine Learning Workspace service (see the sketch after
this list).
For more information, see Cloudera Machine Learning Model Registry Standalone API.
- New Amazon S3 Data Connection: A new Amazon S3 object store connection is
automatically created for Cloudera Machine Learning Workspaces to make it easier to
connect to the data stored within the same environment. Additional data connections to
other S3 locations can be configured manually (see the sketch after this list).
For more information, see Setting up Amazon S3 data connection.
- Enhancements to Synced Team: Team administrators and Site administrators can now
add multiple groups to a synced team, view members of a group, delete a group within a
team, update roles for a group within a team, and update a custom role for a member within
a group.
For more information, see Managing a Synced Team.
- Auto synchronization of Cloudera Machine Learning Model Registry with a Cloudera Machine
Learning Workspace: If you deploy a Cloudera Machine Learning Model Registry in an
environment that contains one or more Cloudera Machine Learning Workspaces, the Model
Registry is now auto-discovered and periodically synchronized by the Cloudera AI Inference
service and the Cloudera Machine Learning Workspaces; no manual synchronization is
required. Cloudera Machine Learning Workspaces are auto-synchronized every five minutes,
and the Cloudera AI Inference service is auto-synchronized every 30 seconds.
For more information, see Synchronizing the Cloudera Machine Learning Model Registry with a Cloudera Machine Learning Workspace.
- Environment: Support for Environment V2 has been added for Cloudera Machine Learning Workspaces.
- Kubernetes: Support for AKS 1.29 and EKS 1.29 has been added.
- Metering: Support for Metering V2 has been added for new Cloudera Machine Learning Workspaces.
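As referenced above, the following is a minimal sketch of how the Cloudera Machine Learning Model Registry Standalone API might be exercised over plain HTTP. The base URL, routes, and response fields below are illustrative assumptions, not the documented contract; see Cloudera Machine Learning Model Registry Standalone API for the actual endpoints and payloads.

```python
# Minimal sketch: the base URL, routes, and response fields are
# assumptions for illustration, not the documented contract.
import os

import requests

BASE_URL = "https://<model-registry-domain>/api"  # placeholder base path
HEADERS = {"Authorization": f"Bearer {os.environ['CDP_TOKEN']}"}

# List registered models, then fetch details for the first one.
models = requests.get(f"{BASE_URL}/models", headers=HEADERS, timeout=30).json()
model_id = models["models"][0]["id"]
detail = requests.get(f"{BASE_URL}/models/{model_id}", headers=HEADERS, timeout=30).json()
print(detail["name"])

# Deleting a model would be a DELETE against the same hypothetical route:
# requests.delete(f"{BASE_URL}/models/{model_id}", headers=HEADERS, timeout=30)
```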
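Similarly, here is a minimal sketch of reading data through the auto-created Amazon S3 data connection from a workspace session. It assumes the workload's IAM role supplies the S3 credentials; the bucket and key names are placeholders. See Setting up Amazon S3 data connection for the supported workflow.

```python
# Minimal sketch, assuming the session's IAM role provides S3 credentials.
# The bucket and key names are placeholders for illustration.
import boto3

s3 = boto3.client("s3")
obj = s3.get_object(Bucket="<environment-data-bucket>", Key="data/sample.csv")
print(obj["Body"].read(1024).decode("utf-8"))  # preview the first 1 KiB
```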
Fixed Issues
- DSE-35779: Fixed a race condition in the workload pod between the kinit container writing the JWT file and the engine container reading it.
- DSE-37065: Previously, API V2 did not allow collaborators to be added as admin. This issue is now resolved.
- DSE-33647: Previously, workspace instances reset to default when upgraded. This issue is now resolved.
July 17, 2024
Release notes and fixed issues for version 2.0.45-b86.
Fixed Issues
- Previously, CDP teams with the MLBusinessUser role were not available for Synced Teams in Cloudera Machine Learning workspaces. This issue is now resolved.