September 26, 2024
Release notes and fixed issues for version 2.0.46-b200.
New Features / Improvements
- Model Hub (Technical Preview): Model Hub is a catalog of top-performing LLM and
generative AI models. You can now easily import the models listed in the Model Hub into
the AI Registry and then deploy them using the Cloudera AI Inference service. This streamlines the workflow of developers
working on AI use cases by simplifying the process of discovering, deploying, and testing
models.
For more information, see Using Model Hub.
- Registered Models: Registered Models offers a single view for models stored in
AI Registry instances across Cloudera Environments and facilitates easy deployment to the Cloudera AI Inference service. When you import models from Model Hub, the models are listed under
Registered Models. This page lists all imported models and associated metadata, such as
the model’s associated environment, visibility, owner name, and created date. You can
click any model to view details about that model and its versions, and deploy any
specific version of the model to the Cloudera AI Inference service.
For more information, see Using Registered Models.
- Cloudera AI Inference service (Technical Preview): Cloudera AI Inference service is a
production-grade serving environment for traditional, generative AI, and LLM models. It is
designed to handle the challenges of production deployments, such as high availability,
fault tolerance, and scalability. The service is now available for users to carry out
inference on the following three categories of models:
- TRT-LLMs: LLMs that are optimized for the TRT engine and available in the NVIDIA GPU Cloud catalog, also known as the NGC catalog.
- LLMs available through Hugging Face Hub.
- Traditional machine learning models, such as classification and regression models. Models must be imported to the AI Registry to be served using the Cloudera AI Inference service.
- AI Registry Standalone API: AI Registry Standalone API is now fully
supported. This new API is available from the AI Registry service to import, get,
update, and delete models without relying on the Cloudera AI Workbench service.
For more information, see AI Registry Standalone API.
- New Amazon S3 Data Connection: A new Amazon S3 object store connection is
automatically created for Cloudera AI Workbenches to make it easier to connect to the data stored
within the same environment. Data connections to other S3 locations can be
configured manually.
For more information, see Setting up Amazon S3 data connection.
- Enhancements to Synced Team: Team administrators and Site administrators can now
add multiple groups to a synced team, view members of a group, delete a group within a
team, update roles for a group within a team, and update a custom role for a member within
a group.
For more information, see Managing a Synced Team.
- Auto synchronization of AI Registry with a Cloudera AI Workbench: If you deploy an
AI Registry in an environment that contains one or more Cloudera AI Workbenches, the AI
Registry is now auto-discovered and periodically synchronized by the Cloudera AI Inference
service and Cloudera AI Workbenches, and no manual synchronization is required. Cloudera AI Workbench is auto-synchronized every five minutes and the
Cloudera AI Inference service is auto-synchronized every 30 seconds.
For more information, see Synchronizing the AI Registry with a Cloudera AI Workbench.
- Environment: Support for Environment V2 is added for Cloudera AI Workbenches.
- Kubernetes: Support for AKS 1.29 and EKS 1.29 is added.
- Metering: Support for Metering V2 is added for new Cloudera AI Workbenches.
Fixed Issues
- DSE-35779: Fixed a race condition in the workload pod between the kinit container writing the JWT file and the engine container reading it.
- DSE-37065: Previously, API V2 did not allow collaborators to be added as admin. This issue is now resolved.
- DSE-33647: Previously, workbench instances reset to default when upgraded. This issue is now resolved.
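Note on DSE-35779: a race of this kind generally arises when one process reads a file while another process is still writing it, so the reader may see an empty or partial token. These notes do not describe how the issue was fixed. The sketch below only illustrates one common way to avoid this class of problem: write to a temporary file in the same directory, then atomically rename it into place. The function names and the file name are hypothetical and are not part of any Cloudera component.

```python
import os
import tempfile


def write_token_atomically(path: str, token: str) -> None:
    """Write token to path so a concurrent reader never sees a partial file.

    Hypothetical helper for illustration; not the actual kinit container logic.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".jwt-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(token)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before publishing the file
        # os.replace swaps the file in a single step (atomic on POSIX filesystems).
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def read_token(path: str) -> str:
    """Read the token; sees either the previous complete file or the new one."""
    with open(path) as f:
        return f.read()
```

With this pattern, a reader that opens the file at any moment gets either the old complete token or the new complete token, never a half-written one. A reader that starts before the first write still has to handle a missing file, for example by retrying.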