What's New

Major features and updates for the Cloudera Machine Learning data service.

October 10, 2024

Release notes and fixed issues for version 2.0.46-b210.

New Features / Improvements

  • Model Hub is generally available (GA): Model Hub is a catalog of top-performing LLM and generative AI models. You can now easily import the models listed in Model Hub into the Model Registry and then deploy them using the Cloudera AI Inference service.

    For more information, see Using Model Hub.

  • Cloudera AI Inference service Enhancements:
    • Added support for the NVIDIA NIM profiles required for L40S GPU models.
    • Made the auto-scale configuration rendered in the UI during model endpoint creation more user-friendly. (DSE-38845)
    • Optimized the Cloudera AI Inference service UI to be more responsive.
    • User-actionable error messages are now rendered in the Cloudera AI Inference service UI.

      For more information, see Using Cloudera AI Inference service.

Fixed Issues

  • Addressed scaling issues with web services to support high active user concurrency (DSE-39597).
  • CVE fixes: This release includes numerous security fixes for critical and high Common Vulnerabilities and Exposures (CVEs).
  • Fixed a CORS issue to ensure that the DELETE and PATCH V1 APIs can be used from within a workspace. (DSE-39357)
  • Made the NGC service key used to download NVIDIA's optimized models more restrictive. (DSE-39475)
  • Previously, users were unable to copy the model ID from the Cloudera AI Inference service UI. This issue is now resolved. (DSE-38889)
  • Authorization issues related to the listing of AI Inference applications have been addressed. (DSE-39386)
  • Fixed an issue to ensure that instance type validation is correctly carried out during the creation of a new model endpoint. (DSE-39634)
  • Added required validation rules for the creation of a new model endpoint. (DSE-38412)
  • Addressed an issue around empty model list during navigation from registry models to deployment of models. (DSE-39634)

October 8, 2024

Release notes and fixed issues for Cloudera AI Inference service version 1.2.0-b73.

New Features / Improvements

  • Cloudera AI Inference service: Cloudera AI Inference service is now a fully supported data service. Cloudera AI Inference service is a production-grade serving environment for traditional machine learning models, generative AI models, and Large Language Models (LLMs). It is designed to handle the challenges of production deployments, such as high availability, fault tolerance, and scalability. The service is now available to carry out inference on the following categories of models:
    • Optimized open-source Large Language Models.
    • Traditional machine learning models like classification, regression, and so on. Models need to be imported to the Cloudera Model Registry to be served using the Cloudera AI Inference service.

    For more information, see Using Cloudera AI Inference service.

September 26, 2024

Release notes and fixed issues for version 2.0.46-b200.

New Features / Improvements

  • Model Hub (Technical Preview): Model Hub is a catalog of top-performing LLM and generative AI models. You can now easily import the models listed in Model Hub into the Cloudera Model Registry and then deploy them using the Cloudera AI Inference service. This streamlines the workflow of developers working on AI use cases by simplifying the process of discovering, deploying, and testing models.

    For more information, see Using Model Hub.

  • Registered Models: Registered Models offers a single view of the models stored in Cloudera Model Registry instances across Cloudera Environments and facilitates easy deployment to the Cloudera AI Inference service. When you import models from Model Hub, the models are listed under Registered Models. This page lists all imported models and associated metadata, such as each model's associated environment, visibility, owner name, and creation date. You can click on any model to view details about that model and its versions, and deploy any specific version of the model to the Cloudera AI Inference service.

    For more information, see Using Registered Models.

  • Cloudera AI Inference service (Technical Preview): Cloudera AI Inference service is a production-grade serving environment for traditional machine learning models, generative AI models, and LLMs. It is designed to handle the challenges of production deployments, such as high availability, fault tolerance, and scalability. The service is now available for users to carry out inference on the following three categories of models:
    • TRT-LLMs: LLMs that are optimized for the TensorRT (TRT) engine and available in the NVIDIA GPU Cloud catalog, also known as the NGC catalog.
    • LLMs available through Hugging Face Hub.
    • Traditional machine learning models like classification, regression, and so on. Models need to be imported to the Cloudera Model Registry to be served using the Cloudera AI Inference service.
  • Cloudera Model Registry Standalone API: The Cloudera Model Registry Standalone API is now fully supported. This new API is available from the Cloudera Model Registry service to import, get, update, and delete models without relying on the Cloudera Machine Learning Workspace service.

    For more information, see Cloudera Model Registry Standalone API.
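    As a rough illustration of working with a standalone registry API from a script, the sketch below builds endpoint URLs and issues authenticated requests. The path layout (`/api/v2/models`), host, and token here are illustrative assumptions, not the documented contract; consult the Cloudera Model Registry Standalone API reference for the real endpoints and payloads.

    ```python
    # Hypothetical sketch of calling a standalone model-registry REST API.
    # The path layout, host, and bearer token are assumptions for
    # illustration only.
    import json
    from urllib import request

    def model_url(base_url, model_id=None):
        """Build a registry models endpoint URL (assumed path layout)."""
        url = base_url.rstrip("/") + "/api/v2/models"
        if model_id:
            url += "/" + model_id
        return url

    def call_registry(url, token, method="GET"):
        """Issue a registry request with bearer auth; return parsed JSON."""
        req = request.Request(
            url, method=method,
            headers={"Authorization": "Bearer " + token},
        )
        with request.urlopen(req) as resp:
            return json.load(resp)

    # Example (not executed here): list models, fetch one, delete one.
    # models = call_registry(model_url("https://registry.example.com"), token)
    # one = call_registry(model_url("https://registry.example.com", "abc123"), token)
    # call_registry(model_url("https://registry.example.com", "abc123"), token, method="DELETE")
    ```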

  • New Amazon S3 Data Connection: A new Amazon S3 object store connection is automatically created for Cloudera Machine Learning Workspaces to make it easier to connect to the data stored within the same environment. Additional data connections to other S3 locations can be configured manually.

    For more information, see Setting up Amazon S3 data connection.
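    A minimal sketch of reading an object through such a connection from inside a workspace session is shown below. The bucket URI is a placeholder, and the sketch assumes the session already carries valid AWS credentials for the environment's object store; see the data connection documentation for the supported access patterns.

    ```python
    # Sketch of reading from an S3 object store connection in a session.
    # The URI below is a placeholder; credentials are assumed to come
    # from the workspace environment.

    def split_s3_uri(uri):
        """Split an s3://bucket/key URI into a (bucket, key) pair."""
        if not uri.startswith("s3://"):
            raise ValueError("expected an s3:// URI")
        bucket, _, key = uri[len("s3://"):].partition("/")
        return bucket, key

    # Example (requires boto3 and environment credentials; not executed here):
    # import boto3
    # bucket, key = split_s3_uri("s3://example-env-bucket/data/sample.csv")
    # body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    ```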

  • Enhancements to Synced Team: Team administrators and Site administrators can now add multiple groups to a synced team, view members of a group, delete a group within a team, update roles for a group within a team, and update a custom role for a member within a group.

    For more information, see Managing a Synced Team.

  • Auto synchronization of Cloudera Model Registry with a Cloudera Machine Learning Workspace: If you deploy a Cloudera Model Registry in an environment that contains one or more Cloudera Machine Learning Workspaces, the Model Registry is now auto-discovered and periodically synchronized by the Cloudera AI Inference service and Cloudera Machine Learning Workspaces, and no manual synchronization is required. Cloudera Machine Learning Workspaces are auto-synchronized every five minutes, and the Cloudera AI Inference service is auto-synchronized every 30 seconds.

    For more information, see Synchronizing the Cloudera Model Registry with a Cloudera Machine Learning Workspace.

  • Environment: Support for Environment V2 has been added for Cloudera Machine Learning Workspaces.
  • Kubernetes: Support for AKS 1.29 and EKS 1.29 has been added.
  • Metering: Support for Metering V2 has been added for new Cloudera Machine Learning Workspaces.

Fixed Issues

  • DSE-35779: Fixed a race condition in the workload pod between the kinit container writing the JWT file and the engine container reading it.
  • DSE-37065: Previously, API V2 did not allow collaborators to be added as admin. This issue is now resolved.
  • DSE-33647: Previously, workspace instances reset to default when upgraded. This issue is now resolved.

July 17, 2024

Release notes and fixed issues for version 2.0.45-b86.

Fixed Issues

  • Previously, CDP teams with the MLBusinessUser role were not available for Synced Teams in CML workspaces. This issue is now resolved.