May 15, 2024

Release notes and fixed issues for version 2.0.45-b76.

New Features / Improvements

  • Cloudera AI Inference Service (Technical Preview): The Cloudera AI Inference Service is a production-grade serving environment for traditional machine learning models, generative AI models, and LLMs. It is designed to handle the challenges of production deployments, such as high availability, fault tolerance, and scalability. The service is now available for running inference on the following three categories of models:
    • TRT-LLMs: LLMs that are optimized for the TensorRT (TRT) engine and available in the NVIDIA GPU Cloud (NGC) catalog.
    • LLMs available through Hugging Face Hub.
    • Traditional machine learning models, such as classification and regression models.

    Models must be imported into the Model Registry before they can be served using the Cloudera AI Inference Service.

    For more information, see Using Cloudera AI Inference service.

  • Cloudera Copilot (Technical Preview): Cloudera Copilot is a highly configurable AI coding assistant integrated with the JupyterLab editor. Copilot improves developer productivity by debugging code, answering questions, and generating notebooks.

    For more information, see Cloudera Copilot.

  • Model Registry API (Technical Preview): A new API is available from the Model Registry service to import, get, update, and delete models without relying on the CML Workspace service.

    For more information, see Model Registry API.

  • Ephemeral storage limit: The default ephemeral storage limit for CML Projects has been increased from 10 GB to 30 GB.

Fixed Issues

  • Fixed an error that occurred when sorting public projects by the Jobs column.
  • Fixed a bug that uploaded files to the root directory of a project instead of the specified subfolder.