Major features and updates for the Cloudera Machine Learning data service.
November 29, 2022
Release notes and fixed issues for version 2.0.34.
New Features / Improvements
- Suspend/Resume Workspace - Administrators can optimize cloud costs by suspending CML Workspaces for weekends and resume operation for business hours. See Suspend and resume ML workspaces for details.
- PBJ Runtimes - PBJ Workbench Runtimes are now GA, providing a more consistent experience with the Jupyter ecosystem. With the elimination of proprietary code you can now build a new Runtime from scratch, on your custom base image with your selected kernel, with no need to build on top of a Cloudera image anymore.
- Experiment tracking - A new Experiments feature built on MLFlow is
now available. You can now track your experiments in CML sessions with the preinstalled
mlflowlibrary and visualize and compare them in the CML application. This feature is enabled by default.
- Customizable Scratch Space - You can now configure the amount of ephemeral storage space (also known as scratch space) a CML session, job, application or model can use. This feature helps in better scheduling of CML pods, and provides a safety valve to ensure runaway computations do not consume all available scratch space on the node. By default, each user pod in CML is allowed to use up to 10 GB of scratch space.
- Iceberg support - Iceberg v2 is supported via Spark Runtime Addon, based on the CDE 1.17-h1 Runtime Addon version.
- Jobs UI - The Jobs List page and Job Details page now display the job ID and Created At time.
- Projects UI - When you select a runtime kernel on the project creation page, only the latest standard version of the selected kernel is added to the project.
- Data tab - The Cloudera Data Visualization application that provides the Data tab experience got upgraded to v7.0.2. You can review the changes here.
- Applications - Custom polling endpoints for Applications can now be specified in the UI.
- Register new runtimes via APIv2 - Site administrators can register new ML Runtimes using the APIv2.
- Runtime addon management - Site administrators can now also disable or deprecate specific Runtime Addons.
- Environmental variables - The new project environmental variable
PROJECT_OWNERholds the username of the project owner.
- Python packages (DSE-21313) - Fixed an issue where Python packages installed via
pip installmay not be imported correctly until a new session is started.
- Jobs (DSE-21771) - Fixed an issue where Jobs scheduled with PBJ Runtimes may terminate with a Success state even when there were errors reported in job runs.
- PBJ Runtimes (DSE-19971) - Fixed an issue where comment blocks may not be rendered correctly in sessions or jobs launched with PBJ Runtimes.
- Models (DSE-22527) - Fixed an issue where the Model Monitoring Chart was displaying data from other projects or deployments.
September 27, 2022
Release notes and fixed issues for version 2.0.32-H3.
New Features / Improvements
- Iceberg support - CML Snippets now fully support the Iceberg v1 table format for all Spark, Hive, and Impala data connections. See Create an Iceberg data connection for details.
- Data connections - Data Hubs are now automatically discovered and connections are created for them in newly-created CML Workspaces.
- Backup / Restore Workspace - The default timeout is now 12 hours, and the estimated time to complete a backup (from the cloud provider) is now periodically added to the event logs.