What's New

This section lists major features and updates for the Cloudera Machine Learning service.

February 13, 2020

  • Kerberos Authentication Improvements - Previously, users needed to manually authenticate themselves by entering their CDP FreeIPA credentials into their CML workspaces (on the Account settings > Hadoop Authentication page). This is no longer required.

    Users now automatically receive the Kerberos credentials required for any CML workloads, such as sessions, jobs, and models. Existing workspaces can be upgraded to take advantage of this improvement.

    Additionally, the Hadoop Authentication tab has been removed from the workspace UI.

  • Workspace Monitoring Enabled - When you provision a Machine Learning workspace, Monitoring is enabled by default under Advanced Options.

January 30, 2020

  • Base Engine v11 - The default base engine is now version 11. The only change to the included libraries is the R interpreter, which is updated to version 3.6.2.
  • Python 2 checkbox disabled - Python 2 sessions are now disabled by default on new clusters, but can be re-enabled by admin users.
  • Load Balancer Source Ranges - When provisioning a Machine Learning workspace, in Advanced Settings, you can enter the CIDR range of IP addresses allowed to connect to the workspace. You must whitelist the entire IP pool for your VPC to ensure that terminal sessions can connect to the workspace.
  • Updated Open Workbench button name - The Open Workbench button, which is used to launch a session, has been renamed to New Session.
  • Login error message - Fixed a bug where users might see the following error message upon login to a CML workspace: "Email already associated with an account."
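
As a rough illustration of the Load Balancer Source Ranges setting described above, the following Python sketch checks whether a client address falls inside a set of allowed CIDR ranges. The function name is_allowed, the ranges, and the addresses are hypothetical example values. The sketch only demonstrates CIDR matching and is not how CML itself enforces the setting.

```python
import ipaddress


def is_allowed(client_ip, allowed_cidrs):
    """Return True if client_ip falls inside any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


# Hypothetical example values: a VPC address pool plus one additional range.
ALLOWED_RANGES = ["10.0.0.0/16", "203.0.113.0/24"]

print(is_allowed("10.0.42.7", ALLOWED_RANGES))     # address inside the VPC pool
print(is_allowed("198.51.100.9", ALLOWED_RANGES))  # address outside both ranges
```

A check like this can help confirm that the range you plan to enter covers the whole IP pool of your VPC before you provision the workspace.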

December 19, 2019

  • Monitoring Workspaces with Grafana - CML now leverages Prometheus and Grafana to provide a dashboard that allows you to monitor how CPU, memory, storage, and other resources are being consumed by ML workspaces.
  • Custom Quotas - CML workspace site administrators can now enable custom quotas to set resource usage limits per user.
  • Tags - There are three new AWS resource tracking tags:
    • Cloudera-Resource-Name: <The CRN of the associated CML Workspace for which the resource was provisioned.>
    • Cloudera-Environment-Resource-Name: <The CRN of the Environment in which the resource was created.>
    • Cloudera-Creator-Resource-Name: <The CRN of the CDP user who requested creation of the resource.>

    AWS resource tags are set by default. They can be searched and viewed through the AWS console or CLI. These tags are helpful for tracking resource usage and cost.

  • Granting remote access - The procedure to grant and revoke remote access to ML workspaces is improved. You can easily add new users. You can also see which users currently have access, and then quickly revoke access for specific users.
  • CML instance type cost reduced - The node used to run the CML application was downsized to a more economical AWS instance type. The instance type was changed from m5.12xlarge to m5.4xlarge, which should result in a noticeable reduction in cloud costs.
  • Base Engine v10-cml1.3 - The default engine is now v10. See the package listing for the updated versions of included libraries.
    • Python 3 is version 3.6.9 (was 3.6.8 in Engine v8).
    • Python 2 is version 2.7.17 (was 2.7.11 in Engine v8).
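
The AWS resource tracking tags listed above can be searched programmatically as well as through the console. The following Python sketch builds a tag filter in the list-of-dictionaries shape that the AWS Resource Groups Tagging API accepts for its TagFilters parameter. The helper name build_tag_filters and the CRN placeholder are assumptions for illustration, and the boto3 call is shown only as a comment because it requires boto3 and configured AWS credentials.

```python
def build_tag_filters(tags):
    """Convert a {tag key: tag value} mapping into a TagFilters-style list."""
    return [{"Key": key, "Values": [value]} for key, value in tags.items()]


# Placeholder value: substitute the actual CRN of your CML workspace.
filters = build_tag_filters({"Cloudera-Resource-Name": "<workspace-crn>"})
print(filters)

# With boto3 installed and AWS credentials configured, the filters could be
# passed to the Resource Groups Tagging API (not executed in this sketch):
#   import boto3
#   client = boto3.client("resourcegroupstaggingapi")
#   response = client.get_resources(TagFilters=filters)
```

Filtering on Cloudera-Environment-Resource-Name or Cloudera-Creator-Resource-Name instead would group resources by environment or by the requesting user.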

November 1, 2019

  • Analytical Applications - CML now gives data scientists a way to create long-running standalone ML web applications/dashboards that can easily be shared with other business stakeholders.
  • Quotas - CML workspace site administrators can now enable CPU, GPU, and memory usage quotas per user. Quotas must be enabled separately for each workspace.

    Note: The Quotas feature is in Technical Preview.

  • Diagnostic Bundles - CML now allows site administrators to download diagnostic bundles from the Site Admin panel.
  • UI Improvements and Changes
    • You can now display the Details page by clicking on the Workspace Name in the ML Workspaces page.
    • The ML Workspace Details page now contains an Events tab.

      The Events tab displays high-level events for your workspace. You can click View Logs to display additional log information about the action.

    • The ML Workspace Details page now displays AWS workspace tags.

      The tag information includes both the default tags and any tags you have specified. You can specify workspace tags on the Provision Workspace page, under Advanced Options. The default workspace tags include:
      • Creator
      • Environment
      • Owner
      • WorkspaceName

September 23, 2019

  • Single Sign-on (SSO) Changes - You no longer need to create separate authorization groups to grant users SSO access to workspaces. Authorization groups are now managed per-environment using the MLAdmin and MLUser resource roles. For details, see Configuring User Access to CML.
  • Resource Tags - You can now add resource tags to all the cloud infrastructure, compute, and storage resources used by an ML workspace you provision.
  • Remove Workspaces - New options allow you to retain project storage (in EFS) and to force delete workspaces from CDP.
  • View Workspace Details - Each workspace now has an associated details page where you can access links to the workspace itself, the environment where the workspace was created, and the underlying Kubernetes cluster (links to cloud service provider). A link to this page is available under the Actions menu.
  • Search and Filter workspaces - A new filter allows you to display only the workspaces in a specific environment. You can also search for a workspace by name.

August 22, 2019

This marks the GA release of Cloudera Machine Learning.