Monitor Cloudera Machine Learning (CML) workspace and workload performance using Cloudera Observability

With Cloudera Observability, you can collect metrics from Cloudera Machine Learning (CML) and obtain detailed information about the resources that the CML service uses.

From the ML Summary dashboard, you can monitor multiple ML workspaces at the CML service level and manage individual workspaces. From the ML workspace dashboard, you can monitor, optimize, and troubleshoot ML workloads such as sessions, jobs, models, and applications, categorized by user, team, and project.

How to enable the Machine Learning feature in Cloudera Observability

To enable the Machine Learning feature in the Cloudera Observability user interface, complete the following tasks:
  • Confirm with Cloudera Support that your account is enabled for the feature with the appropriate entitlements.
  • Enable outbound traffic. For more information, see AWS outbound network access destinations.
  • Install the Cloudera Observability components on CML workspaces running Cloudera Machine Learning version 2.0.46 or higher:
    • For existing workspaces:
      • If an existing workspace is upgraded from an older version to Cloudera Machine Learning version 2.0.46 or higher, the Cloudera Observability components are installed automatically during the upgrade.
      • If an existing workspace is already on the latest version of Cloudera Machine Learning, you must suspend and then resume the workspace. The Cloudera Observability components are enabled automatically within 12 hours.

        For information on suspending and resuming the workspace, see Suspend and resume ML workspaces in the Cloudera Machine Learning documentation.

    • For new workspaces created with Cloudera Machine Learning version 2.0.46 or higher, the Cloudera Observability components are installed automatically.

      For information on creating a new workspace, see Provisioning an ML Workspace in the Cloudera Machine Learning documentation.
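
The installation rules above can be summarized as a small decision helper. The following Python sketch is illustrative only and is not part of any Cloudera tool or API: the function names, the returned labels, and the assumption that a version string begins with three dot-separated numbers (optionally followed by a build suffix) are all hypothetical. It also assumes that any existing workspace already at version 2.0.46 or higher falls under the suspend-and-resume case, which this section describes for workspaces on the latest version.

```python
import re

# Minimum Cloudera Machine Learning version named in this section.
MIN_VERSION = (2, 0, 46)


def parse_version(text):
    """Return the leading dotted numeric part of a version string as a tuple.

    Assumption: the string starts with MAJOR.MINOR.PATCH; any trailing
    build suffix (for example "-b12") is ignored.
    """
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def required_action(version_text, is_new_workspace):
    """Map a workspace to the action described in the list above.

    The returned labels are hypothetical names chosen for this sketch.
    """
    supported = parse_version(version_text) >= MIN_VERSION
    if is_new_workspace:
        # New workspaces on a supported version get the components automatically.
        return "none" if supported else "create_with_supported_version"
    # Existing workspaces: upgrade if too old, otherwise suspend and resume.
    return "suspend_and_resume" if supported else "upgrade"


if __name__ == "__main__":
    print(required_action("2.0.45-b1", is_new_workspace=False))
    print(required_action("2.0.46", is_new_workspace=False))
    print(required_action("2.0.47-b12", is_new_workspace=True))
```

Comparing versions as integer tuples avoids the pitfall of plain string comparison, where "2.0.100" would sort before "2.0.46".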