Monitoring ML Workspaces

This topic shows you how to monitor resource usage on your ML workspaces.

Cloudera Machine Learning (CML) uses Prometheus and Grafana to provide a dashboard for monitoring how ML workspaces consume CPU, memory, storage, and other resources. Prometheus is an internal data source that is automatically populated with resource consumption data for each workspace. Grafana is a monitoring dashboard that lets you create visualizations of the resource consumption data collected by Prometheus.
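To illustrate the kind of request a dashboard issues against a Prometheus data source, the following minimal Python sketch builds the URL for a Prometheus instant query (the standard `GET /api/v1/query` endpoint of the Prometheus HTTP API). The base URL and the metric name are assumptions made for illustration only. This topic does not state the address of the workspace's internal Prometheus, whether it is reachable from outside the workspace, or which metric names it exposes.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_instant_query_url(base_url: str, promql: str) -> str:
    """Return the URL for a Prometheus instant query (GET /api/v1/query)."""
    # urlencode percent-escapes the PromQL expression so it is safe in a URL.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# ASSUMPTION: hypothetical address; the real internal Prometheus endpoint
# for a workspace is not documented in this topic.
PROMETHEUS_URL = "http://prometheus.example.internal:9090"

# ASSUMPTION: container_cpu_usage_seconds_total is a common Kubernetes
# (cAdvisor) metric; it may or may not be present in your workspace.
CPU_QUERY = "sum(rate(container_cpu_usage_seconds_total[5m])) by (namespace)"

url = build_instant_query_url(PROMETHEUS_URL, CPU_QUERY)
print(url)
```

In normal use you do not need to build these requests yourself, because Grafana issues them for you when a panel is rendered.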

Each ML workspace has its own Grafana dashboard.

Required Role: MLAdmin

Without the MLAdmin role, you cannot view the Workspace details page.

  1. Log in to the CDP web interface at https://console.us-west-1.cdp.cloudera.com using your corporate credentials or any other credentials that you received from your CDP administrator.
  2. Click ML Workspaces.
  3. For the workspace you want to monitor, click Actions > Open Grafana. Alternatively, click Actions > Overview, and then click Grafana Dashboard.

    CML provides CML Monitoring, a default Grafana dashboard that includes panels for CPU usage, memory usage, running processes, autoscaling, and network I/O on the workspace. You can extend this dashboard or create additional panels for other metrics. For more information, see the Grafana documentation.
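
As a rough illustration of what an additional panel looks like, the following Python sketch builds a panel definition in the shape used by the Grafana dashboard JSON model (a `timeseries` panel with a Prometheus `expr` target). The helper name `make_timeseries_panel`, the panel title, and the metric name are hypothetical and chosen only for illustration. Check the metrics that your workspace's Prometheus actually exposes before using any expression.

```python
import json


def make_timeseries_panel(title: str, expr: str, panel_id: int, unit: str = "short") -> dict:
    """Build a minimal Grafana time series panel backed by one PromQL query."""
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": title,
        # Position and size on the dashboard grid (the grid is 24 columns wide).
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
        "fieldConfig": {"defaults": {"unit": unit}, "overrides": []},
        # One query against the panel's Prometheus data source.
        "targets": [{"refId": "A", "expr": expr}],
    }


# ASSUMPTION: container_memory_working_set_bytes is a common Kubernetes
# (cAdvisor) metric; it may not exist under this name in your workspace.
MEMORY_EXPR = "sum(container_memory_working_set_bytes) by (namespace)"

panel = make_timeseries_panel(
    title="Memory working set by namespace (example)",
    expr=MEMORY_EXPR,
    panel_id=100,
    unit="bytes",
)

# Print the JSON so it can be inspected or pasted into a dashboard's panel list.
print(json.dumps(panel, indent=2))
```

You can create the same panel interactively in the Grafana panel editor. The JSON form is mainly useful when you want to keep panel definitions under version control.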