This topic shows you how to monitor resource usage on your ML
workspaces.
Cloudera Machine Learning (CML) uses Prometheus and Grafana to
provide a dashboard for monitoring how CPU, memory, storage,
and other resources are consumed by ML workspaces. Prometheus is an
internal data source that is auto-populated with resource consumption data
for each workspace. Grafana is a monitoring dashboard that allows you to
create visualizations for resource consumption data from Prometheus.
Each ML workspace has its own Grafana dashboard.
Required Role: MLAdmin
Without the MLAdmin role, you will not be able to view the Workspace
details page.
1. Log in to the CDP web interface.
2. Click ML Workspaces.
3. For the workspace you want to monitor, click Actions > Open Grafana.
   Alternatively, in Actions > Overview, click Grafana Dashboard.
CML provides a default Grafana dashboard, CML Monitoring, that
includes panels for CPU usage, memory usage, running processes,
autoscaling, and network I/O on the workspace. You can extend this dashboard
or create additional panels for other metrics. For more information, see the Grafana
documentation.
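
As an illustration of what an additional panel might look like, the following Python sketch builds a panel definition in the general shape of Grafana's dashboard JSON model. It is not taken from the CML Monitoring dashboard. The helper name `make_timeseries_panel`, the data source name `Prometheus`, the panel ID, and the metric `container_memory_working_set_bytes` are all assumptions made for the example. Check which metrics and data source names your workspace's Grafana instance actually exposes, and which panel types your Grafana version supports, before relying on any of them.

```python
import json


def make_timeseries_panel(title, expr, panel_id, unit="bytes"):
    """Return a dict shaped like a Grafana time series panel.

    Hypothetical helper for illustration only; it is not part of CML or Grafana.
    """
    return {
        "id": panel_id,
        "type": "timeseries",  # assumes a Grafana version with this panel type
        "title": title,
        "datasource": "Prometheus",  # assumed data source name
        "targets": [{"expr": expr, "refId": "A"}],  # PromQL query for the panel
        "fieldConfig": {"defaults": {"unit": unit}, "overrides": []},
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},  # size and position on the grid
    }


# Assumed metric name; confirm that it exists in your workspace's Prometheus.
panel = make_timeseries_panel(
    title="Memory working set by namespace",
    expr="sum(container_memory_working_set_bytes) by (namespace)",
    panel_id=100,
)

# Print the definition as JSON for review.
print(json.dumps(panel, indent=2))
```

Generating the definition in code rather than editing it by hand makes it easier to keep several similar panels consistent, but creating the panel interactively in the Grafana UI achieves the same result.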