Monitoring Data Engineering service resources with Grafana dashboards

Grafana is visualization and analytics software that lets you build dashboards for monitoring metrics data. In Cloudera Data Engineering (CDE), you can access pre-built Grafana dashboards to monitor your jobs and virtual clusters.

CDP metrics are collected by Prometheus and stored centrally in the Prometheus database. Grafana reads these metrics to render its visualizations. Your workload databases are not involved in any way.
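Because the metrics are held in Prometheus, the data that Grafana charts can in principle also be read through the standard Prometheus HTTP API. The following is a minimal sketch of building a range query against the `/api/v1/query_range` endpoint, not a documented CDE procedure. The host name is a hypothetical placeholder, and whether the Prometheus endpoint is reachable at all depends on your deployment.

```python
from urllib.parse import urlencode


def build_range_query_url(base_url, promql, start, end, step_seconds):
    """Return a URL for the Prometheus HTTP API range-query endpoint."""
    params = {
        "query": promql,        # PromQL expression to evaluate
        "start": start,         # range start, Unix timestamp in seconds
        "end": end,             # range end, Unix timestamp in seconds
        "step": step_seconds,   # resolution between samples, in seconds
    }
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"


# Hypothetical host; "up" is the built-in Prometheus metric for target health.
url = build_range_query_url(
    "https://prometheus.example.invalid", "up", 1700000000, 1700003600, 60
)
print(url)
```

The function only assembles the URL. Sending the request, and any authentication it needs, is left out because those details are specific to the environment.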

The following pre-built dashboards are available immediately for viewing runtime metrics in CDE:

Kubernetes Dashboard

This dashboard includes generalized visualizations of CDE job run statuses. It displays the following information:
  • Number of succeeded, failed, and killed jobs for the given period
  • Total number of jobs in the Starting phase
  • Total number of jobs in the Running phase
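To make these counters concrete, the sketch below shows one way such figures could be derived from a list of job runs. The record layout, the status strings, and the `summarize_runs` helper are all hypothetical and chosen for illustration. They do not reflect the CDE API or how the dashboard itself computes its panels.

```python
from collections import Counter
from datetime import datetime

# Hypothetical job-run records; "ended" is None while a run is still in progress.
runs = [
    {"status": "succeeded", "ended": datetime(2024, 1, 1, 10, 0)},
    {"status": "failed", "ended": datetime(2024, 1, 1, 11, 0)},
    {"status": "killed", "ended": datetime(2024, 1, 1, 12, 0)},
    {"status": "succeeded", "ended": datetime(2023, 12, 31, 23, 0)},  # outside the period
    {"status": "starting", "ended": None},
    {"status": "running", "ended": None},
    {"status": "running", "ended": None},
]


def summarize_runs(runs, period_start, period_end):
    """Count finished runs inside the period and in-progress runs overall."""
    finished = Counter(
        r["status"]
        for r in runs
        if r["ended"] is not None and period_start <= r["ended"] < period_end
    )
    in_progress = Counter(r["status"] for r in runs if r["ended"] is None)
    return {
        "succeeded": finished["succeeded"],
        "failed": finished["failed"],
        "killed": finished["killed"],
        "starting": in_progress["starting"],
        "running": in_progress["running"],
    }


summary = summarize_runs(runs, datetime(2024, 1, 1), datetime(2024, 1, 2))
print(summary)
```

The finished-job counts depend on the selected period, while the Starting and Running totals describe the current state regardless of the period.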

Virtual Cluster Metrics Dashboard

This dashboard includes visualizations of service requests, pod counts, and job statuses for the selected virtual cluster. The available metrics are:
  • Time series of CPU requests of running pods (includes virtual cluster service overhead)
  • Time series of memory requests of running pods (includes virtual cluster service overhead)
  • Response time of Livy requests
  • Time series for the number of pods in running and pending states
  • Total number of running pods and pending pods
  • Time series of starting and running jobs, and the total number of successful jobs
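As an illustration of what might sit behind the pod and resource panels, the sketch below assembles PromQL expressions of the kind commonly used with the open-source kube-state-metrics exporter. It rests on several assumptions: that the virtual cluster maps to a Kubernetes namespace, that metrics named `kube_pod_status_phase` and `kube_pod_container_resource_requests` are available, and that the namespace name shown is a placeholder. The queries used by the pre-built dashboard may differ.

```python
def pod_queries(namespace):
    """Build example PromQL expressions for one namespace.

    Metric names follow kube-state-metrics conventions and are an assumption,
    not a statement about what the CDE dashboards actually query.
    """
    sel = f'namespace="{namespace}"'
    return {
        # Number of pods currently in each phase
        "running_pods": f'sum(kube_pod_status_phase{{{sel}, phase="Running"}})',
        "pending_pods": f'sum(kube_pod_status_phase{{{sel}, phase="Pending"}})',
        # Simplification: sums requests over all pods in the namespace,
        # without filtering to pods in the Running phase
        "cpu_requests": f'sum(kube_pod_container_resource_requests{{{sel}, resource="cpu"}})',
        "memory_requests": f'sum(kube_pod_container_resource_requests{{{sel}, resource="memory"}})',
    }


# "dex-app-example" is a hypothetical namespace name.
queries = pod_queries("dex-app-example")
for name, expr in queries.items():
    print(f"{name}: {expr}")
```

Evaluating an expression as an instant query gives a current total, while evaluating it over a time range gives the corresponding time series.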