This is the documentation for Cloudera Manager 5.1.x. Documentation for other versions is available at Cloudera Documentation.

Cloudera Manager Metrics

This guide provides information on metrics supported by Cloudera Manager.

A metric is a property that can be measured to quantify the state of an entity or activity. Metrics include properties such as the number of open file descriptors or the CPU utilization percentage across your cluster.

Cloudera Manager monitors a number of performance metrics for services and role instances running on your clusters. These metrics are monitored against configurable thresholds and can be used to indicate whether a host is functioning as expected. You can view these metrics in the Cloudera Manager Admin Console, which displays metrics about your jobs (such as the number of currently running jobs and their CPU and memory usage), Hadoop services (such as the average HDFS I/O latency and the number of concurrent jobs), your clusters (such as the average CPU load across all your hosts), and so on.
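As a rough illustration of what monitoring a metric against configurable thresholds can look like, the following Python sketch classifies a single sampled value. The class and function names, the threshold values, and the state labels are all assumptions made for this example; they do not describe how Cloudera Manager itself evaluates health.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical warning and critical limits for one metric."""
    warning: float
    critical: float


def classify(value: float, limits: Thresholds) -> str:
    """Return an illustrative health state for one metric sample."""
    if value >= limits.critical:
        return "bad"
    if value >= limits.warning:
        return "concerning"
    return "good"


# Example: a CPU utilization percentage checked against made-up limits.
cpu_limits = Thresholds(warning=80.0, critical=95.0)
print(classify(42.0, cpu_limits))
print(classify(97.5, cpu_limits))
```

The sketch assumes that higher values are worse; a metric such as free disk space would need the comparisons reversed.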

Cloudera Manager also pre-aggregates metrics to allow you to access them through charts. Metrics are aggregated from their generating entity to the larger entities that they are part of. For example, metrics generated by disks, network interfaces, and filesystems are aggregated to their respective hosts and clusters. See Metric Aggregation for more details.
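The roll-up from generating entities to the larger entities that contain them can be sketched in Python as follows. This is only a minimal illustration: the sample data, the entity names, and the choice of a simple sum are assumptions, and the aggregations Cloudera Manager actually computes are described in Metric Aggregation.

```python
from collections import defaultdict

# Hypothetical samples: (cluster, host, disk, bytes read per second).
samples = [
    ("cluster1", "host-a", "sda", 100.0),
    ("cluster1", "host-a", "sdb", 300.0),
    ("cluster1", "host-b", "sda", 200.0),
]


def aggregate(disk_samples):
    """Sum disk-level values up to each host, then host totals up to each cluster."""
    host_totals = defaultdict(float)
    for cluster, host, _disk, value in disk_samples:
        host_totals[(cluster, host)] += value

    cluster_totals = defaultdict(float)
    for (cluster, _host), total in host_totals.items():
        cluster_totals[cluster] += total

    return dict(host_totals), dict(cluster_totals)


host_totals, cluster_totals = aggregate(samples)
print(host_totals)
print(cluster_totals)
```

Keeping the host-level totals as an intermediate result mirrors the idea that each level of the hierarchy exposes its own aggregate, so a chart can be drawn per host or per cluster without rereading the disk-level data.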

In the Cloudera Manager Admin Console, one way to discover which metrics Cloudera Manager collects is to navigate to Charts > Chart Builder and click the List of Metrics link to the right of the Build Chart button. Another way is to use the tsquery language to retrieve all metrics for the type of entity you are interested in. tsquery is the language used to specify statements for retrieving time series data, that is, a stream of metric data points in which each point contains a timestamp and the value of the metric at that timestamp.
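To make the second approach more concrete, the Python sketch below composes a tsquery statement that asks for all metrics of one role type and URL-encodes it for a time-series request. Several details are assumptions rather than statements from this guide: the statement form `SELECT ... WHERE roleType = ...`, the `DATANODE` role type, the `/api/v6/timeseries` path, and the host name are illustrative and should be checked against the tsquery and API references for your version. The `DataPoint` type only mirrors the description of a data point above. No request is sent.

```python
from typing import NamedTuple
from urllib.parse import urlencode


class DataPoint(NamedTuple):
    """One time series point: a timestamp and the metric value at that time."""
    timestamp: str
    value: float


def build_tsquery(metric: str = "*", role_type: str = "DATANODE") -> str:
    """Compose a tsquery statement (assumed form) for one role type."""
    return f"SELECT {metric} WHERE roleType = {role_type}"


def timeseries_url(base_url: str, query: str, api_version: str = "v6") -> str:
    """Build a request URL; the path and API version are assumptions."""
    return f"{base_url}/api/{api_version}/timeseries?{urlencode({'query': query})}"


query = build_tsquery()  # all metrics for the assumed DATANODE role type
url = timeseries_url("http://cm-host.example.com:7180", query)
print(query)
print(url)
```

Passing a specific metric name instead of `*` narrows the statement to a single metric.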

You can also chart these metrics over a time range. See Viewing Charts for Cluster, Service, Role, and Host Instances for more details. Each metric listed in the table of contents to the left includes a short description, its units, and the version of CDH to which it applies.

  Note: The sampling rate is one minute for all the metrics in this guide.
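Given the one-minute sampling rate, the number of raw samples behind a chart can be estimated from the length of its time range. The Python sketch below does that arithmetic; the function name is hypothetical, and the result is only an approximation because it ignores boundary samples and any pre-aggregation applied to the data.

```python
from datetime import datetime, timedelta

SAMPLING_INTERVAL = timedelta(minutes=1)  # sampling rate stated in this guide


def approximate_sample_count(start: datetime, end: datetime) -> int:
    """Estimate how many one-minute samples fall in the range from start to end."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    return (end - start) // SAMPLING_INTERVAL


start = datetime(2015, 9, 3, 12, 0)
end = start + timedelta(hours=6)
print(approximate_sample_count(start, end))
```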
Page generated September 3, 2015.