MapReduce Metrics
In addition to the base metrics listed in the table below, many aggregate metrics are available.
If an entity type has parents defined, you can formulate all possible
aggregate metrics using the formula `base_metric_across_parents`.
In addition, metrics for aggregate totals can be formed by adding the prefix `total_` to the front of the metric name.
Use the type-ahead feature in the Cloudera Manager chart browser to find the exact aggregate metric name, in case the plural form does not end in "s".
For example, the following metric names may be valid for MapReduce:
- `alerts_rate_across_clusters`
- `total_alerts_rate_across_clusters`
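The naming formula above can be sketched in a few lines of Python. This is only an illustration of the string pattern, not part of Cloudera Manager: the function name `aggregate_metric_names` and its parameters are hypothetical, and the plural form of the parent is passed in by the caller because it cannot always be derived by appending "s".

```python
from typing import Tuple


def aggregate_metric_names(base_metric: str, parents_plural: str) -> Tuple[str, str]:
    """Build candidate aggregate metric names from a base metric.

    Hypothetical helper: it only applies the documented pattern
    base_metric_across_parents and the total_ prefix. It does not
    check that the resulting names exist in Cloudera Manager.
    """
    across = f"{base_metric}_across_{parents_plural}"
    total = f"total_{across}"
    return across, total


# Reproduces the two example names listed for MapReduce.
across_name, total_name = aggregate_metric_names("alerts_rate", "clusters")
print(across_name)
print(total_name)
```

Names produced this way are candidates only; the type-ahead feature in the chart browser remains the way to confirm the exact name.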
Some metrics, such as `alerts_rate`, apply to nearly every metric context. Others apply only to a
certain service or role.
Metric Name | Description | Unit | Parents | Version |
---|---|---|---|---|
alerts_rate | The number of alerts. | events per second | cluster | CDH 5, CDH 6, Cloudera Runtime 7 |
events_critical_rate | The number of critical events. | events per second | cluster | CDH 5, CDH 6, Cloudera Runtime 7 |
events_important_rate | The number of important events. | events per second | cluster | CDH 5, CDH 6, Cloudera Runtime 7 |
events_informational_rate | The number of informational events. | events per second | cluster | CDH 5, CDH 6, Cloudera Runtime 7 |
health_bad_rate | Percentage of Time with Bad Health | seconds per second | cluster | CDH 5, CDH 6, Cloudera Runtime 7 |
health_concerning_rate | Percentage of Time with Concerning Health | seconds per second | cluster | CDH 5, CDH 6, Cloudera Runtime 7 |
health_disabled_rate | Percentage of Time with Disabled Health | seconds per second | cluster | CDH 5, CDH 6, Cloudera Runtime 7 |
health_good_rate | Percentage of Time with Good Health | seconds per second | cluster | CDH 5, CDH 6, Cloudera Runtime 7 |
health_unknown_rate | Percentage of Time with Unknown Health | seconds per second | cluster | CDH 5, CDH 6, Cloudera Runtime 7 |
jobs_completed_rate | Jobs Completed | jobs per second | cluster | CDH 5 |
jobs_failed_rate | Jobs Failed | jobs per second | cluster | CDH 5 |
jobs_killed_rate | Jobs Killed | jobs per second | cluster | CDH 5 |
jobs_preparing | Jobs Preparing | jobs | cluster | CDH 5 |
jobs_running | Jobs Running | jobs | cluster | CDH 5 |
jobs_submitted_rate | Jobs Submitted | jobs per second | cluster | CDH 5 |
map_slots | Map Slots | slots | cluster | CDH 5 |
maps_failed_rate | Maps Failed | tasks per second | cluster | CDH 5 |
maps_running | Maps Running | tasks | cluster | CDH 5 |
mr_data_local_maps_rate | Total number of data-local map task attempts | tasks per second | cluster | CDH 5 |
mr_other_local_maps_rate | Total number of other-local map task attempts | tasks per second | cluster | CDH 5 |
mr_rack_local_maps_rate | Total number of rack-local map task attempts | tasks per second | cluster | CDH 5 |
reduce_slots | Reduce Slots | slots | cluster | CDH 5 |
reduces_failed_rate | Reduces Failed | tasks per second | cluster | CDH 5 |
reduces_running | Reduces Running | tasks | cluster | CDH 5 |
trackers_blacklisted | TaskTrackers Blacklisted | TaskTrackers | cluster | CDH 5 |
waiting_maps | Waiting Maps | tasks | cluster | CDH 5 |
waiting_reduces | Waiting Reduces | tasks | cluster | CDH 5 |