JobTracker Metrics
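In addition to the base metrics listed in the table below, many aggregate metrics are available.
If an entity type has parents defined, you can formulate all possible
aggregate metrics using the formula base_metric_across_parents, where parents is the plural form of a parent entity type (for example, clusters).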
In addition, metrics for aggregate totals can be formed by adding the prefix total_ to the front of the metric name.
Use the type-ahead feature in the Cloudera Manager chart browser to find the exact aggregate metric name, in case the plural form does not end in "s".
For example, the following metric names may be valid for JobTracker:
- alerts_rate_across_clusters
- total_alerts_rate_across_clusters
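The naming rule above can be sketched in a few lines of Python. This is an illustrative helper only, not part of Cloudera Manager: the function name `aggregate_metric_names` and the `plural_overrides` parameter are hypothetical, and the default pluralization (append "s") is an assumption that will not hold for every parent type, so confirm each generated name with the chart browser type-ahead.

```python
def aggregate_metric_names(base_metric, parents, plural_overrides=None):
    """Return candidate aggregate metric names for one base metric.

    Hypothetical helper: pluralization defaults to appending "s", which is
    only a guess. Pass plural_overrides for parents with irregular plurals.
    """
    plural_overrides = plural_overrides or {}
    names = []
    for parent in parents:
        plural = plural_overrides.get(parent, parent + "s")
        across = f"{base_metric}_across_{plural}"
        names.append(across)             # base_metric_across_parents
        names.append(f"total_{across}")  # same name with the total_ prefix
    return names


# Parents taken from the Parents column of the JobTracker table.
candidates = aggregate_metric_names("alerts_rate", ["cluster", "mapreduce", "rack"])
for name in candidates:
    print(name)
```

The generated names are candidates, not a guaranteed list; only the two names shown in the list above are given by this page as examples for JobTracker.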
Some metrics, such as alerts_rate, apply to nearly every metric context. Others only apply to a
certain service or role.
Metric Name | Description | Unit | Parents | Version |
---|---|---|---|---|
alerts_rate | The number of alerts. | events per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
blacklisted_maps_rate | Blacklisted Maps | tasks per second | cluster, mapreduce, rack | CDH 5 |
blacklisted_reduces_rate | Blacklisted Reduces | tasks per second | cluster, mapreduce, rack | CDH 5 |
cgroup_cpu_system_rate | CPU usage of the role's cgroup | seconds per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
cgroup_cpu_user_rate | User Space CPU usage of the role's cgroup | seconds per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
cgroup_mem_page_cache | Page cache usage of the role's cgroup | bytes | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
cgroup_mem_rss | Resident memory of the role's cgroup | bytes | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
cgroup_mem_swap | Swap usage of the role's cgroup | bytes | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
cgroup_read_bytes_rate | Bytes read from all disks by the role's cgroup | bytes per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
cgroup_read_ios_rate | Number of read I/O operations from all disks by the role's cgroup | ios per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
cgroup_write_bytes_rate | Bytes written to all disks by the role's cgroup | bytes per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
cgroup_write_ios_rate | Number of write I/O operations to all disks by the role's cgroup | ios per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
cpu_system_rate | Total System CPU | seconds per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
cpu_user_rate | Total CPU user time | seconds per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
events_critical_rate | The number of critical events. | events per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
events_important_rate | The number of important events. | events per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
events_informational_rate | The number of informational events. | events per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
fd_max | Maximum number of file descriptors | file descriptors | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
fd_open | Open file descriptors. | file descriptors | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
health_bad_rate | Percentage of Time with Bad Health | seconds per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
health_concerning_rate | Percentage of Time with Concerning Health | seconds per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
health_disabled_rate | Percentage of Time with Disabled Health | seconds per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
health_good_rate | Percentage of Time with Good Health | seconds per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
health_unknown_rate | Percentage of Time with Unknown Health | seconds per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
heartbeats_rate | Heartbeats | operations per second | cluster, mapreduce, rack | CDH 5 |
jobs_completed_rate | Jobs Completed | jobs per second | cluster, mapreduce, rack | CDH 5 |
jobs_failed_rate | Jobs Failed | jobs per second | cluster, mapreduce, rack | CDH 5 |
jobs_killed_rate | Jobs Killed | jobs per second | cluster, mapreduce, rack | CDH 5 |
jobs_preparing | Jobs Preparing | jobs | cluster, mapreduce, rack | CDH 5 |
jobs_running | Jobs Running | jobs | cluster, mapreduce, rack | CDH 5 |
jobs_submitted_rate | Jobs Submitted | jobs per second | cluster, mapreduce, rack | CDH 5 |
jvm_blocked_threads | Blocked threads | threads | cluster, mapreduce, rack | CDH 5 |
jvm_gc_rate | Number of garbage collections | garbage collections per second | cluster, mapreduce, rack | CDH 5 |
jvm_gc_time_ms_rate | Total time spent garbage collecting. | ms per second | cluster, mapreduce, rack | CDH 5 |
jvm_heap_committed_mb | Total amount of committed heap memory. | MB | cluster, mapreduce, rack | CDH 5 |
jvm_heap_used_mb | Total amount of used heap memory. | MB | cluster, mapreduce, rack | CDH 5 |
jvm_max_memory_mb | Maximum allowed memory. | MB | cluster, mapreduce, rack | CDH 5 |
jvm_new_threads | New threads | threads | cluster, mapreduce, rack | CDH 5 |
jvm_non_heap_committed_mb | Total amount of committed non-heap memory. | MB | cluster, mapreduce, rack | CDH 5 |
jvm_non_heap_used_mb | Total amount of used non-heap memory. | MB | cluster, mapreduce, rack | CDH 5 |
jvm_runnable_threads | Runnable threads | threads | cluster, mapreduce, rack | CDH 5 |
jvm_terminated_threads | Terminated threads | threads | cluster, mapreduce, rack | CDH 5 |
jvm_timed_waiting_threads | Timed waiting threads | threads | cluster, mapreduce, rack | CDH 5 |
jvm_total_threads | Total threads | threads | cluster, mapreduce, rack | CDH 5 |
jvm_waiting_threads | Waiting threads | threads | cluster, mapreduce, rack | CDH 5 |
log_error_rate | Logged Errors | messages per second | cluster, mapreduce, rack | CDH 5 |
log_fatal_rate | Logged Fatals | messages per second | cluster, mapreduce, rack | CDH 5 |
log_info_rate | Logged Infos | messages per second | cluster, mapreduce, rack | CDH 5 |
log_warn_rate | Logged Warnings | messages per second | cluster, mapreduce, rack | CDH 5 |
map_slots | Map Slots | slots | cluster, mapreduce, rack | CDH 5 |
maps_completed_rate | Maps Completed | tasks per second | cluster, mapreduce, rack | CDH 5 |
maps_failed_rate | Maps Failed | tasks per second | cluster, mapreduce, rack | CDH 5 |
maps_killed_rate | Maps Killed | tasks per second | cluster, mapreduce, rack | CDH 5 |
maps_launched_rate | Maps Launched | tasks per second | cluster, mapreduce, rack | CDH 5 |
maps_running | Maps Running | tasks | cluster, mapreduce, rack | CDH 5 |
mem_rss | Resident memory used | bytes | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
mem_swap | Amount of swap memory used by this role's process. | bytes | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
mem_virtual | Virtual memory used | bytes | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
occupied_map_slots | Occupied Map Slots | slots | cluster, mapreduce, rack | CDH 5 |
occupied_reduce_slots | Occupied Reduce Slots | slots | cluster, mapreduce, rack | CDH 5 |
oom_exits_rate | The number of times the role's backing process was killed due to an OutOfMemory error. This counter is only incremented if the Cloudera Manager "Kill When Out of Memory" option is enabled. | exits per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
read_bytes_rate | The number of bytes read from the device | bytes per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
reduce_slots | Reduce Slots | slots | cluster, mapreduce, rack | CDH 5 |
reduces_completed_rate | Reduces Completed | tasks per second | cluster, mapreduce, rack | CDH 5 |
reduces_failed_rate | Reduces Failed | tasks per second | cluster, mapreduce, rack | CDH 5 |
reduces_killed_rate | Reduces Killed | tasks per second | cluster, mapreduce, rack | CDH 5 |
reduces_launched_rate | Reduces Launched | tasks per second | cluster, mapreduce, rack | CDH 5 |
reduces_running | Reduces Running | tasks | cluster, mapreduce, rack | CDH 5 |
reserved_map_slots | Reserved Map Slots | slots | cluster, mapreduce, rack | CDH 5 |
reserved_reduce_slots | Reserved Reduce Slots | slots | cluster, mapreduce, rack | CDH 5 |
trackers | TaskTrackers | TaskTrackers | cluster, mapreduce, rack | CDH 5 |
trackers_blacklisted | TaskTrackers Blacklisted | TaskTrackers | cluster, mapreduce, rack | CDH 5 |
trackers_decommissioned | TaskTrackers Decommissioned | TaskTrackers | cluster, mapreduce, rack | CDH 5 |
unexpected_exits_rate | The number of times the role's backing process exited unexpectedly. | exits per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
uptime | For a host, the amount of time since the host was booted. For a role, the uptime of the backing process. | seconds | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
waiting_maps | Waiting Maps | tasks | cluster, mapreduce, rack | CDH 5 |
waiting_reduces | Waiting Reduces | tasks | cluster, mapreduce, rack | CDH 5 |
web_metrics_collection_duration | Web Server Responsiveness | ms | cluster, mapreduce, rack | CDH 5 |
write_bytes_rate | The number of bytes written to the device | bytes per second | cluster, mapreduce, rack | CDH 5, CDH 6, Cloudera Runtime 7 |
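Several units in the table are easier to read after a small conversion. The sketch below assumes you already have numeric samples for the metrics; the helper names `rate_to_percent` and `slot_utilization` are hypothetical, and slot utilization is a derived ratio suggested here for illustration, not a metric defined by Cloudera Manager.

```python
def rate_to_percent(seconds_per_second):
    """Convert a "seconds per second" rate (for example health_bad_rate)
    to a percentage of wall-clock time."""
    return seconds_per_second * 100.0


def slot_utilization(occupied_slots, total_slots):
    """Fraction of slots in use, for example occupied_map_slots / map_slots.

    Returns None when no slots are configured, to avoid dividing by zero.
    """
    if total_slots <= 0:
        return None
    return occupied_slots / total_slots


# Sample values are made up for illustration.
print(rate_to_percent(0.25))     # 0.25 seconds of bad health per second
print(slot_utilization(30, 40))  # 30 of 40 map slots occupied
```

For CPU metrics such as cpu_user_rate, the same "seconds per second" unit can exceed 1.0 on multi-core hosts, so the converted percentage is relative to a single core.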