Agent Metrics
In addition to the base metrics listed in the table below, many aggregate metrics are available. If an entity type has parents defined, you can form every possible aggregate metric name using the pattern base_metric_across_parents, where parents is the plural form of the parent type.
You can also form metrics for aggregate totals by adding the prefix total_ to the front of the aggregate metric name.
If the plural form of a parent does not end in "s", use the type-ahead feature in the Cloudera Manager chart browser to find the exact aggregate metric name.
For example, the following metric names may be valid for Agent:
- alerts_rate_across_clusters
- total_alerts_rate_across_clusters
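The naming pattern above can be sketched in a few lines of Python. This is only an illustration of how candidate names are put together, not a Cloudera Manager API: the function name `aggregate_metric_names` and the list `AGENT_PARENT_PLURALS` are hypothetical, and the plural forms are assumed from the parents listed for Agent. The resulting names are candidates that may be valid; confirm each one with the type-ahead feature in the chart browser.

```python
def aggregate_metric_names(base_metric, parent_plurals):
    """Build candidate aggregate names: <base>_across_<parents> and total_<base>_across_<parents>."""
    names = []
    for plural in parent_plurals:
        across = f"{base_metric}_across_{plural}"
        names.append(across)
        names.append(f"total_{across}")
    return names


# Plural forms are passed in explicitly because not every plural ends in "s".
# These plurals are assumptions for illustration only.
AGENT_PARENT_PLURALS = ["clusters", "flumes", "racks"]

candidates = aggregate_metric_names("alerts_rate", AGENT_PARENT_PLURALS)
for name in candidates:
    print(name)
```

Passing the plural forms in explicitly, rather than appending "s" in code, keeps the sketch usable for parents whose plural is irregular.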
Some metrics, such as alerts_rate, apply to nearly every metric context. Others apply only to a certain service or role.
For more information about metrics, see Cloudera Manager Metrics and Metric Aggregation.
Metric Name | Description | Unit | Parents | CDH Version |
---|---|---|---|---|
alerts_rate | The number of alerts. | events per second | cluster, flume, rack | CDH 5 |
cgroup_cpu_system_rate | CPU usage of the role's cgroup | seconds per second | cluster, flume, rack | CDH 5 |
cgroup_cpu_user_rate | User Space CPU usage of the role's cgroup | seconds per second | cluster, flume, rack | CDH 5 |
cgroup_mem_page_cache | Page cache usage of the role's cgroup | bytes | cluster, flume, rack | CDH 5 |
cgroup_mem_rss | Resident memory of the role's cgroup | bytes | cluster, flume, rack | CDH 5 |
cgroup_mem_swap | Swap usage of the role's cgroup | bytes | cluster, flume, rack | CDH 5 |
cgroup_read_bytes_rate | Bytes read from all disks by the role's cgroup | bytes per second | cluster, flume, rack | CDH 5 |
cgroup_read_ios_rate | Number of read I/O operations from all disks by the role's cgroup | ios per second | cluster, flume, rack | CDH 5 |
cgroup_write_bytes_rate | Bytes written to all disks by the role's cgroup | bytes per second | cluster, flume, rack | CDH 5 |
cgroup_write_ios_rate | Number of write I/O operations to all disks by the role's cgroup | ios per second | cluster, flume, rack | CDH 5 |
cpu_system_rate | Total System CPU | seconds per second | cluster, flume, rack | CDH 5 |
cpu_user_rate | Total CPU user time | seconds per second | cluster, flume, rack | CDH 5 |
events_critical_rate | The number of critical events. | events per second | cluster, flume, rack | CDH 5 |
events_important_rate | The number of important events. | events per second | cluster, flume, rack | CDH 5 |
events_informational_rate | The number of informational events. | events per second | cluster, flume, rack | CDH 5 |
fd_max | Maximum number of file descriptors | file descriptors | cluster, flume, rack | CDH 5 |
fd_open | Open file descriptors. | file descriptors | cluster, flume, rack | CDH 5 |
health_bad_rate | Percentage of Time with Bad Health | seconds per second | cluster, flume, rack | CDH 5 |
health_concerning_rate | Percentage of Time with Concerning Health | seconds per second | cluster, flume, rack | CDH 5 |
health_disabled_rate | Percentage of Time with Disabled Health | seconds per second | cluster, flume, rack | CDH 5 |
health_good_rate | Percentage of Time with Good Health | seconds per second | cluster, flume, rack | CDH 5 |
health_unknown_rate | Percentage of Time with Unknown Health | seconds per second | cluster, flume, rack | CDH 5 |
mem_rss | Resident memory used | bytes | cluster, flume, rack | CDH 5 |
mem_swap | Amount of swap memory used by this role's process. | bytes | cluster, flume, rack | CDH 5 |
mem_virtual | Virtual memory used | bytes | cluster, flume, rack | CDH 5 |
oom_exits_rate | The number of times the role's backing process was killed due to an OutOfMemory error. This counter is only incremented if the Cloudera Manager "Kill When Out of Memory" option is enabled. | exits per second | cluster, flume, rack | CDH 5 |
read_bytes_rate | The number of bytes read from the device | bytes per second | cluster, flume, rack | CDH 5 |
unexpected_exits_rate | The number of times the role's backing process exited unexpectedly. | exits per second | cluster, flume, rack | CDH 5 |
uptime | For a host, the amount of time since the host was booted. For a role, the uptime of the backing process. | seconds | cluster, flume, rack | CDH 5 |
web_metrics_collection_duration | Web Server Responsiveness | ms | cluster, flume, rack | CDH 5 |
write_bytes_rate | The number of bytes written to the device | bytes per second | cluster, flume, rack | CDH 5 |