Host Metrics
Reference information for Host Metrics
In addition to these base metrics, many aggregate metrics are available.
If an entity type has parents defined, you can formulate all possible
aggregate metrics using the formula base_metric_across_parents.
In addition, metrics for aggregate totals can be formed by adding the prefix total_ to the front of the metric name.
Use the type-ahead feature in the Cloudera Manager chart browser to find the exact aggregate metric name, in case the plural form does not end in "s".
For example, the following metric names may be valid for Host:
- agent_cert_expiry_across_clusters
- total_agent_cert_expiry_across_clusters
Some metrics, such as alerts_rate, apply to nearly every metric context. Others only apply to a
certain service or role.
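The naming rule for aggregate metrics can be sketched in a few lines of Python. This is only an illustration of the formula described above, not part of Cloudera Manager: the function name aggregate_metric_names and the plural_overrides parameter are hypothetical, and the sketch assumes that a parent's plural form is made by appending "s" unless an override is supplied. The names it produces are candidates; confirm them with the type-ahead feature in the chart browser.

```python
def aggregate_metric_names(base_metric, parents, plural_overrides=None):
    """Build candidate aggregate metric names for one base metric.

    base_metric: name of the base metric, e.g. "agent_cert_expiry".
    parents: parent entity types listed for the metric, e.g. ["cluster", "rack"].
    plural_overrides: hypothetical mapping for parents whose plural form
        does not simply end in "s".
    """
    plural_overrides = plural_overrides or {}
    names = []
    for parent in parents:
        # Assumption: default plural is the parent name plus "s".
        plural = plural_overrides.get(parent, parent + "s")
        across = f"{base_metric}_across_{plural}"
        names.append(across)
        # Aggregate totals add the total_ prefix to the front of the name.
        names.append(f"total_{across}")
    return names


if __name__ == "__main__":
    for name in aggregate_metric_names("agent_cert_expiry", ["cluster", "rack"]):
        print(name)
```

For agent_cert_expiry with parents cluster and rack, the list includes agent_cert_expiry_across_clusters and total_agent_cert_expiry_across_clusters, matching the examples given above.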
agent_cert_expiry
- Description
- Remaining days until the expiry of the certificate of Cloudera Manager Agent
- Unit
- seconds
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
agent_cpu_system_rate
- Description
- Cloudera Manager Agent System CPU Time
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
agent_cpu_user_rate
- Description
- Cloudera Manager Agent User CPU Time
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
agent_fd_max
- Description
- Cloudera Manager Agent File Descriptor Max
- Unit
- file descriptors
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
agent_fd_open
- Description
- Cloudera Manager Agent File Descriptors
- Unit
- file descriptors
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
agent_hb_latency_millis
- Description
- Heartbeat latency observed by Cloudera Manager Agent communicating to Cloudera Manager Server
- Unit
- ms
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
agent_physical_memory_used
- Description
- Agent physical memory used
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
agent_virtual_memory_used
- Description
- Agent virtual memory used
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
alerts_rate
- Description
- The number of alerts.
- Unit
- events per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
available_entropy
- Description
- The entropy that is available on the host
- Unit
- entropy
- Parents
- CDH Version
- [CM -1.0.0..CM -1.0.0]
clock_offset
- Description
- Clock offset as reported by the host's NTP service from 'ntpdc -np' or 'chronyc sources'. If NTP is not in use, this metric is not collected.
- Unit
- ms
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
cores
- Description
- Logical CPU Cores
- Unit
- cores
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
cpu_guest_nice_rate
- Description
- Time spent running a niced guest (virtual CPU for guest operating systems under the control of the Linux kernel). Requires Linux 2.6.33. CPU guest nice time is included in CPU nice time.
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
cpu_guest_rate
- Description
- Time spent running a virtual CPU for guest operating systems under the control of the Linux kernel. Requires Linux 2.6.24. CPU guest time is included in CPU user time.
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
cpu_idle_rate
- Description
- Total CPU idle time
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
cpu_iowait_rate
- Description
- Total CPU iowait time
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
cpu_irq_rate
- Description
- Total CPU IRQ time
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
cpu_nice_rate
- Description
- Total CPU nice time
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
cpu_percent
- Description
- Total CPU usage of the host (averaged since last report)
- Unit
- percent
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
cpu_soft_irq_rate
- Description
- Total CPU soft IRQ time
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
cpu_steal_rate
- Description
- Stolen time, which is the time spent in other operating systems when running in a virtualized environment. Requires Linux 2.6.11.
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
cpu_system_rate
- Description
- Total System CPU
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
cpu_user_rate
- Description
- Total CPU user time
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
dns_name_resolution_duration
- Description
- The duration of a call to InetAddress.getLocalHost() in a helper Java process run by the Cloudera Manager Agent.
- Unit
- ms
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
events_critical_rate
- Description
- The number of critical events.
- Unit
- events per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
events_important_rate
- Description
- The number of important events.
- Unit
- events per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
events_informational_rate
- Description
- The number of informational events.
- Unit
- events per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
fd_max
- Description
- Maximum number of file descriptors
- Unit
- file descriptors
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
fd_open
- Description
- Open file descriptors.
- Unit
- file descriptors
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
health_bad_rate
- Description
- Percentage of Time with Bad Health
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
health_concerning_rate
- Description
- Percentage of Time with Concerning Health
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
health_disabled_rate
- Description
- Percentage of Time with Disabled Health
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
health_good_rate
- Description
- Percentage of Time with Good Health
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
health_unknown_rate
- Description
- Percentage of Time with Unknown Health
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
hmon_message_bytes_sent_rate
- Description
- Number of bytes sent in messages from the Cloudera Manager Agent to the Cloudera Host Monitor
- Unit
- bytes per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
hmon_message_transmit_duration
- Description
- The wall-clock time it took to transmit the most recent Cloudera Manager Agent message to the Cloudera Host Monitor
- Unit
- ms
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
hmon_message_transmit_failed_rate
- Description
- Number of failures to send messages from the Cloudera Manager Agent to the Cloudera Host Monitor
- Unit
- messages per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
hmon_message_transmit_succeeded_rate
- Description
- Number of messages successfully sent from the Cloudera Manager Agent to the Cloudera Host Monitor
- Unit
- messages per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
load_1
- Description
- Load Average over 1 minute
- Unit
- load average
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
load_15
- Description
- Load Average over 15 minutes
- Unit
- load average
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
load_5
- Description
- Load Average over 5 minutes
- Unit
- load average
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
overcommit_ratio
- Description
- Percentage of physical RAM that the committed address space cannot exceed. Retrieved from /proc/sys/vm/overcommit_ratio.
- Unit
- percent
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
physical_memory_buffers
- Description
- The amount of physical memory devoted to temporary storage for raw disk blocks. This is the 'Buffers' field from /proc/meminfo.
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
physical_memory_cached
- Description
- The amount of physical memory used for files read from the disk. This is commonly referred to as the pagecache. This is the 'Cached' field from /proc/meminfo.
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
physical_memory_commit_limit
- Description
- Total amount of memory currently available to be allocated on the system. This is the 'CommitLimit' field from /proc/meminfo.
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
physical_memory_dirty
- Description
- The total amount of memory waiting to be written back to the disk. This is the 'Dirty' field from /proc/meminfo.
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
physical_memory_dirty_ratio
- Description
- Maximum percentage of physical memory that can be filled with dirty pages before processes are forced to write dirty buffers themselves during their time slice instead of being allowed to perform more writes. This is read from /proc/sys/vm/dirty_ratio.
- Unit
- percent
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
physical_memory_mapped
- Description
- The total amount of memory that has been used to map devices, files, or libraries using the mmap system call. This is the 'Mapped' field from /proc/meminfo.
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
physical_memory_memfree
- Description
- The amount of physical memory left unused by the system. This is the 'MemFree' field from /proc/meminfo.
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
physical_memory_total
- Description
- The total physical memory available.
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
physical_memory_used
- Description
- The total amount of memory being used, excluding buffers and cache.
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
physical_memory_writeback
- Description
- The total amount of memory actively being written back to the disk. This is the 'Writeback' field from /proc/meminfo.
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
smon_message_bytes_sent_rate
- Description
- Number of bytes sent in messages from the Cloudera Manager Agent to the Cloudera Service Monitor
- Unit
- bytes per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
smon_message_transmit_duration
- Description
- The wall-clock time it took to transmit the most recent Cloudera Manager Agent message to the Cloudera Service Monitor
- Unit
- ms
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
smon_message_transmit_failed_rate
- Description
- Number of failures to send messages from the Cloudera Manager Agent to the Cloudera Service Monitor
- Unit
- messages per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
smon_message_transmit_succeeded_rate
- Description
- Number of messages successfully sent from the Cloudera Manager Agent to the Cloudera Service Monitor
- Unit
- messages per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
supervisord_cpu_system_rate
- Description
- Supervisord System CPU Time
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
supervisord_cpu_user_rate
- Description
- Supervisord User CPU Time
- Unit
- seconds per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
supervisord_failures_rate
- Description
- The number of failures contacting supervisord seen by the Cloudera Manager Agent
- Unit
- failures per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
supervisord_fd_max
- Description
- Supervisord File Descriptor Max
- Unit
- file descriptors
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
supervisord_fd_open
- Description
- Supervisord File Descriptors
- Unit
- file descriptors
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
supervisord_latency
- Description
- The average latency contacting supervisord seen by the Cloudera Manager Agent
- Unit
- seconds
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
supervisord_physical_memory_used
- Description
- Supervisord physical memory used
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
supervisord_virtual_memory_used
- Description
- Supervisord virtual memory used
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
swap_free
- Description
- Swap free
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
swap_out_rate
- Description
- Memory swapped out to disk
- Unit
- pages per second
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
swap_total
- Description
- Swap capacity
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
swap_used
- Description
- Swap used
- Unit
- bytes
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
tcp_connection_count_close
- Description
- The number of TCP connections in state CLOSE
- Unit
- connections
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
tcp_connection_count_close_wait
- Description
- The number of TCP connections in state CLOSE_WAIT
- Unit
- connections
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
tcp_connection_count_closing
- Description
- The number of TCP connections in state CLOSING
- Unit
- connections
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
tcp_connection_count_established
- Description
- The number of TCP connections in state ESTABLISHED
- Unit
- connections
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
tcp_connection_count_fin_wait1
- Description
- The number of TCP connections in state FIN_WAIT1
- Unit
- connections
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
tcp_connection_count_fin_wait2
- Description
- The number of TCP connections in state FIN_WAIT2
- Unit
- connections
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
tcp_connection_count_last_ack
- Description
- The number of TCP connections in state LAST_ACK
- Unit
- connections
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
tcp_connection_count_listen
- Description
- The number of TCP connections in state LISTEN
- Unit
- connections
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
tcp_connection_count_syn_recv
- Description
- The number of TCP connections in state SYN_RECV
- Unit
- connections
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
tcp_connection_count_syn_sent
- Description
- The number of TCP connections in state SYN_SENT
- Unit
- connections
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
tcp_connection_count_time_wait
- Description
- The number of TCP connections in state TIME_WAIT
- Unit
- connections
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]
uptime
- Description
- For a host, the amount of time since the host was booted. For a role, the uptime of the backing process.
- Unit
- seconds
- Parents
- cluster, rack
- CDH Version
- [CM -1.0.0..CM -1.0.0]