Host Metrics

Metric Name Description Unit CDH Version
agent_cert_expiry Remaining time until the Cloudera Manager Agent certificate expires seconds [CM -1.0.0..CM -1.0.0]
agent_cpu_system_rate Cloudera Manager Agent System CPU Time seconds per second [CM -1.0.0..CM -1.0.0]
agent_cpu_user_rate Cloudera Manager Agent User CPU Time seconds per second [CM -1.0.0..CM -1.0.0]
agent_fd_max Cloudera Manager Agent File Descriptor Max file descriptors [CM -1.0.0..CM -1.0.0]
agent_fd_open Cloudera Manager Agent File Descriptors file descriptors [CM -1.0.0..CM -1.0.0]
agent_hb_latency_millis Heartbeat latency observed by Cloudera Manager Agent communicating to Cloudera Manager Server ms [CM -1.0.0..CM -1.0.0]
agent_physical_memory_used Agent physical memory used bytes [CM -1.0.0..CM -1.0.0]
agent_virtual_memory_used Agent virtual memory used bytes [CM -1.0.0..CM -1.0.0]
alerts_rate The number of alerts. events per second [CM -1.0.0..CM -1.0.0]
available_entropy The entropy that is available on the host entropy [CM -1.0.0..CM -1.0.0]
clock_offset Clock offset as reported by the host's NTP service from 'ntpdc -np' or 'chronyc sources'. If NTP is not in use, this metric is not collected. ms [CM -1.0.0..CM -1.0.0]
cores Logical CPU Cores cores [CM -1.0.0..CM -1.0.0]
cpu_guest_nice_rate Time spent running a niced guest (virtual CPU for guest operating systems under the control of the Linux kernel). Requires Linux 2.6.33. CPU guest nice time is included in CPU nice time. seconds per second [CM -1.0.0..CM -1.0.0]
cpu_guest_rate Time spent running a virtual CPU for guest operating systems under the control of the Linux kernel. Requires Linux 2.6.24. CPU guest time is included in CPU user time. seconds per second [CM -1.0.0..CM -1.0.0]
cpu_idle_rate Total CPU idle time seconds per second [CM -1.0.0..CM -1.0.0]
cpu_iowait_rate Total CPU iowait time seconds per second [CM -1.0.0..CM -1.0.0]
cpu_irq_rate Total CPU IRQ time seconds per second [CM -1.0.0..CM -1.0.0]
cpu_nice_rate Total CPU nice time seconds per second [CM -1.0.0..CM -1.0.0]
cpu_percent Total CPU usage of the host (averaged since last report) percent [CM -1.0.0..CM -1.0.0]
cpu_soft_irq_rate Total CPU soft IRQ time seconds per second [CM -1.0.0..CM -1.0.0]
cpu_steal_rate Stolen time, which is the time spent in other operating systems when running in a virtualized environment. Requires Linux 2.6.11. seconds per second [CM -1.0.0..CM -1.0.0]
cpu_system_rate Total System CPU seconds per second [CM -1.0.0..CM -1.0.0]
cpu_user_rate Total CPU user time seconds per second [CM -1.0.0..CM -1.0.0]
dns_name_resolution_duration The duration of a call to InetAddress.getLocalHost() in a helper Java process run by the Cloudera Manager Agent. ms [CM -1.0.0..CM -1.0.0]
events_critical_rate The number of critical events. events per second [CM -1.0.0..CM -1.0.0]
events_important_rate The number of important events. events per second [CM -1.0.0..CM -1.0.0]
events_informational_rate The number of informational events. events per second [CM -1.0.0..CM -1.0.0]
fd_max Maximum number of file descriptors file descriptors [CM -1.0.0..CM -1.0.0]
fd_open Open file descriptors. file descriptors [CM -1.0.0..CM -1.0.0]
health_bad_rate Percentage of Time with Bad Health seconds per second [CM -1.0.0..CM -1.0.0]
health_concerning_rate Percentage of Time with Concerning Health seconds per second [CM -1.0.0..CM -1.0.0]
health_disabled_rate Percentage of Time with Disabled Health seconds per second [CM -1.0.0..CM -1.0.0]
health_good_rate Percentage of Time with Good Health seconds per second [CM -1.0.0..CM -1.0.0]
health_unknown_rate Percentage of Time with Unknown Health seconds per second [CM -1.0.0..CM -1.0.0]
hmon_message_bytes_sent_rate Number of bytes sent in messages from the Cloudera Manager Agent to the Cloudera Host Monitor bytes per second [CM -1.0.0..CM -1.0.0]
hmon_message_transmit_duration The wall-clock time it took to transmit the most recent Cloudera Manager Agent message to the Cloudera Host Monitor ms [CM -1.0.0..CM -1.0.0]
hmon_message_transmit_failed_rate Number of failures to send messages from the Cloudera Manager Agent to the Cloudera Host Monitor messages per second [CM -1.0.0..CM -1.0.0]
hmon_message_transmit_succeeded_rate Number of messages successfully sent from the Cloudera Manager Agent to the Cloudera Host Monitor messages per second [CM -1.0.0..CM -1.0.0]
load_1 Load Average over 1 minute load average [CM -1.0.0..CM -1.0.0]
load_15 Load Average over 15 minutes load average [CM -1.0.0..CM -1.0.0]
load_5 Load Average over 5 minutes load average [CM -1.0.0..CM -1.0.0]
overcommit_ratio Percentage of physical RAM that the committed address space cannot exceed. Retrieved from /proc/sys/vm/overcommit_ratio. percent [CM -1.0.0..CM -1.0.0]
physical_memory_buffers The amount of physical memory devoted to temporary storage for raw disk blocks. This is the 'Buffers' field from /proc/meminfo. bytes [CM -1.0.0..CM -1.0.0]
physical_memory_cached The amount of physical memory used for files read from the disk. This is commonly referred to as the pagecache. This is the 'Cached' field from /proc/meminfo. bytes [CM -1.0.0..CM -1.0.0]
physical_memory_commit_limit Total amount of memory currently available to be allocated on the system. This is the 'CommitLimit' field from /proc/meminfo. bytes [CM -1.0.0..CM -1.0.0]
physical_memory_dirty The total amount of memory waiting to be written back to the disk. This is the 'Dirty' field from /proc/meminfo. bytes [CM -1.0.0..CM -1.0.0]
physical_memory_dirty_ratio Maximum percentage of physical memory that can be filled with dirty pages before processes are forced to write dirty buffers themselves during their time slice instead of being allowed to perform more writes. This is read from /proc/sys/vm/dirty_ratio. percent [CM -1.0.0..CM -1.0.0]
physical_memory_mapped The total amount of memory which has been used to map devices, files, or libraries using the mmap system call. This is the 'Mapped' field from /proc/meminfo. bytes [CM -1.0.0..CM -1.0.0]
physical_memory_memfree The amount of physical memory left unused by the system. This is the 'MemFree' field from /proc/meminfo. bytes [CM -1.0.0..CM -1.0.0]
physical_memory_total The total physical memory available. bytes [CM -1.0.0..CM -1.0.0]
physical_memory_used The total amount of memory being used, excluding buffers and cache. bytes [CM -1.0.0..CM -1.0.0]
physical_memory_writeback The total amount of memory actively being written back to the disk. This is the 'Writeback' field from /proc/meminfo. bytes [CM -1.0.0..CM -1.0.0]
smon_message_bytes_sent_rate Number of bytes sent in messages from the Cloudera Manager Agent to the Cloudera Service Monitor bytes per second [CM -1.0.0..CM -1.0.0]
smon_message_transmit_duration The wall-clock time it took to transmit the most recent Cloudera Manager Agent message to the Cloudera Service Monitor ms [CM -1.0.0..CM -1.0.0]
smon_message_transmit_failed_rate Number of failures to send messages from the Cloudera Manager Agent to the Cloudera Service Monitor messages per second [CM -1.0.0..CM -1.0.0]
smon_message_transmit_succeeded_rate Number of messages successfully sent from the Cloudera Manager Agent to the Cloudera Service Monitor messages per second [CM -1.0.0..CM -1.0.0]
supervisord_cpu_system_rate Supervisord System CPU Time seconds per second [CM -1.0.0..CM -1.0.0]
supervisord_cpu_user_rate Supervisord User CPU Time seconds per second [CM -1.0.0..CM -1.0.0]
supervisord_failures_rate The number of failures contacting supervisord seen by the Cloudera Manager Agent failures per second [CM -1.0.0..CM -1.0.0]
supervisord_fd_max Supervisord File Descriptor Max file descriptors [CM -1.0.0..CM -1.0.0]
supervisord_fd_open Supervisord File Descriptors file descriptors [CM -1.0.0..CM -1.0.0]
supervisord_latency The average latency contacting supervisord seen by the Cloudera Manager Agent seconds [CM -1.0.0..CM -1.0.0]
supervisord_physical_memory_used Supervisord physical memory used bytes [CM -1.0.0..CM -1.0.0]
supervisord_virtual_memory_used Supervisord virtual memory used bytes [CM -1.0.0..CM -1.0.0]
swap_free Swap free bytes [CM -1.0.0..CM -1.0.0]
swap_out_rate Memory swapped out to disk pages per second [CM -1.0.0..CM -1.0.0]
swap_total Swap capacity bytes [CM -1.0.0..CM -1.0.0]
swap_used Swap used bytes [CM -1.0.0..CM -1.0.0]
tcp_connection_count_close The number of TCP connections in state CLOSE connections [CM -1.0.0..CM -1.0.0]
tcp_connection_count_close_wait The number of TCP connections in state CLOSE_WAIT connections [CM -1.0.0..CM -1.0.0]
tcp_connection_count_closing The number of TCP connections in state CLOSING connections [CM -1.0.0..CM -1.0.0]
tcp_connection_count_established The number of TCP connections in state ESTABLISHED connections [CM -1.0.0..CM -1.0.0]
tcp_connection_count_fin_wait1 The number of TCP connections in state FIN_WAIT1 connections [CM -1.0.0..CM -1.0.0]
tcp_connection_count_fin_wait2 The number of TCP connections in state FIN_WAIT2 connections [CM -1.0.0..CM -1.0.0]
tcp_connection_count_last_ack The number of TCP connections in state LAST_ACK connections [CM -1.0.0..CM -1.0.0]
tcp_connection_count_listen The number of TCP connections in state LISTEN connections [CM -1.0.0..CM -1.0.0]
tcp_connection_count_syn_recv The number of TCP connections in state SYN_RECV connections [CM -1.0.0..CM -1.0.0]
tcp_connection_count_syn_sent The number of TCP connections in state SYN_SENT connections [CM -1.0.0..CM -1.0.0]
tcp_connection_count_time_wait The number of TCP connections in state TIME_WAIT connections [CM -1.0.0..CM -1.0.0]
uptime For a host, the amount of time since the host was booted. For a role, the uptime of the backing process. seconds [CM -1.0.0..CM -1.0.0]
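
Several of the metrics above are easier to read as ratios. The following Python sketch shows one possible way to combine them; it is an illustration, not part of Cloudera Manager. The function names (ratio, derive_host_ratios), the output keys, and the sample values are hypothetical. It also assumes that the cpu_*_rate metrics are summed across all logical cores, so that dividing by cores gives a per-host fraction; verify this against your own deployment before relying on it.

```python
from typing import Dict, Mapping, Optional

# CPU rate metrics counted as "busy" time in this sketch. cpu_guest_rate and
# cpu_guest_nice_rate are left out because the table states that they are
# already included in cpu_user_rate and cpu_nice_rate respectively.
BUSY_CPU_METRICS = (
    "cpu_user_rate",
    "cpu_nice_rate",
    "cpu_system_rate",
    "cpu_irq_rate",
    "cpu_soft_irq_rate",
    "cpu_steal_rate",
)


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    if not denominator:
        return None
    return numerator / denominator


def derive_host_ratios(sample: Mapping[str, float]) -> Dict[str, Optional[float]]:
    """Derive a few ratios from one sample of host metrics.

    `sample` maps metric names from the table to numeric values.
    Assumption: the CPU rates are totals over all logical cores.
    """
    busy = sum(sample[name] for name in BUSY_CPU_METRICS)
    return {
        "cpu_busy_fraction": ratio(busy, sample["cores"]),
        "fd_utilization": ratio(sample["fd_open"], sample["fd_max"]),
        "swap_utilization": ratio(sample["swap_used"], sample["swap_total"]),
    }


# Made-up sample values, for illustration only.
sample = {
    "cores": 4,
    "cpu_user_rate": 1.0,
    "cpu_nice_rate": 0.0,
    "cpu_system_rate": 0.5,
    "cpu_irq_rate": 0.0,
    "cpu_soft_irq_rate": 0.1,
    "cpu_steal_rate": 0.0,
    "fd_open": 1024,
    "fd_max": 65536,
    "swap_used": 0,
    "swap_total": 0,
}

print(derive_host_ratios(sample))
```

A host with no swap configured reports swap_total as 0, so the sketch returns None for swap_utilization rather than dividing by zero.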