KRaft Controller Metrics

Reference information for KRaft Controller metrics.

In addition to these base metrics, many aggregate metrics are available. If an entity type has parents defined, you can form an aggregate metric for each parent using the pattern base_metric_across_parents, where parents is the plural form of the parent entity type.

Metrics for aggregate totals can also be formed by adding the prefix total_ to the aggregate metric name.

Use the type-ahead feature in the Cloudera Manager chart browser to find the exact aggregate metric name, in case the plural form does not end in "s".

For example, the following metric names may be valid for KRaft Controller:

  • alerts_rate_across_clusters
  • total_alerts_rate_across_clusters
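
The naming pattern above can be sketched in a few lines of Python. This is only an illustration: the helper name aggregate_metric_names is hypothetical, and it assumes every parent pluralizes by appending "s", which is not guaranteed. Confirm the real names with the chart browser's type-ahead.

```python
def aggregate_metric_names(base_metric, parents):
    """Build candidate aggregate metric names for a base metric.

    Assumption: each parent entity type is pluralized by appending "s".
    Real plural forms may differ, so treat the output as candidates only.
    """
    names = []
    for parent in parents:
        across = f"{base_metric}_across_{parent}s"
        names.append(across)
        # The total_ prefix gives the aggregate total for the same parent.
        names.append(f"total_{across}")
    return names


# Parents taken from the alerts_rate entry in this reference.
candidates = aggregate_metric_names("alerts_rate", ["cluster", "kafka", "rack"])
for name in candidates:
    print(name)
```

The first two candidates printed match the example names listed above; the remaining ones follow the same pattern for the other parents and may or may not exist in a given deployment.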

Some metrics, such as alerts_rate, apply to nearly every metric context. Others only apply to a certain service or role.

alerts_rate

Description
The number of alerts.
Unit
events per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

cgroup_cpu_system_rate

Description
CPU usage of the role's cgroup
Unit
seconds per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

cgroup_cpu_user_rate

Description
User Space CPU usage of the role's cgroup
Unit
seconds per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

cgroup_mem_page_cache

Description
Page cache usage of the role's cgroup
Unit
bytes
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

cgroup_mem_rss

Description
Resident memory of the role's cgroup
Unit
bytes
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

cgroup_mem_swap

Description
Swap usage of the role's cgroup
Unit
bytes
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

cgroup_read_bytes_rate

Description
Bytes read from all disks by the role's cgroup
Unit
bytes per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

cgroup_read_ios_rate

Description
Number of read I/O operations from all disks by the role's cgroup
Unit
ios per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

cgroup_write_bytes_rate

Description
Bytes written to all disks by the role's cgroup
Unit
bytes per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

cgroup_write_ios_rate

Description
Number of write I/O operations to all disks by the role's cgroup
Unit
ios per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

cpu_system_rate

Description
Total CPU system time
Unit
seconds per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

cpu_user_rate

Description
Total CPU user time
Unit
seconds per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

events_critical_rate

Description
The number of critical events.
Unit
events per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

events_important_rate

Description
The number of important events.
Unit
events per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

events_informational_rate

Description
The number of informational events.
Unit
events per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

fd_max

Description
Maximum number of file descriptors
Unit
file descriptors
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

fd_open

Description
Open file descriptors.
Unit
file descriptors
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

health_bad_rate

Description
Percentage of Time with Bad Health
Unit
seconds per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

health_concerning_rate

Description
Percentage of Time with Concerning Health
Unit
seconds per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

health_disabled_rate

Description
Percentage of Time with Disabled Health
Unit
seconds per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

health_good_rate

Description
Percentage of Time with Good Health
Unit
seconds per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

health_unknown_rate

Description
Percentage of Time with Unknown Health
Unit
seconds per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

mem_rss

Description
Resident memory used
Unit
bytes
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

mem_swap

Description
Amount of swap memory used by this role's process.
Unit
bytes
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

mem_virtual

Description
Virtual memory used
Unit
bytes
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

oom_exits_rate

Description
The number of times the role's backing process was killed due to an OutOfMemory error. This counter is only incremented if the Cloudera Manager "Kill When Out of Memory" option is enabled.
Unit
exits per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

read_bytes_rate

Description
The number of bytes read from the device
Unit
bytes per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

unexpected_exits_rate

Description
The number of times the role's backing process exited unexpectedly.
Unit
exits per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

uptime

Description
For a host, the amount of time since the host was booted. For a role, the uptime of the backing process.
Unit
seconds
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

write_bytes_rate

Description
The number of bytes written to the device
Unit
bytes per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

kafka_active_broker_count

Description
Number of active brokers in the cluster
Unit
brokers
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_active_controller

Description
Will be 1 if this instance is the active controller, 0 otherwise
Unit
controllers
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)
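
Because kafka_active_controller is 1 on the active controller and 0 elsewhere, a healthy quorum is expected to report exactly one 1 across its controllers. The following is a minimal sketch of that check, assuming the per-controller values have already been collected into a dictionary; the function name and the controller names are hypothetical.

```python
def active_controller_status(values_by_controller):
    """Classify a quorum from per-controller kafka_active_controller values.

    values_by_controller maps a controller name (hypothetical) to the
    latest metric value, which is 1 for the active controller, else 0.
    """
    active = sum(1 for value in values_by_controller.values() if value == 1)
    if active == 1:
        return "ok"
    if active == 0:
        return "no active controller"
    return "multiple active controllers"


# Illustrative sample values, not real measurements.
samples = {"controller-1": 0, "controller-2": 1, "controller-3": 0}
status = active_controller_status(samples)
print(status)
```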

kafka_fenced_broker_count

Description
Number of fenced brokers in the cluster
Unit
brokers
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_global_partition_count

Description
Total number of partitions in the cluster
Unit
partitions
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_global_topic_count

Description
Total number of topics in the cluster
Unit
topics
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_cms_runs_rate

Description
Number of ConcurrentMarkSweep garbage collector runs performed on this broker
Unit
runs per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_cms_time

Description
Time spent in ConcurrentMarkSweep garbage collection on this broker
Unit
milliseconds
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_g1_old_runs_rate

Description
Number of G1 Old Generation garbage collector runs performed on this broker
Unit
runs per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_g1_old_time

Description
Time spent in G1 Old Generation garbage collection on this broker
Unit
milliseconds
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_g1_young_runs_rate

Description
Number of G1 Young Generation garbage collector runs performed on this broker
Unit
runs per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_g1_young_time

Description
Time spent in G1 Young Generation garbage collection on this broker
Unit
milliseconds
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_parnew_runs_rate

Description
Number of ParNew garbage collector runs performed on this broker
Unit
runs per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_parnew_time

Description
Time spent in ParNew garbage collection on this broker
Unit
milliseconds
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_ps_ms_runs_rate

Description
Number of PS MarkSweep garbage collector runs performed on this broker
Unit
runs per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_ps_ms_time

Description
Time spent in PS MarkSweep garbage collection on this broker
Unit
milliseconds
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_ps_scavenge_runs_rate

Description
Number of PS Scavenge garbage collector runs performed on this broker
Unit
runs per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_ps_scavenge_time

Description
Time spent in PS Scavenge garbage collection on this broker
Unit
milliseconds
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_runs_rate

Description
Number of garbage collector runs performed on this broker
Unit
runs per second
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_jvm_gc_time

Description
Time spent in garbage collection on this broker
Unit
milliseconds
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_offline_partitions

Description
Number of unavailable partitions
Unit
partitions
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)

kafka_preferred_replica_imbalance

Description
Number of partitions where the leader replica is not the preferred replica
Unit
partitions
Parents
cluster, kafka, rack
CDH Version
[CDH 5.0.0..CDH 8.0.0)