Oozie Metrics

Reference information for Oozie Metrics

In addition to the base metrics listed below, many aggregate metrics are available. If an entity type has parents defined, you can formulate all possible aggregate metrics using the formula base_metric_across_parents, where parents is the plural form of a parent entity type.

Metrics for aggregate totals can also be formed by adding the prefix total_ to the metric name.

Because the plural form of a parent does not always end in "s", use the type-ahead feature in the Cloudera Manager chart browser to find the exact aggregate metric name.

For example, the following metric names may be valid for Oozie:

  • jobs_failed_rate_across_clusters
  • total_jobs_failed_rate_across_clusters
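The naming formula can be sketched as a small helper. This is only an illustration of the string pattern described above: the function name aggregate_metric_names is hypothetical and not part of Cloudera Manager, and the caller must supply the already-pluralized parent name, since the helper does not attempt to pluralize it.

```python
from typing import List


def aggregate_metric_names(base_metric: str, parent_plural: str) -> List[str]:
    """Build the two aggregate names for a base metric and a pluralized parent.

    Hypothetical helper for illustration only; it just applies the
    base_metric_across_parents pattern and the total_ prefix.
    """
    across = f"{base_metric}_across_{parent_plural}"
    return [across, f"total_{across}"]


# The Oozie example from the text: base metric jobs_failed_rate, parent cluster.
for name in aggregate_metric_names("jobs_failed_rate", "clusters"):
    print(name)
# prints:
# jobs_failed_rate_across_clusters
# total_jobs_failed_rate_across_clusters
```

A name built this way is a candidate only; confirm it with the type-ahead feature in the chart browser before relying on it.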

Some metrics, such as alerts_rate, apply to nearly every metric context. Others apply only to a specific service or role.

jobs_failed_rate

Description
Jobs Failed
Unit
jobs per second
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 5.3), [CDH 5.3..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0)

jobs_killed_rate

Description
Jobs Killed
Unit
jobs per second
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 5.3), [CDH 5.3..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0)

jobs_preparing

Description
Jobs Preparing
Unit
jobs
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 5.3), [CDH 5.3..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0)

jobs_running

Description
Jobs Running
Unit
jobs
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 5.3), [CDH 5.3..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0)

jobs_succeeded_rate

Description
Jobs Succeeded
Unit
jobs per second
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 5.3), [CDH 5.3..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0)

jobs_suspended

Description
Jobs Suspended
Unit
jobs
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 5.3), [CDH 5.3..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0)

alerts_rate

Description
The number of alerts.
Unit
events per second
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

events_critical_rate

Description
The number of critical events.
Unit
events per second
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

events_important_rate

Description
The number of important events.
Unit
events per second
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

events_informational_rate

Description
The number of informational events.
Unit
events per second
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

health_bad_rate

Description
Percentage of Time with Bad Health
Unit
seconds per second
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

health_concerning_rate

Description
Percentage of Time with Concerning Health
Unit
seconds per second
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

health_disabled_rate

Description
Percentage of Time with Disabled Health
Unit
seconds per second
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

health_good_rate

Description
Percentage of Time with Good Health
Unit
seconds per second
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]

health_unknown_rate

Description
Percentage of Time with Unknown Health
Unit
seconds per second
Parents
cluster
CDH Version
[CDH 5.0.0..CDH 6.0.0), [CDH 6.0.0..CDH 7.0.0), [CDH 7.0.0..CDH 8.0.0), [CM -1.0.0..CM -1.0.0]
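
The health_*_rate metrics are described as a percentage of time but are reported in seconds per second. A minimal sketch of the conversion follows, assuming that a value of 1.0 means the entity spent the entire sampled interval in that health state; the function name health_rate_to_percent is hypothetical and not part of Cloudera Manager.

```python
def health_rate_to_percent(seconds_per_second: float) -> float:
    """Convert a seconds-per-second health rate to a percentage of time.

    Assumption: the rate is a fraction of the interval, between 0.0 and 1.0.
    """
    if not 0.0 <= seconds_per_second <= 1.0:
        raise ValueError("expected a fraction between 0.0 and 1.0")
    return seconds_per_second * 100.0


# A health_bad_rate sample of 0.25 would read as bad health 25% of the time.
print(health_rate_to_percent(0.25))  # prints 25.0
```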