Monitoring Built-in AlertsPDF version

MachineLearningKubernetesNonCriticalPodNotHealthy

Reference information for MachineLearningKubernetesNonCriticalPodNotHealthy used as part of Pulse in Cloudera Data Services on premises.

Summary
Machine Learning Pod has been in a non-ready state for longer than an hour
PromQL Expression
min_over_time(sum by (namespace, pod) (kube_pod_status_phase{phase=~"Pending|Unknown|Failed", appType="ml", pod!~"{{ .Values.alerts.mlCriticalPodsRegexp }}"})[10m:]) > 0
Time Period
0s
Severity
warning
Source
environments/ml

We want your opinion

How can we improve this page?

What kind of feedback do you have?