MachineLearningKubernetesNonCriticalPodNotHealthy
Reference information for MachineLearningKubernetesNonCriticalPodNotHealthy used as part of Pulse in CDP Private Cloud Data Services.
- Summary
- Machine Learning Pod has been in a non-ready state for longer than an hour
- PromQL Expression
- min_over_time(sum by (namespace, pod) (kube_pod_status_phase{phase=~"Pending|Unknown|Failed", appType="ml", pod!~"{{ .Values.alerts.mlCriticalPodsRegexp }}"})[10m:]) > 0
- Time Period
- 0s
- Severity
- warning
- Source
- environments/ml