MachineLearningKubernetesCriticalPodCrashLooping
Reference information for MachineLearningKubernetesCriticalPodCrashLooping used as part of Pulse in CDP Private Cloud Data Services.
- Summary
- Critical Machine Learning Pod is crash looping
- PromQL Expression
- increase(kube_pod_container_status_restarts_total{pod=~"{{ .Values.alerts.mlCriticalPodsRegexp }}", appType="ml"} [15m]) > 2
- Time Period
- 0s
- Severity
- critical
- Source
- environments/ml