Resource usage by nodes
The Resource Usage chart in Cloudera Observability displays time-series metrics for each node in the current workload, providing insight into CPU, memory, GPU, and GPU memory usage. It helps you identify imbalances or bottlenecks across nodes. High utilization on specific nodes may indicate a need to rebalance workloads or scale resources.
- CPU: Provides a historical view of CPU usage with individual workload node granularity. Node CPU usage is the number of CPU cores used by all pods running on that node. Hover over the chart to view the CPU used compared to the CPU allocated.
- Memory: Provides a historical view of memory usage for the selected ML workload node. Node memory usage is the total memory used by all pods running on that node. Hover over the chart to view the memory used compared to the memory allocated.
- GPU: Provides a historical view of GPU usage with individual workload node granularity. Hover over the chart to view the GPU used compared to the GPU allocated, in cores.
- GPU Memory: Provides a historical view of GPU memory usage with individual workload node granularity. Hover over the chart to view the GPU memory used compared to the GPU memory allocated, in MiB.
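
The used-versus-allocated comparison shown on hover can also be reasoned about as a simple ratio per node. The following is a minimal sketch of that idea only, not part of Cloudera Observability: the `NodeSample` type, the `hot_nodes` helper, the node names, the sample values, and the 0.8 threshold are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeSample:
    """One hypothetical reading for a node (not a Cloudera data structure)."""
    node: str
    used: float       # e.g. CPU cores in use by all pods on the node
    allocated: float  # e.g. CPU cores allocated on the node


def utilization(sample: NodeSample) -> float:
    """Return used / allocated, or 0.0 when nothing is allocated."""
    if sample.allocated <= 0:
        return 0.0
    return sample.used / sample.allocated


def hot_nodes(samples, threshold=0.8):
    """Return names of nodes whose utilization is at or above the threshold."""
    return [s.node for s in samples if utilization(s) >= threshold]


# Illustrative values only; real numbers come from the chart.
samples = [
    NodeSample("node-a", used=7.2, allocated=8.0),
    NodeSample("node-b", used=1.5, allocated=8.0),
    NodeSample("node-c", used=3.0, allocated=4.0),
]
print(hot_nodes(samples))
```

With these sample values, only node-a exceeds the assumed threshold, which is the kind of imbalance that might prompt rebalancing workloads or scaling resources. The same ratio applies to memory, GPU, and GPU memory as long as used and allocated are expressed in the same unit.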