cdp-doctor system process

Scope

The cdp-doctor system process command provides a detailed view of the top 10 CPU- and memory-consuming processes running on a CDP node. It helps identify resource-intensive services, potential memory leaks, and high CPU utilization that could impact cluster performance. It is particularly useful for performance troubleshooting, capacity planning, and node-level diagnostics.

The command lists system processes sorted by resource consumption, along with key performance indicators for each process: memory and CPU percentage, virtual and resident memory size, open files, and open network connections.

Use Case

  • Investigating node performance degradation or service slowness.
  • Checking which services are consuming excessive CPU or memory.
  • Performing post-deployment validation or resource audits.
  • Preparing data for capacity tuning or troubleshooting with Cloudera Support.

Sample Output

Running the cdp-doctor system process command displays the following output:

Top 10 Memory/CPU consumer processes:
+---------+------+--------------+---------+------+---------+--------+------------+------------------+
|   PID   | Name |     User     | MEMORY% | CPU% |   VMS   |  RSS   | Open Files | Open Connections |
+---------+------+--------------+---------+------+---------+--------+------------+------------------+
|  24576  | java | cloudera-scm |  11.2   | 6.1  | 14.3 GB | 7.0 GB |    474     |        80        |
|  53157  | java |     hive     |   9.1   | 0.2  | 10.3 GB | 5.7 GB |    589     |        36        |
|  63162  | java | cloudera-scm |   4.9   | 3.0  | 8.7 GB  | 3.0 GB |    863     |        22        |
|  49460  | java |     solr     |   4.2   | 0.8  | 8.1 GB  | 2.6 GB |    426     |        32        |
| 1038439 | java |    atlas     |   2.8   | 0.8  | 7.5 GB  | 1.8 GB |    702     |        17        |
|  63043  | java | cloudera-scm |   5.1   | 0.6  | 9.8 GB  | 3.2 GB |    604     |        3         |
|  37158  | java |     hdfs     |   2.9   | 0.1  | 4.2 GB  | 1.9 GB |    274     |        3         |
+---------+------+--------------+---------+------+---------+--------+------------+------------------+
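
+------------------+-------------------------------------------------------------------------+
| Column           | Description                                                             |
+------------------+-------------------------------------------------------------------------+
| PID              | Process ID: unique identifier of the running process.                   |
| Name             | Process name (for example, java or jsvc).                               |
| User             | OS user running the process (for example, cloudera-scm, hdfs, or hive). |
| MEMORY%/CPU%     | Percentage of total system memory and CPU consumed.                     |
| VMS/RSS          | Virtual Memory Size and Resident Set Size (actual memory usage).        |
| Open Files       | Number of file descriptors opened by the process.                       |
| Open Connections | Number of active network connections.                                   |
+------------------+-------------------------------------------------------------------------+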
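
For scripted checks, the bordered table can be turned into structured records. The following is a minimal Python sketch, not part of cdp-doctor: the function names parse_process_table and size_to_gb are hypothetical, and the handling of units other than GB is an assumption, since the sample output only shows GB values.

```python
def parse_process_table(text):
    """Parse the bordered ASCII table into a list of dicts keyed by column header."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        # Skip the title line and border lines (+----+); only "|" rows carry data.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first "|" row holds the column names
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def size_to_gb(value):
    """Convert a size string such as '7.0 GB' to a float in GB.

    Assumption: units other than GB (KB, MB, TB) follow the same
    '<number> <unit>' layout; the sample output only shows GB.
    """
    number, unit = value.split()
    factor = {"KB": 1 / 1024**2, "MB": 1 / 1024, "GB": 1.0, "TB": 1024.0}[unit]
    return float(number) * factor


# Two rows copied from the sample output in this section.
SAMPLE = """\
Top 10 Memory/CPU consumer processes:
+---------+------+--------------+---------+------+---------+--------+------------+------------------+
|   PID   | Name |     User     | MEMORY% | CPU% |   VMS   |  RSS   | Open Files | Open Connections |
+---------+------+--------------+---------+------+---------+--------+------------+------------------+
|  24576  | java | cloudera-scm |  11.2   | 6.1  | 14.3 GB | 7.0 GB |    474     |        80        |
|  53157  | java |     hive     |   9.1   | 0.2  | 10.3 GB | 5.7 GB |    589     |        36        |
+---------+------+--------------+---------+------+---------+--------+------------+------------------+
"""

if __name__ == "__main__":
    for row in parse_process_table(SAMPLE):
        print(row["PID"], row["User"], size_to_gb(row["RSS"]))
```

On a node, the text could instead be captured with subprocess.run(["cdp-doctor", "system", "process"], capture_output=True, text=True).stdout, assuming the command writes the table to standard output.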

Interpret the output as follows:

  • High MEMORY% / CPU% values indicate that the process is resource-intensive (for example, Cloudera Manager, HiveServer2, or Solr).
  • High VMS / RSS values suggest heavy memory allocation, which may require JVM tuning or heap adjustment.
  • A high number of Open Files or Open Connections may indicate too many threads, open sockets, or log file handles.
  • Multiple high-usage Java processes are normal for Data Lake or Data Hub nodes running multiple CDP services.
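
The interpretation guidance above can be sketched as a simple threshold check. This is an illustrative Python example only: the THRESHOLDS values, the dictionary keys, and the function names flag_process and review are hypothetical, and suitable limits depend on the node's size and workload. The three process records are copied from the sample output.

```python
# Hypothetical thresholds for illustration only; tune them for your environment.
THRESHOLDS = {
    "memory_pct": 10.0,
    "cpu_pct": 5.0,
    "open_files": 800,
    "open_connections": 50,
}

# Three rows taken from the sample output, with columns renamed to plain keys.
PROCESSES = [
    {"pid": 24576, "user": "cloudera-scm", "memory_pct": 11.2, "cpu_pct": 6.1,
     "open_files": 474, "open_connections": 80},
    {"pid": 53157, "user": "hive", "memory_pct": 9.1, "cpu_pct": 0.2,
     "open_files": 589, "open_connections": 36},
    {"pid": 63162, "user": "cloudera-scm", "memory_pct": 4.9, "cpu_pct": 3.0,
     "open_files": 863, "open_connections": 22},
]


def flag_process(proc, thresholds=THRESHOLDS):
    """Return the names of the metrics on which the process exceeds its threshold."""
    return [name for name, limit in thresholds.items() if proc[name] > limit]


def review(processes, thresholds=THRESHOLDS):
    """Map each PID to its exceeded metrics, omitting processes with none."""
    findings = {}
    for proc in processes:
        exceeded = flag_process(proc, thresholds)
        if exceeded:
            findings[proc["pid"]] = exceeded
    return findings


if __name__ == "__main__":
    for pid, exceeded in review(PROCESSES).items():
        print(f"PID {pid}: exceeds threshold for {', '.join(exceeded)}")
```

A flagged process is a starting point for investigation, not proof of a fault: as noted above, several high-usage Java processes can be normal on nodes that run multiple CDP services.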