Viewing Current Disk Usage by User, Group, or Directory

The current disk usage reports show "current" disk usage in both chart and tabular form.

The data for these reports comes from the fsimage kept on the NameNode, so a report is only as current as the last checkpoint. By default, a checkpoint is performed once per hour; if checkpoints are performed less frequently, the disk usage report may be out of date. The report also does not account for deleted files that exist only in snapshots, although these files are included in the usage information when you run the du command.
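The staleness rule described above can be sketched as a small check. This is only an illustration: the function name, the one-hour default constant, and the way the last checkpoint time is obtained are assumptions made for the example, not part of the product.

```python
from datetime import datetime, timedelta, timezone

# Assumed default, taken from the "once per hour" interval described above.
DEFAULT_CHECKPOINT_PERIOD = timedelta(hours=1)


def report_may_be_stale(last_checkpoint, now, period=DEFAULT_CHECKPOINT_PERIOD):
    """Return True if more than one checkpoint period has passed.

    last_checkpoint and now are timezone-aware datetimes. How the last
    checkpoint time is looked up is outside the scope of this sketch.
    """
    return now - last_checkpoint > period


# Hypothetical timestamps for illustration only.
checkpoint_time = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
ninety_minutes_later = checkpoint_time + timedelta(minutes=90)
thirty_minutes_later = checkpoint_time + timedelta(minutes=30)

stale_after_90 = report_may_be_stale(checkpoint_time, ninety_minutes_later)
stale_after_30 = report_may_be_stale(checkpoint_time, thirty_minutes_later)
```

With these sample values, a report read 90 minutes after the last checkpoint is flagged as possibly stale, while one read 30 minutes after is not.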

To create a disk usage report:

  • Click the report name (link) to produce the resulting report.

Each of these reports shows:

Bytes

The logical number of bytes in the files, aggregated by user, group, or directory. This is based on the actual file sizes and does not take replication into account.

Raw Bytes

The physical number of bytes (total disk space in HDFS) used by the files, aggregated by user, group, or directory. This includes replication, so it is Bytes multiplied by the number of replicas.

File and Directory Count

The number of files aggregated by user, group, or directory.
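To make the relationship between the three columns concrete, the following minimal sketch aggregates a few made-up file records by user. The FileRecord type, the aggregate_by_owner function, and the sample data are hypothetical and exist only for this example; they do not reflect how the reports are actually computed. Note that when files have different replication factors, the aggregated Raw Bytes value is the sum of each file's size times its own replication factor.

```python
from collections import defaultdict
from typing import NamedTuple


class FileRecord(NamedTuple):
    """Hypothetical record describing one HDFS file."""
    owner: str
    size: int         # logical file size in bytes
    replication: int  # number of replicas of the file


def aggregate_by_owner(records):
    """Sum Bytes, Raw Bytes, and file count for each owner."""
    totals = defaultdict(lambda: {"bytes": 0, "raw_bytes": 0, "files": 0})
    for rec in records:
        entry = totals[rec.owner]
        entry["bytes"] += rec.size                        # ignores replication
        entry["raw_bytes"] += rec.size * rec.replication  # includes replication
        entry["files"] += 1
    return dict(totals)


# Made-up sample data for illustration only.
records = [
    FileRecord("alice", 1024, 3),
    FileRecord("alice", 2048, 3),
    FileRecord("bob", 4096, 2),
]
usage = aggregate_by_owner(records)
```

Here the user alice has 3072 Bytes and 9216 Raw Bytes across 2 files, and the user bob has 4096 Bytes and 8192 Raw Bytes in 1 file. The same approach applies when grouping by group or directory instead of by user.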

Bytes and Raw Bytes are shown in IEC binary prefix notation (1 GiB = 1 × 2^30 bytes).
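As an illustration of IEC binary prefix notation, the sketch below converts a byte count to a readable string by dividing by 1024 at each step. The format_iec function and its one-decimal rounding are assumptions chosen for the example; the exact rounding used in the reports may differ.

```python
# IEC binary prefixes: each unit is 1024 (2^10) times the previous one.
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def format_iec(num_bytes):
    """Format a non-negative byte count using IEC binary prefixes."""
    value = float(num_bytes)
    for unit in IEC_UNITS:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == IEC_UNITS[-1]:
            if unit == "B":
                return f"{int(value)} B"
            return f"{value:.1f} {unit}"
        value /= 1024


one_gib = format_iec(2**30)
small = format_iec(512)
```

For example, format_iec(2**30) yields "1.0 GiB" and format_iec(512) yields "512 B".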

The directories shown in the Current Disk Usage by Directory report are the HDFS directories you have set as watched directories. You can add directories to the watch list, or remove them from it, from this report: click the Search Files and Manage Directories button at the top right of the set of reports for the cluster or nameservice.

The report data is also shown in chart format:

  • Move the cursor over the graph to highlight a specific period on the graph and see the actual value (data size) for that period.
  • You can also move the cursor over the user, group, or directory name (in the graph legend) to highlight the portion of the graph for that name.
  • You can right-click within the chart area to save the whole chart display as a single image (a .PNG file) or as a PDF file. You can also print to the printer configured for your browser.