Using SmartSense

HDFS dashboard

The HDFS dashboard helps operators better understand how HDFS is being used and which users and jobs are consuming the most resources within the file system.

This dashboard includes the following paragraphs:

  • File Size Distribution

  • Users with Maximum Small Files

  • Users with Maximum HDFS Utilization

  • HDFS File Size Trend

  • HDFS Utilization Trend

Most of these paragraphs have titles that are self-explanatory. A few of them are described below to provide more context:

Paragraph Description
File Size Distribution

For any large multi-tenant cluster, it’s important to identify the proliferation of small files and keep it in check. The paragraph displays a pie chart showing the relative distribution of files by size, categorized as Tiny (0-10K), Mini (10K-1M), Small (1M-30M), Medium (30M-128M), and Large (128M+).

The goal is to show how dominant each file size category is within HDFS. If there are many small files, the next paragraph, Users with Maximum Small Files, shows who is creating them.
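The bucketing behind such a pie chart can be sketched in a few lines of Python. This is only an illustration, not how SmartSense computes the chart: the function names are hypothetical, K is taken as 1024 bytes, the range between 1M and 30M is labeled Small, and a size that falls exactly on a boundary is placed in the larger bucket. All of these are assumptions.

```python
from bisect import bisect_right
from collections import Counter

KB = 1024          # assumption: K means 1024 bytes
MB = 1024 * KB

# Assumed boundaries between adjacent buckets, in bytes.
BOUNDS = [10 * KB, 1 * MB, 30 * MB, 128 * MB]
# One more label than boundaries; "Small" for 1M-30M is an assumed name.
LABELS = ["Tiny", "Mini", "Small", "Medium", "Large"]


def categorize(size_bytes):
    """Return the bucket label for a single file size in bytes."""
    if size_bytes < 0:
        raise ValueError("file size cannot be negative")
    # bisect_right counts the boundaries <= size, which is the bucket index.
    return LABELS[bisect_right(BOUNDS, size_bytes)]


def distribution(sizes):
    """Return {label: fraction of files} for an iterable of file sizes."""
    counts = Counter(categorize(s) for s in sizes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total for label in LABELS}
```

Given a list of file sizes collected from the cluster by whatever means is available, `distribution(sizes)` returns the share of files in each bucket, which is the quantity a pie chart of this kind would plot.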

Users with Maximum Small Files

Understanding how prevalent files of specific sizes are is helpful, but the next step is understanding who is creating them. This paragraph shows which users are responsible for the majority of small files within HDFS.
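As a rough sketch of the same idea outside the dashboard, the following Python counts small files per owner and ranks the owners. The function name, the `(owner, size)` input shape, and the 30M cutoff for "small" are all assumptions made for illustration; the dashboard's actual threshold is not stated here.

```python
from collections import Counter

# Assumed cutoff for a "small" file, in bytes (30M with M = 1024 * 1024).
SMALL_FILE_LIMIT = 30 * 1024 * 1024


def top_small_file_owners(files, limit=SMALL_FILE_LIMIT, n=10):
    """Rank owners by how many files below `limit` bytes they own.

    `files` is an iterable of (owner, size_bytes) pairs.
    Returns a list of (owner, small_file_count), largest count first.
    """
    counts = Counter(owner for owner, size in files if size < limit)
    return counts.most_common(n)
```

For example, passing `[("alice", 100), ("bob", 200), ("alice", 5000)]` ranks `alice` first with two small files and `bob` second with one.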