Scaling Namespaces and Optimizing Data Storage
Also available as:
PDF
loading table of contents...

Enable disk IO statistics

Disk IO statistics are disabled by default. To enable disk IO statistics, you must set the file IO sampling percentage to a non-zero value in the hdfs-site.xml file.

  1. Set the dfs.datanode.fileio.profiling.sampling.percentage property to a non-zero value in hdfs-site.xml.
    
    <property>
      <name>dfs.datanode.fileio.profiling.sampling.fraction</name>
      <value>25</value>
    </property>
    Note
    Note
    Sampling disk IO might have a minor impact on cluster performance.
  2. Access the disk IO statistics from the NameNode JMX page at http://<namenode_host>:50070/jmx.
    In the following JMX output example, the time unit is milliseconds, and the disk is healthy because the IO latencies are low:
    
        "name" : "Hadoop:service=DataNode,name=DataNodeVolume-/data/disk2/dfs/data/",
        "modelerType" : "DataNodeVolume-/data/disk2/dfs/data/",
        "tag.Context" : "dfs",
        "tag.Hostname" : "n001.hdfs.example.com",
        "TotalMetadataOperations" : 67,
        "MetadataOperationRateAvgTime" : 0.08955223880597014,
      ...
        "WriteIoRateNumOps" : 7321,
        "WriteIoRateAvgTime" : 0.050812730501297636