When slow DataNode detection is enabled, DataNodes collect latency statistics on
their peers during write pipelines, and use periodic outlier detection to determine slow
peers. The NameNode aggregates reports from all DataNodes and flags potentially slow nodes.
Slow DataNode detection is disabled by default.
-
To enable slow DataNode detection, set the value of the
dfs.datanode.peer.stats.enabled
property to
true
in hdfs-site.xml
.
<property>
<name>dfs.datanode.peer.stats.enabled</name>
<value>true</value>
</property>
-
Access the slow DataNode statistics either from the NameNode JMX page at
http://<namenode_host>:50070/jmx
or from the DataNode
JMX page at http://<datanode_host>:50075/jmx
.
In the following JMX output example, the time unit is milliseconds, and the
peer DataNodes are healthy because the latencies are in
milliseconds:
"name" : "Hadoop:service=DataNode,name=DataNodeInfo",
"modelerType" : "org.apache.hadoop.hdfs.server.datanode.DataNode", "SendPacketDownstreamAvgInfo" : "{
\"[192.168.7.202:50075]RollingAvgTime\" : 1.4476967370441458,
\"[192.168.7.201:50075]RollingAvgTime\" : 1.5569170444798432
}"