Hadoop 2.4 introduced a feature called hedged reads. If a read from a block is slow, the HDFS client starts a second, parallel "hedged" read against a different replica of the same block. The result of whichever read returns first is used, and the other read is cancelled. This feature helps when individual reads are occasionally slow; it does not help when slowness is caused by a systemic problem. Hedged reads can be enabled for HBase when the HFiles are stored in HDFS. This feature is disabled by default.
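The behavior described above can be illustrated with a small, self-contained simulation. This is only a sketch of the idea, not the HDFS client implementation: `hedged_read`, `fake_read`, the replica names, and the latencies are all hypothetical, and a thread pool stands in for the client's hedged read pool.

```python
import concurrent.futures as cf
import time

# Hypothetical per-replica latencies in seconds (the first replica is "slow").
LATENCY = {"replica-a": 0.5, "replica-b": 0.01}

def fake_read(replica):
    """Stand-in for a block read; sleeps to simulate replica latency."""
    time.sleep(LATENCY[replica])
    return f"data from {replica}"

def hedged_read(replicas, read_fn, threshold_s, pool):
    """Read from replicas[0]; if it has not finished within threshold_s,
    start a second read against replicas[1] and use whichever finishes first.
    Returns (data, hedged), where hedged says whether a second read was started."""
    primary = pool.submit(read_fn, replicas[0])
    done, _ = cf.wait([primary], timeout=threshold_s)
    if done:
        return primary.result(), False  # fast enough, no hedge needed
    hedge = pool.submit(read_fn, replicas[1])
    done, pending = cf.wait([primary, hedge], return_when=cf.FIRST_COMPLETED)
    for future in pending:
        future.cancel()  # best effort: an already-running thread is not interrupted
    return next(iter(done)).result(), True

with cf.ThreadPoolExecutor(max_workers=2) as pool:
    data, hedged = hedged_read(["replica-a", "replica-b"], fake_read, 0.05, pool)

print(data, hedged)
```

In this sketch the threshold plays the role of the delay threshold and `max_workers` the role of the thread pool size: a lower threshold starts hedged reads more readily, at the cost of extra read load on the replicas.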
Enabling Hedged Reads for HBase
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
- Go to the HBase service.
- Click the Configuration tab.
- Select .
- Select .
- Configure the HDFS Hedged Read Threadpool Size and HDFS Hedged Read Delay Threshold properties. The descriptions for each of these properties on the configuration pages provide more information.
- Enter a Reason for change, and then click Save Changes to commit the changes.
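Outside the configuration pages, the underlying HDFS client settings are `dfs.client.hedged.read.threadpool.size` and `dfs.client.hedged.read.threshold.millis`. It is an assumption here that the two properties named in the steps above map onto these settings. The sketch below renders them as a Hadoop-style XML fragment; the values 20 and 10 are illustrative only, not recommendations, and `render_properties` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

def render_properties(settings):
    """Render a dict of property names and values as a Hadoop-style
    <configuration> XML fragment."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    ET.indent(root)  # pretty-print; requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")

# Illustrative values: a pool of 20 threads, and a hedged read
# started once the original read has been outstanding for 10 ms.
xml_text = render_properties({
    "dfs.client.hedged.read.threadpool.size": 20,
    "dfs.client.hedged.read.threshold.millis": 10,
})
print(xml_text)
```

A thread pool size of 0 leaves hedged reads disabled, which matches the default state described earlier.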
Monitoring the Performance of Hedged Reads
You can monitor the performance of hedged reads using the following metrics emitted by Hadoop when hedged reads are enabled.
- hedgedReadOps - the number of hedged reads that have occurred
- hedgedReadOpsWin - the number of times the hedged read returned faster than the original read
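One way to read the two counters above together is as a win ratio: the fraction of hedged reads that actually beat the original read. The helper below is a hypothetical sketch, and the sample counter values are made up for illustration; how the counters are collected from the cluster is left out.

```python
def hedged_read_win_ratio(hedged_ops, hedged_wins):
    """Fraction of hedged reads that returned before the original read.
    Returns 0.0 when no hedged reads have occurred."""
    if hedged_ops == 0:
        return 0.0
    return hedged_wins / hedged_ops

# Made-up sample: 200 hedged reads started, 150 of them finished first.
ratio = hedged_read_win_ratio(200, 150)
print(f"win ratio: {ratio:.2f}")  # prints "win ratio: 0.75"
```

As a rough interpretation, a high ratio suggests the hedged reads are paying off, while a low ratio with many hedged reads suggests the delay threshold may be triggering them too early.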