Hedged Reads

Hadoop 2.4 introduced a new feature called hedged reads. If a read from a block is slow, the HDFS client starts up another parallel, 'hedged' read against a different block replica. The result of whichever read returns first is used, and the outstanding read is cancelled. This feature helps in situations where a read occasionally takes a long time rather than when there is a systemic problem. Hedged reads can be enabled for HBase when the HFiles are stored in HDFS. This feature is disabled by default.

Enabling Hedged Reads for HBase

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

  1. Go to the HBase service.
  2. Click the Configuration tab.
  3. Select Scope > HBASE-1 (Service-Wide).
  4. Select Category > Performance.
  5. Configure the HDFS Hedged Read Threadpool Size and HDFS Hedged Read Delay Threshold properties. The descriptions for each of these properties on the configuration pages provide more information.
  6. Enter a Reason for change, and then click Save Changes to commit the changes.

Monitoring the Performance of Hedged Reads

You can monitor the performance of hedged reads using the following metrics emitted by Hadoop when hedged reads are enabled.
  • hedgedReadOps - the number of hedged reads that have occurred
  • hedgeReadOpsWin - the number of times the hedged read returned faster than the original read