Properties for configuring short-circuit local reads on HDFS
To configure short-circuit local reads, you must add various properties to the hdfs-site.xml file. Short-circuit local reads must be configured on both the DataNode and the client.
| Property Name | Property Value | Description |
|---|---|---|
| dfs.client.read.shortcircuit | true | Set this to true to enable short-circuit local reads. |
| dfs.domain.socket.path | /var/lib/hadoop-hdfs/dn_socket | The path to the domain socket. Short-circuit reads make use of a UNIX domain socket, a special path in the file system that allows the client and the DataNodes to communicate. You will need to set a path to this socket. The DataNode needs to be able to create this path. On the other hand, it should not be possible for any user except the hdfs user or root to create this path. For this reason, paths under /var/run or /var/lib are often used. |
| dfs.client.domain.socket.data.traffic | false | This property controls whether or not normal data traffic will be passed through the UNIX domain socket. It is recommended that you set the value of this property to false. |
| dfs.client.use.legacy.blockreader.local | false | Setting this value to false specifies that the new, non-legacy implementation of the short-circuit reader is used. Setting this value to true specifies that the legacy short-circuit reader implementation is used. |
| dfs.datanode.hdfs-blocks-metadata.enabled | true | Boolean which enables back-end DataNode-side support for the experimental DistributedFileSystem#getFileBlockStorageLocations API. |
| dfs.client.file-block-storage-locations.timeout | 60 | Timeout (in seconds) for the parallel RPCs made in DistributedFileSystem#getFileBlockStorageLocations(). This property is deprecated but is still supported for backward compatibility. |
| dfs.client.file-block-storage-locations.timeout.millis | 60000 | Timeout (in milliseconds) for the parallel RPCs made in DistributedFileSystem#getFileBlockStorageLocations(). This property replaces dfs.client.file-block-storage-locations.timeout and offers a finer level of granularity. |
| dfs.client.read.shortcircuit.skip.checksum | false | If this configuration parameter is set, short-circuit local reads will skip checksums. This is normally not recommended, but it may be useful for special setups. You might consider using this if you are doing your own checksumming outside of HDFS. |
| dfs.client.read.shortcircuit.streams.cache.size | 256 | The DFSClient maintains a cache of recently opened file descriptors. This parameter controls the size of that cache. Setting this higher will use more file descriptors, but potentially provide better performance on workloads involving many seeks. |
| dfs.client.read.shortcircuit.streams.cache.expiry.ms | 300000 | This controls the minimum amount of time (in milliseconds) file descriptors need to sit in the client cache context before they can be closed for being inactive for too long. |
The XML for these entries:
```xml
<configuration>
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.domain.socket.path</name>
    <value>/var/lib/hadoop-hdfs/dn_socket</value>
  </property>
  <property>
    <name>dfs.client.domain.socket.data.traffic</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.client.use.legacy.blockreader.local</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.file-block-storage-locations.timeout.millis</name>
    <value>60000</value>
  </property>
  <property>
    <name>dfs.client.read.shortcircuit.skip.checksum</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.client.read.shortcircuit.streams.cache.size</name>
    <value>256</value>
  </property>
  <property>
    <name>dfs.client.read.shortcircuit.streams.cache.expiry.ms</name>
    <value>300000</value>
  </property>
</configuration>
```
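To confirm that a client is actually using short-circuit reads once these properties are deployed, one option is to inspect the per-stream read statistics exposed by the HDFS client through HdfsDataInputStream.getReadStatistics(). The sketch below is a minimal, hypothetical example rather than part of this configuration reference: it assumes the DataNode-side settings above are already in place, takes an arbitrary HDFS file path as its only argument, and sets the two client-side keys programmatically only to keep the sketch self-contained (normally they would come from hdfs-site.xml on the client classpath).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.client.HdfsDataInputStream;

public class ShortCircuitReadCheck {
    public static void main(String[] args) throws Exception {
        // Normally these client-side values are picked up from hdfs-site.xml;
        // they are repeated here only so the sketch stands on its own.
        Configuration conf = new Configuration();
        conf.setBoolean("dfs.client.read.shortcircuit", true);
        conf.set("dfs.domain.socket.path", "/var/lib/hadoop-hdfs/dn_socket");

        Path file = new Path(args[0]); // any HDFS file with a replica on the local DataNode
        try (FileSystem fs = FileSystem.get(conf);
             FSDataInputStream in = fs.open(file)) {

            // Drain the file so the read statistics are populated.
            byte[] buf = new byte[128 * 1024];
            while (in.read(buf) != -1) {
                // discard the data; we only care about how it was read
            }

            if (in instanceof HdfsDataInputStream) {
                HdfsDataInputStream hin = (HdfsDataInputStream) in;
                System.out.println("Total bytes read:         "
                        + hin.getReadStatistics().getTotalBytesRead());
                System.out.println("Short-circuit bytes read: "
                        + hin.getReadStatistics().getTotalShortCircuitBytesRead());
            }
        }
    }
}
```

If short-circuit reads are working, the second counter should roughly match the first for a file whose blocks are stored on the local DataNode; if it stays at zero, the client is falling back to reads through the DataNode, most commonly because the domain socket path does not exist or its permissions are wrong.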