When running HBase, you must configure the dfs.datanode.max.transfer.threads property, which
specifies the maximum number of files that a DataNode can serve at any one time.
A Hadoop HDFS DataNode has an upper bound on the number of files that it can serve at
any one time, controlled by the dfs.datanode.max.transfer.threads property
(the property name is spelled in the code exactly as shown here). Before loading data,
make sure you have set dfs.datanode.max.transfer.threads in the
conf/hdfs-site.xml file (by default found at
/etc/hadoop/conf/hdfs-site.xml) to at least 4096, as
shown here:
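The entry follows the standard Hadoop property form (a minimal sketch; raise the value
beyond 4096 if your workload demands it):

  <property>
    <name>dfs.datanode.max.transfer.threads</name>
    <value>4096</value>
  </property>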
Restart HDFS after changing the value for dfs.datanode.max.transfer.threads.
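For example, with a plain Apache Hadoop tarball installation (a sketch: the sbin scripts
and HADOOP_HOME assume a stock layout; managed distributions provide their own
service-restart tooling):

  # Stop and start the whole HDFS cluster from the NameNode host.
  $HADOOP_HOME/sbin/stop-dfs.sh
  $HADOOP_HOME/sbin/start-dfs.sh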
If the value is set too low, strange failures can occur: the DataNode logs record an error
about exceeding the number of transfer threads, and other error messages about
missing blocks are also logged, such as:
06/12/14 20:10:31 INFO hdfs.DFSClient: Could not obtain block blk_XXXXXXXXXXXXXXXXXXXXXX_YYYYYYYY from any node:
java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...