Tuning the HDFS Block Size for DSSD Mode

When you enable DSSD Mode, the HDFS Block Size parameter is set to 512 MB (the default value for non-DSSD mode is 128 MB). This value is optimal for most deployments and most types of workloads, but you can change it to improve performance or to maximize the available storage capacity in the DSSD D5.

HBase workloads can use this block size efficiently. Some Impala workloads, however, may run more slowly: with the larger block size, the same data is split into fewer blocks, so it is less distributed and benefits less from parallel processing.
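To illustrate why a larger block size leaves less room for parallel processing, the following minimal Python sketch counts the blocks a single file occupies at each setting. The 10 GiB file size and the function name block_count are hypothetical and chosen only for illustration; the sketch assumes that "128 MB" and "512 MB" mean 128 and 512 mebibytes, as in standard HDFS defaults.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def block_count(file_size_bytes: int, block_size_bytes: int) -> int:
    """Return how many HDFS blocks a file of the given size occupies."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    # Integer ceiling division: a partially filled last block still counts.
    return -(-file_size_bytes // block_size_bytes)


if __name__ == "__main__":
    file_size = 10 * 1024 * MIB  # hypothetical 10 GiB data file
    for label, block_size in (("128 MB", 128 * MIB), ("512 MB", 512 * MIB)):
        print(f"{label} blocks: {block_count(file_size, block_size)}")
```

For this hypothetical file, the 128 MB setting yields 80 blocks and the 512 MB setting yields 20, so a scan over the file has a quarter as many blocks to process in parallel.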

A DSSD D5 can store a maximum of 6.7 million blocks, regardless of block size. If your workload consistently stores less data per block than the average block size (the capacity of the DSSD D5 divided by 6.7 million), your cluster can run out of objects before the appliance is full, and therefore cannot store an amount of data equal to the full capacity of the appliance.

As the amount of data per block decreases, the available capacity of the DSSD D5 also decreases because the blocks are used less efficiently. If you decrease the value of the HDFS Block Size parameter, you increase the maximum number of blocks that you can store in the DSSD D5 appliance.
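The block-limit arithmetic described above can be sketched in Python as follows. This is a rough model only: the 100 TB capacity is a hypothetical figure (substitute the capacity of your own appliance), the function names are invented for illustration, and the model assumes that every stored block consumes exactly one of the 6.7 million available objects.

```python
MAX_BLOCKS = 6_700_000  # block limit stated in this section


def average_block_size(capacity_bytes: int) -> float:
    """Data per block below which the block limit is reached before capacity."""
    return capacity_bytes / MAX_BLOCKS


def storable_bytes(capacity_bytes: int, mean_bytes_per_block: int) -> int:
    """Estimate how much data fits, given the mean amount of data per block."""
    # Whichever limit is hit first wins: raw capacity or the block count.
    return min(capacity_bytes, MAX_BLOCKS * mean_bytes_per_block)


if __name__ == "__main__":
    capacity = 100 * 10**12  # hypothetical 100 TB appliance (assumption)
    print(f"average block size: {average_block_size(capacity) / 10**6:.1f} MB")
    for mean in (8 * 10**6, 128 * 1024 * 1024):
        stored = storable_bytes(capacity, mean)
        print(f"mean {mean} bytes/block -> {stored / 10**12:.1f} TB storable")
```

With these hypothetical numbers, the average block size works out to roughly 14.9 MB. A workload averaging 8 MB of data per block would exhaust the block limit after about 53.6 TB, while one averaging 128 MB per block would be limited by capacity instead.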