Advanced configuration for write-heavy workloads
HBase includes several advanced configuration parameters for adjusting the number of threads available to service flushes and compactions in the presence of write-heavy workloads. Tuning these parameters incorrectly can severely degrade performance, and tuning them is not necessary for most HBase clusters. If you use Cloudera Manager, configure these options using the HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml.
- hbase.hstore.flusher.count: The number of threads available to flush writes from memory to disk. Never increase hbase.hstore.flusher.count to more than 50% of the number of disks available to HBase. For example, if you have 8 solid-state drives (SSDs), hbase.hstore.flusher.count should never exceed 4. This allows scanners and compactions to proceed even in the presence of very high writes.
- hbase.regionserver.thread.compaction.small and hbase.regionserver.thread.compaction.large: The number of threads available to handle small and large compactions, respectively. Never increase either of these options to more than 50% of the number of disks available to HBase. hbase.regionserver.thread.compaction.small should be greater than or equal to hbase.regionserver.thread.compaction.large, since the large compaction threads do more intense work and will be in use longer for a given operation.
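As a rough illustration of these limits, the following hbase-site.xml fragment sketches one possible setting for a RegionServer with 8 disks available to HBase, matching the 8-SSD example above. The specific values are assumptions chosen only to satisfy the stated rules (no value above 50% of the disk count, and the small compaction thread count at least as large as the large one). They are not recommendations, and suitable values depend on your hardware and workload.

```xml
<!-- Illustrative values only, assuming 8 disks available to HBase. -->

<!-- Flush threads: at most 50% of the disk count, so no more than 4 here. -->
<property>
  <name>hbase.hstore.flusher.count</name>
  <value>4</value>
</property>

<!-- Small compaction threads: at most 4 here, and greater than or equal
     to the large compaction thread count. -->
<property>
  <name>hbase.regionserver.thread.compaction.small</name>
  <value>4</value>
</property>

<!-- Large compaction threads: at most 4 here, kept lower than the small
     count because each large compaction occupies its thread longer. -->
<property>
  <name>hbase.regionserver.thread.compaction.large</name>
  <value>2</value>
</property>
```

If you use Cloudera Manager, property elements like these would be entered in the HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml rather than edited into the file directly.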
In addition to the above, if you use compression on some column families, writing those column families to disk during flushes or compactions uses more CPU. The impact on CPU usage depends on the size of the flush or the amount of data to be decompressed and compressed during compactions.