Advanced configuration for write-heavy workloads
HBase includes several advanced configuration parameters for adjusting the number of threads available to service flushes and compactions in the presence of write-heavy workloads. Tuning these parameters incorrectly can severely degrade performance and is not necessary for most HBase clusters. If you use Cloudera Manager, configure these options using the HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml
hbase.hstore.flusher.count
- The number of threads available to flush writes from memory to disk. Never increase
hbase.hstore.flusher.count
to more of 50% of the number of disks available to HBase. For example, if you have 8 solid-state drives (SSDs),hbase.hstore.flusher.count
should never exceed 4. This allows scanners and compactions to proceed even in the presence of very high writes. hbase.regionserver.thread.compaction.large
andhbase.regionserver.thread.compaction.small
- The number of threads available to handle small and large compactions, respectively. Never increase either of these options to more than 50% of the number of disks available to HBase.
In addition to the above, if you use compression on some column families, more CPU will be used when flushing these column families to disk during flushes or compaction. The impact on CPU usage depends on the size of the flush or the amount of data to be decompressed and compressed during compactions.