HBase region inconsistencies

This guide provides an overview of common HBase region inconsistencies, their symptoms, causes, and resolution steps using the HBCK2 tool.

Before attempting to resolve any inconsistencies, always run the following command to generate the list of issues:
hbase hbck -details > /tmp/hbck.txt

Also inspect the HBase Master logs for startup or region assignment issues, especially entries containing unknown_server. These errors typically arise when a RegionServer restarts and the start code recorded for it no longer matches the running process. The following is an example.

Condition

unknown_server=server3.customer.com,16020,1632159024921/TABLE.NAME,...

Cause

This issue typically occurs when a RegionServer is restarted but the Master has not yet registered its new start code. As a result, the Master treats the newly started RegionServer as unknown.

Remedy

To fix the issue, run the following command:
hbase hbck -j <path-to-hbck2-jar> scheduleRecoveries '<HOSTNAME1>,<PORT>,<STARTCODE1>' '<HOSTNAME2>,<PORT>,<STARTCODE2>' ...
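The server names passed to scheduleRecoveries can be collected from a log file rather than copied by hand. The following is a minimal sketch, assuming the entries follow the unknown_server=<HOSTNAME>,<PORT>,<STARTCODE>/... format shown in the Condition example. The function name extract_unknown_servers and the log path in the usage comment are hypothetical.

```shell
# Hypothetical helper: print each distinct server name (host,port,startcode)
# that follows "unknown_server=" in the given file, one per line.
# Assumes the server name ends at the first "/" or whitespace, as in the
# example entry in this guide.
extract_unknown_servers() {
  grep -o 'unknown_server=[^/[:space:]]*' "$1" \
    | sed 's/^unknown_server=//' \
    | sort -u
}

# Usage (the path is a placeholder for your Master log file):
#   extract_unknown_servers /path/to/hbase-master.log
```

Each line of output is one server name in the form expected by scheduleRecoveries. Review the list before passing any of the names to the command.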

Before executing recovery commands, always verify the state of the affected regions and servers in the following:

  • hbase:meta table
  • HDFS paths
  • Master logs

After each fix, rerun the following command and confirm that the reported inconsistencies are resolved:

hbase hbck -details > /tmp/hbck.txt
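
To compare runs quickly, the inconsistency count can be read from the saved report. The following is a minimal sketch, assuming the report contains a summary line of the form "N inconsistencies detected."; verify this against the output of your HBase version. The function name hbck_inconsistency_count is hypothetical.

```shell
# Hypothetical helper: print the inconsistency count from a saved hbck report.
# Assumes a summary line such as "2 inconsistencies detected." is present.
# Returns a non-zero status if no such line is found.
hbck_inconsistency_count() {
  count=$(sed -n 's/^\([0-9][0-9]*\) inconsistencies detected.*$/\1/p' "$1" | tail -n 1)
  [ -n "$count" ] || return 1
  printf '%s\n' "$count"
}

# Usage:
#   hbck_inconsistency_count /tmp/hbck.txt
```

A count that drops to 0 after a fix suggests that the report no longer lists inconsistencies. A non-zero return status means the summary line was not found and the report should be inspected manually.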