Troubleshooting HBase
The Cloudera HBase packages have been configured to place logs in /var/log/hbase. Cloudera recommends tailing the .log files in this directory when you start HBase to check for any error messages or failures.
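As a complement to tailing, a one-shot scan can surface problem lines that have already scrolled past. The sketch below is a minimal illustration, not a Cloudera tool: the function name `find_problem_lines` is hypothetical, and it assumes that the files of interest end in `.log` and that failures are logged with the log4j level names `ERROR` or `FATAL`.

```python
from pathlib import Path

# log4j level names that usually indicate a startup problem worth reading.
LEVELS = ("ERROR", "FATAL")


def find_problem_lines(log_dir="/var/log/hbase", levels=LEVELS):
    """Return (file name, line number, text) for every line that mentions a level."""
    hits = []
    # Only *.log files are inspected; other files in the directory are ignored.
    for log_file in sorted(Path(log_dir).glob("*.log")):
        with open(log_file, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(level in line for level in levels):
                    hits.append((log_file.name, number, line.rstrip("\n")))
    return hits
```

Calling `find_problem_lines()` with no arguments scans /var/log/hbase; pass another directory if your logs are stored elsewhere. The match is a plain substring test, so treat the result as a starting point for reading the logs rather than a complete diagnosis.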
Table Creation Fails after Installing LZO
If you install LZO after starting the Region Server, you cannot create a table with LZO compression until you restart the Region Server.
Why this happens
When the Region Server starts, it runs CompressionTest and caches the results. When you try to create a table with a given form of compression, the Region Server refers to those cached results. Because LZO was installed after the Region Server started, the cached results predate LZO and cause table creation to fail.
What to do
Restart the Region Server. Now table creation with LZO will succeed.
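The caching behavior described in this section can be pictured with a toy model. This is only an illustration of the idea, not HBase code: the class `ToyRegionServer`, its methods, and the codec names used as set members are all hypothetical.

```python
class ToyRegionServer:
    """Toy model only (not HBase code): codec availability is probed once, at start."""

    def __init__(self, installed_codecs):
        # Copy what is installed right now; this copy plays the role of the
        # cached CompressionTest results and is never refreshed.
        self._cached = frozenset(installed_codecs)

    def can_create(self, compression):
        # Table creation consults the cache, not the live system.
        return compression in self._cached


installed = {"gz"}                       # codecs present when the server starts
server = ToyRegionServer(installed)

installed.add("lzo")                     # LZO is installed afterwards
stale_ok = server.can_create("lzo")      # still judged by the old cache

restarted = ToyRegionServer(installed)   # a restart probes again
fresh_ok = restarted.can_create("lzo")
```

In this model `stale_ok` is `False` and `fresh_ok` is `True`, which mirrors why only a restart makes the newly installed codec usable.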
Thrift Server Crashes after Receiving Invalid Data
The Thrift server may crash if it receives a large amount of invalid data, due to a buffer overrun.
Why this happens
The Thrift server allocates memory to check the validity of data it receives. If it receives a large amount of invalid data, it may need to allocate more memory than is available. This is due to a limitation in the Thrift library itself.
What to do
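To reduce the risk of these crashes, enable the framed transport and the compact protocol in hbase-site.xml, and limit the maximum frame size. Thrift clients must use the same transport and protocol as the server, so enabling these options may require changes to client code.

<property>
  <name>hbase.regionserver.thrift.framed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.regionserver.thrift.framed.max_frame_size_in_mb</name>
  <value>2</value>
</property>
<property>
  <name>hbase.regionserver.thrift.compact</name>
  <value>true</value>
</property>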
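To confirm that a configuration file actually carries the three Thrift properties listed above, a small check along the following lines may help. It is a sketch under stated assumptions: the function names are hypothetical, the file is assumed to use the usual `<property>`/`<name>`/`<value>` layout, and treating a missing frame-size limit as a problem is this sketch's own policy.

```python
import xml.etree.ElementTree as ET

# Properties taken from the snippet above, with the values it uses.
REQUIRED = {
    "hbase.regionserver.thrift.framed": "true",
    "hbase.regionserver.thrift.compact": "true",
}
FRAME_SIZE = "hbase.regionserver.thrift.framed.max_frame_size_in_mb"


def read_properties(xml_text):
    """Collect name/value pairs from every <property> element in the document."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def thrift_hardening_problems(xml_text):
    """Return human-readable problems; an empty list means all three are set."""
    props = read_properties(xml_text)
    problems = []
    for name, wanted in REQUIRED.items():
        if props.get(name, "").lower() != wanted:
            problems.append(f"{name} is not set to {wanted}")
    if FRAME_SIZE not in props:
        problems.append(f"{FRAME_SIZE} is not set")
    return problems
```

Pass the text of your hbase-site.xml to `thrift_hardening_problems`. The check only reads the file; it does not tell you whether the running Thrift server has picked up the settings.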
HBase Is Using More Disk Space Than Expected
Location | Purpose | Troubleshooting Notes |
---|---|---|
/hbase/.snapshots | Contains one subdirectory per snapshot. | To list snapshots, use the HBase Shell command list_snapshots. To remove a snapshot, use delete_snapshot. |
/hbase/.archive | Contains data that would otherwise have been deleted (either because it was explicitly deleted or because it expired due to TTL or version limits on the table) but that is required to restore from an existing snapshot. | To free up space taken up by excessive archives, delete the snapshots that refer to them. Snapshots never expire, so data referred to by them is kept until the snapshot is removed. Do not remove anything from /hbase/.archive manually, or you will corrupt your snapshots. |
/hbase/.logs | Contains HBase WAL files that are required to recover regions in the event of a RegionServer failure. | WALs are removed when their contents are verified to have been written to StoreFiles. Do not remove them manually. If the size of any subdirectory of /hbase/.logs/ is growing, examine the HBase server logs to find out why WALs are not being processed correctly. |
/hbase/logs/.oldWALs | Contains HBase WAL files that have already been written to disk. An HBase maintenance thread removes them periodically based on a TTL. | To tune how long a WAL stays in .oldWALs before it is removed, configure the hbase.master.logcleaner.ttl property, which is specified in milliseconds (for example, 3600000 for 1 hour). Check hbase-default.xml for the default in your release; upstream HBase documents it as 600000 milliseconds (10 minutes). |
/hbase/.logs/.corrupt | Contains corrupted HBase WAL files. | Do not remove corrupt WALs manually. If the size of any subdirectory of /hbase/.logs/ is growing, examine the HBase server logs to find out why WALs are not being processed correctly. |
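Because hbase.master.logcleaner.ttl is expressed in milliseconds, it is easy to misread a value by a factor of 60. The helper below is a minimal sketch for sanity-checking a value before you put it in the configuration; the function names are hypothetical and nothing here talks to HBase.

```python
def describe_millis(ms):
    """Render a non-negative millisecond count in hours, minutes, and seconds."""
    total_seconds, millis = divmod(ms, 1000)
    total_minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(total_minutes, 60)
    parts = []
    if hours:
        parts.append(f"{hours} h")
    if minutes:
        parts.append(f"{minutes} min")
    # Show seconds when they are non-zero, or when nothing else was shown.
    if seconds or millis or not parts:
        parts.append(f"{seconds + millis / 1000:g} s")
    return " ".join(parts)


def hours_to_millis(hours):
    """Convert a retention period in hours to the millisecond value a TTL expects."""
    return int(hours * 60 * 60 * 1000)
```

For example, `describe_millis(60000)` returns `"1 min"`, while `hours_to_millis(1)` returns `3600000`, the value to use if you want old WALs kept for one hour.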