Configuration Settings for HBase

This section contains information on configuring the Linux host and HDFS for HBase.

Using DNS with HBase

HBase uses the local hostname to report its IP address. Both forward and reverse DNS resolving should work. If your server has multiple interfaces, HBase uses the interface that the primary hostname resolves to. If this is insufficient, you can set hbase.regionserver.dns.interface in the hbase-site.xml file to indicate the primary interface. To work properly, this setting requires that your cluster configuration is consistent and every host has the same network interface configuration. As an alternative, you can set hbase.regionserver.dns.nameserver in the hbase-site.xml file to use a different DNS name server than the system-wide default.

Using the Network Time Protocol (NTP) with HBase

The clocks on cluster members must be synchronized for your cluster to function correctly. Some skew is tolerable, but excessive skew could generate odd behaviors. Run NTP or another clock synchronization mechanism on your cluster. If you experience problems querying data or unusual cluster operations, verify the system time. For more information about NTP, see the NTP website.

Setting User Limits for HBase

Because HBase is a database, it opens many files at the same time. The default setting of 1024 for the maximum number of open files on most Unix-like systems is insufficient. Any significant amount of loading will result in failures and cause error message such as java.io.IOException...(Too many open files) to be logged in the HBase or HDFS log files. For more information about this issue, see the Apache HBase Book. You may also notice errors such as:

2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Exception increateBlockOutputStream java.io.EOFException
2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6935524980745310745_1391901

Another setting you should configure is the number of processes a user is permitted to start. The default number of processes is typically 1024. Consider raising this value if you experience OutOfMemoryException errors.

Configuring ulimit for HBase Using Cloudera Manager

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

  1. Go to the HBase service.
  2. Click the Configuration tab.
  3. Select Scope > Master or Scope > RegionServer.
  4. Locate the Maximum Process File Descriptors property or search for it by typing its name in the Search box.
  5. Edit the property value.

    To apply this configuration property to other role groups as needed, edit the value for the appropriate role group. See Modifying Configuration Properties Using Cloudera Manager.

  6. Enter a Reason for change, and then click Save Changes to commit the changes.
  7. Restart the role.
  8. Restart the service.

Configuring ulimit for HBase Using the Command Line

Cloudera recommends increasing the maximum number of file handles to more than 10,000. Increasing the file handles for the user running the HBase process is an operating system configuration, not an HBase configuration. A common mistake is to increase the number of file handles for a particular user when HBase is running as a different user. HBase prints the ulimit it is using on the first line in the logs. Make sure that it is correct.

To change the maximum number of open files for a user, use the ulimit -n command while logged in as that user.

To set the maximum number of processes a user can start, use the ulimit -u command. You can also use the ulimit command to set many other limits. For more information, see the online documentation for your operating system, or the output of the man ulimit command.

To make the changes persistent, add the command to the user's Bash initialization file (typically ~/.bash_profile or ~/.bashrc ). Alternatively, you can configure the settings in the Pluggable Authentication Module (PAM) configuration files if your operating system uses PAM and includes the pam_limits.so shared library.

Configuring ulimit using Pluggable Authentication Modules Using the Command Line

If you are using ulimit, you must make the following configuration changes:
  1. In the /etc/security/limits.conf file, add the following lines, adjusting the values as appropriate. This assumes that your HDFS user is called hdfs and your HBase user is called hbase.
hdfs  -       nofile  32768
hdfs  -       nproc   2048
hbase -       nofile  32768
hbase -       nproc   2048

To apply the changes in /etc/security/limits.conf on Ubuntu and Debian systems, add the following line in the /etc/pam.d/common-session file:

session required  pam_limits.so

For more information on the ulimit command or per-user operating system limits, refer to the documentation for your operating system.

Using dfs.datanode.max.transfer.threads with HBase

A Hadoop HDFS DataNode has an upper bound on the number of files that it can serve at any one time. The upper bound is controlled by the dfs.datanode.max.transfer.threads property (the property is spelled in the code exactly as shown here). Before loading, make sure you have configured the value for dfs.datanode.max.transfer.threads in the conf/hdfs-site.xml file (by default found in /etc/hadoop/conf/hdfs-site.xml) to at least 4096 as shown below:

<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>4096</value>
</property>

Restart HDFS after changing the value for dfs.datanode.max.transfer.threads. If the value is not set to an appropriate value, strange failures can occur and an error message about exceeding the number of transfer threads will be added to the DataNode logs. Other error messages about missing blocks are also logged, such as:

06/12/14 20:10:31 INFO hdfs.DFSClient: Could not obtain block blk_XXXXXXXXXXXXXXXXXXXXXX_YYYYYYYY from any node: 
java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry... 

Configuring BucketCache in HBase

The default BlockCache implementation in HBase is CombinedBlockCache, and the default off-heap BlockCache is BucketCache. SlabCache is now deprecated. See Configuring the HBase BlockCache for information about configuring the BlockCache using Cloudera Manager or the command line.

Configuring Encryption in HBase

It is possible to encrypt the HBase root directory within HDFS, using HDFS Transparent Encryption. This provides an additional layer of protection in case the HDFS filesystem is compromised.

If you use this feature in combination with bulk-loading of HFiles, you must configure hbase.bulkload.staging.dir to point to a location within the same encryption zone as the HBase root directory. Otherwise, you may encounter errors such as:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): /tmp/output/f/5237a8430561409bb641507f0c531448 can't be moved into an encryption zone.

You can also choose to only encrypt specific column families, which encrypts individual HFiles while leaving others unencrypted, using HBase Transparent Encryption at Rest. This provides a balance of data security and performance.

Enabling Hedged Reads for HBase

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

  1. Go to the HBase service.
  2. Click the Configuration tab.
  3. Select Scope > HBASE-1 (Service-Wide).
  4. Select Category > Performance.
  5. Configure the HDFS Hedged Read Threadpool Size and HDFS Hedged Read Delay Threshold properties. The descriptions for each of these properties on the configuration pages provide more information.
  6. Enter a Reason for change, and then click Save Changes to commit the changes.

Accessing HBase by using the HBase Shell

After you have started HBase, you can access the database in an interactive way by using the HBase Shell, which is a command interpreter for HBase which is written in Ruby. Always run HBase administrative commands such as the HBase Shell, hbck, or bulk-load commands as the HBase user (typically hbase).

$ hbase shell

HBase Shell Overview

  • To get help and to see all available commands, use the help command.
  • To get help on a specific command, use help "command". For example:
    hbase> help "create"
  • To remove an attribute from a table or column family or reset it to its default value, set its value to nil. For example, use the following command to remove the KEEP_DELETED_CELLS attribute from the f1 column of the users table:
    hbase> alter 'users', { NAME => 'f1', KEEP_DELETED_CELLS => nil }
  • To exit the HBase Shell, type quit.

Setting Virtual Machine Options for HBase Shell

HBase in CDH 5.2 and higher allows you to set variables for the virtual machine running HBase Shell, by using the HBASE_SHELL_OPTS environment variable. This example sets several options in the virtual machine.

$ HBASE_SHELL_OPTS="-verbose:gc -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps
      -XX:+PrintGCDetails -Xloggc:$HBASE_HOME/logs/gc-hbase.log" ./bin/hbase shell

Scripting with HBase Shell

CDH 5.2 and higher include non-interactive mode. This mode allows you to use HBase Shell in scripts, and allow the script to access the exit status of the HBase Shell commands. To invoke non-interactive mode, use the -n or --non-interactive switch. This small example script shows how to use HBase Shell in a Bash script.

#!/bin/bash
echo 'list' | hbase shell -n
status=$?
if [ $status -ne 0 ]; then
  echo "The command may have failed."
fi

Successful HBase Shell commands return an exit status of 0. However, an exit status other than 0 does not necessarily indicate a failure, but should be interpreted as unknown. For example, a command may succeed, but while waiting for the response, the client may lose connectivity. In that case, the client has no way to know the outcome of the command. In the case of a non-zero exit status, your script should check to be sure the command actually failed before taking further action.

CDH 5.7 and higher include the get_splits command, which returns the split points for a given table:
hbase> get_splits 't2'
Total number of splits = 5

=> ["", "10", "20", "30", "40"]

You can also write Ruby scripts for use with HBase Shell. Example Ruby scripts are included in the hbase-examples/src/main/ruby/ directory.

HBase Online Merge

CDH 6 supports online merging of regions. HBase splits big regions automatically but does not support merging small regions automatically. To complete an online merge of two regions of a table, use the HBase shell to issue the online merge command. By default, both regions to be merged should be neighbors; that is, one end key of a region should be the start key of the other region. Although you can "force merge" any two regions of the same table, this can create overlaps and is not recommended.

The Master and RegionServer both participate in online merges. When the request to merge is sent to the Master, the Master moves the regions to be merged to the same RegionServer, usually the one where the region with the higher load resides. The Master then requests the RegionServer to merge the two regions. The RegionServer processes this request locally. Once the two regions are merged, the new region will be online and available for server requests, and the old regions are taken offline.

For merging two consecutive regions use the following command:

hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME'

For merging regions that are not adjacent, passing true as the third parameter forces the merge.

hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME', true

Configuring RegionServer Grouping

You can use RegionServer Grouping (rsgroup) to impose strict isolation between RegionServers by partitioning RegionServers into distinct groups. You can use HBase Shell commands to define and manage RegionServer Grouping.

You must first create an rsgroup before you can add RegionServers to it. Once you have created an rsgroup, you can move your HBase tables into this rsgroup so that only the RegionServers in the same rsgroup can host the regions of the table.

A custom balancer implementation tracks assignments per rsgroup and moves regions to the relevant RegionServers in that rsgroup. The rsgroup information is stored in a regular HBase table, and a ZooKeeper-based read-only cache is used at cluster bootstrap time.

Enabling RegionServer Grouping using Cloudera Manager

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

You must use Cloudera Manager to enable RegionServer Grouping before you can define and manage rsgroups.

  1. Go to the HBase service.
  2. Click the Configuration tab.
  3. Select Scope > Master.
  4. Locate the HBase Coprocessor Master Classes property or search for it by typing its name in the Search box.
  5. Add the following property value: org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.
  6. Locate the Master Advanced Configuration Snippet (Safety Valve) for hbase-site.xml property or search for it by typing its name in the Search box.
  7. Click View as XML and add the following property:
    <property>   
      <name>hbase.master.loadbalancer.class</name>
      <value>org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer</value>
    </property>
  8. Enter a Reason for change, and then click Save Changes to commit the changes.
  9. Restart the role.
  10. Restart the service.

Configuring RegionServer Grouping

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

When you add a new rsgroup, you are creating an rsgroup other than the default group. To configure a rsgroup, in the HBase shell:

  1. Add an rsgroup:
    hbase> add_rsgroup 'mygroup'
  2. Add RegionServers and tables to this rsgroup:
    hbase> move_servers_tables_rsgroup ‘mygroup’, ['server1:port','server2:port'],['table1','table2']
  3. Run the balance_rsgroup command if the tables are slow to migrate to the group’s dedicated server.

Monitor RegionServer Grouping

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

You can monitor the status of the commands using the Tables tab on the HBase Master UI home page. If you click on a table name, you can see the RegionServers that are deployed.

You must manually align the RegionServers referenced in rsgroups with the actual state of nodes in the cluster that is active and running.

Removing a RegionServer from RegionServer Grouping

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

You can remove a RegionServer by moving it to the default rsgroup. Edits made using shell commands to all rsgroups, except the default rsgroup, are persisted to the system hbase:rsgroup table. If an rsgroup references a decommissioned RegionServer, then the rsgroup should be updated to undo the reference.

  1. Move the RegionServer to the default rsgroup using the command:
    hbase> move_servers_rsgroup 'default',['server1:port']
  2. Check the list of RegionServers in your rsgourp to ensure that that the RegionServer is successfully removed using the command:
    hbase> get_rsgroup 'mygroup

The default rsgroup’s RegionServer list mirrors the current state of the cluster. If you shut down a RegionServer that was part of the default rsgroup, and then run the get_rsgroup 'default' command to list its content in the shell, the server is no longer listed. If you move the offline server from the non-default rsgroup to default, it will not show in the default list; the server will just be removed from the list.

Enabling ACL for RegionServer Grouping

Minimum Required Role: Full Administrator

You need to be a Global Admin to manage rsgroups if authorization is enabled.

To enable ACL, add the following to the hbase-site.xml file, and then restart your HBase Master server:

<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
<property>

Best Practices when using RegionServer Grouping

You must keep in mind the following best practices when using rsgroups:

Isolate System Tables

You can either have a system rsgroup where all the system tables are present or just leave the system tables in default rsgroup and have all user-space tables in non-default rsgroups.

Handling Dead Nodes

You can have a special rsgroup of dead or questionable nodes to help you keep them without running until the nodes are repaired. Be careful when replacing dead nodes in an rsgroup, and ensure there are enough live nodes before you start moving out the dead nodes. You can move the good live nodes first before moving out the dead nodes.

If you have configured a table to be in a rsgroup, but all the RegionServers in that rsgroup die, the tables become unavailable and you can no longer access those tables.

Troubleshooting RegionServer Grouping

If you encounter an issue when using rsgroup, check the Master log to see what is causing the issue. If an rsgroup operation is unresponsive, restart the Master.

For example, if you have not enabled the rsgroup coprocessor endpoint in the Master, and run any of the rsgroup shell commands, you will encounter the following error message:

ERROR: org.apache.hadoop.hbase.exceptions.UnknownProtocolException: No registered master coprocessor service found for name RSGroupAdminService
    at org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:604)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1140)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:257)

Disabling RegionServer Grouping

When you no longer require rsgroups, you can disable it for your cluster. Removing RegionServer Grouping for a cluster on which it was enabled involves more steps in addition to removing the relevant properties from hbase-site.xml. You must ensure that you clean the RegionServer grouping-related metadata so that if the feature is re-enabled in the future, the old metadata will not affect the functioning of the cluster.

To disable RegionServer Grouping:

  1. Move all the tables in non-default rsgroups to default RegionServer group.
    #Reassigning table t1 from the non-default group - hbase shell
    hbase> move_tables_rsgroup 'default',['t1']
  2. Move all RegionServers in non-default rsgroups to default regionserver group.
    #Reassigning all the servers in the non-default rsgroup to default - hbase shell
    hbase> move_servers_rsgroup 'default',['regionserver1:port','regionserver2:port','regionserver3:port']
  3. Remove all non-default rsgroups. default rsgroup created implicitly does not have to be removed.
    #removing non-default rsgroup - hbase shell
    hbase> remove_rsgroup 'mygroup'
  4. Remove the changes made in hbase-site.xml and restart the cluster.
  5. Drop the table hbase:rsgroup from HBase.
    #Through hbase shell drop table hbase:rsgroup
    hbase> disable 'hbase:rsgroup'
    0 row(s) in 2.6270 seconds
    hbase> drop 'hbase:rsgroup'
    0 row(s) in 1.2730 seconds
    
  6. Remove the znode rsgroup from the cluster ZooKeeper using zkCli.sh.
    #From ZK remove the node /hbase/rsgroup through zkCli.sh
    rmr /hbase/rsgroup

Troubleshooting HBase

Configuring the BlockCache

Configuring the Scanner Heartbeat