Managing HBase
Cloudera Manager requires certain additional steps to set up and configure the HBase service.
Creating the HBase Root Directory
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
When you add the HBase service, the Add Service wizard automatically creates a root directory for HBase in HDFS. If you quit the Add Service wizard or it does not finish, you can create the root directory outside the wizard by following these steps:
- Choose Create Root Directory from the Actions menu in the HBase > Status tab.
- Click Create Root Directory again to confirm.
Graceful Shutdown
Minimum Required Role: Operator (also provided by Configurator, Cluster Administrator, Full Administrator)
A graceful shutdown of an HBase RegionServer allows the regions hosted by that RegionServer to be moved to other RegionServers before stopping the RegionServer. Cloudera Manager provides the following configuration options to perform a graceful shutdown of either an HBase RegionServer or the entire service.
To increase the speed of a rolling restart of the HBase service, set the Region Mover Threads property to a higher value. This increases the number of regions that can be moved in parallel, but places additional strain on the HMaster. In most cases, Region Mover Threads should be set to 5 or lower.
Gracefully Shutting Down an HBase RegionServer
- Go to the HBase service.
- Click the Instances tab.
- From the list of Role Instances, select the RegionServer you want to shut down gracefully.
- Select .
- Cloudera Manager attempts to gracefully shut down the RegionServer for the interval configured in the Graceful Shutdown Timeout configuration option, which defaults to 3 minutes. If the graceful shutdown fails, Cloudera Manager forcibly stops the process by sending a SIGKILL (kill -9) signal. HBase will perform recovery actions on regions that were on the forcibly stopped RegionServer.
- If you cancel the graceful shutdown before the Graceful Shutdown Timeout expires, you can still manually stop a RegionServer by selecting , which sends a SIGTERM (kill -15) signal.
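The general pattern described in these steps (try a polite stop, wait up to a timeout, then force the process down) can be sketched generically as follows. This is not Cloudera Manager's implementation. Here SIGTERM stands in for the graceful phase, whereas a real graceful RegionServer shutdown moves regions to other RegionServers first. The function name `stop_with_timeout` is hypothetical.

```python
import subprocess
import sys


def stop_with_timeout(proc, graceful_timeout_s):
    """Ask `proc` to exit; force-kill it if it is still alive after the timeout.

    Returns "graceful" if the process exited within the timeout,
    otherwise "forced".
    """
    proc.terminate()  # sends SIGTERM on POSIX systems
    try:
        proc.wait(timeout=graceful_timeout_s)
        return "graceful"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX systems; cannot be ignored
        proc.wait()  # reap the process so it does not linger as a zombie
        return "forced"


if __name__ == "__main__":
    # A child process that would otherwise sleep for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    # 180 seconds mirrors the documented default timeout.
    print(stop_with_timeout(child, graceful_timeout_s=180))
```

A process that was forcibly killed gets no chance to clean up, which is why HBase has to perform recovery actions on the regions that the stopped RegionServer was hosting.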
Gracefully Shutting Down the HBase Service
- Go to the HBase service.
- Select . This tries to perform an HBase Master-driven graceful shutdown for the length of the configured Graceful Shutdown Timeout (three minutes by default), after which it abruptly shuts down the whole service.
Configuring the Graceful Shutdown Timeout Property
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
This timeout only affects a graceful shutdown of the entire HBase service, not individual RegionServers. Therefore, if you have a large cluster with many RegionServers, you should strongly consider increasing the timeout from its default of 180 seconds.
- Go to the HBase service.
- Click the Configuration tab.
- Select
- Use the Search box to search for the Graceful Shutdown Timeout property and edit the value.
- Click Save Changes to save this setting.
Configuring the HBase Thrift Server Role
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
- Go to the HBase service.
- Click the Instances tab.
- Click the Add Role Instances button.
- Select the host(s) where you want to add the Thrift Server role (you only need one for Hue) and click Continue. The Thrift Server role should appear in the instances list for the HBase service.
- Select the Thrift Server role instance.
- Select .
Enabling HBase Indexing
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
HBase indexing depends on the Key-Value Store Indexer service. The Key-Value Store Indexer service uses the Lily HBase Indexer Service to index the stream of records being added to HBase tables. Indexing allows you to query data stored in HBase with the Solr service.
- Go to the HBase service.
- Click the Configuration tab.
- Select
- Select .
- Select the Enable Replication and Enable Indexing properties.
- Click Save Changes.
Adding a Custom Coprocessor
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
- Select the HBase service.
- Click the Configuration tab.
- Select .
- Select .
- Type HBase Coprocessor in the Search box.
- You can configure the values of the following properties:
- HBase Coprocessor Abort on Error (Service-Wide)
- HBase Coprocessor Master Classes (Master Default Group)
- HBase Coprocessor Region Classes (RegionServer Default Group)
- Click Save Changes to commit the changes.
Enabling Hedged Reads on HBase
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
- Go to the HBase service.
- Click the Configuration tab.
- Select .
- Select .
- Configure the HDFS Hedged Read Threadpool Size and HDFS Hedged Read Delay Threshold properties. The descriptions for each of these properties on the configuration pages provide more information.
- Click Save Changes to commit the changes.
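For readers unfamiliar with the technique, a hedged read starts a second read against another replica when the first read has not returned within a delay threshold, and uses whichever result arrives first. The sketch below illustrates that idea in generic form. It is not the HDFS client code, and `hedged_read`, `slow_replica`, and `fast_replica` are hypothetical names. The pool size and the delay argument play roughly the roles of the two properties named above.

```python
import concurrent.futures as cf
import time


def hedged_read(read_primary, read_backup, delay_threshold_s, pool):
    """Return the first available result, hedging with a backup read if needed."""
    primary = pool.submit(read_primary)
    done, _ = cf.wait([primary], timeout=delay_threshold_s)
    if done:
        # The primary answered within the threshold; no hedge is needed.
        return primary.result()
    # The primary is slow: start a second read and take whichever finishes first.
    backup = pool.submit(read_backup)
    done, _ = cf.wait([primary, backup], return_when=cf.FIRST_COMPLETED)
    return done.pop().result()


def slow_replica():
    """Hypothetical replica that takes half a second to answer."""
    time.sleep(0.5)
    return "from slow replica"


def fast_replica():
    """Hypothetical replica that answers immediately."""
    return "from fast replica"


if __name__ == "__main__":
    # max_workers bounds how many reads (including hedges) can run at once.
    with cf.ThreadPoolExecutor(max_workers=2) as pool:
        print(hedged_read(slow_replica, fast_replica, 0.05, pool))
```

A lower delay threshold hedges more aggressively and trims tail latency at the cost of extra read traffic. A small thread pool limits how many hedged reads can be in flight at once.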
Advanced Configuration for Write-Heavy Workloads
- hbase.hstore.flusher.count
- The number of threads available to flush writes from memory to disk. Never increase hbase.hstore.flusher.count to more than 50% of the number of disks available to HBase. For example, if you have 8 solid-state drives (SSDs), hbase.hstore.flusher.count should never exceed 4. This allows scanners and compactions to proceed even under a very heavy write load.
- hbase.regionserver.thread.compaction.large and hbase.regionserver.thread.compaction.small
- The number of threads available to handle large and small compactions, respectively. Never increase either of these options to more than 50% of the number of disks available to HBase.
In addition to the above, if you use compression on some column families, flushing or compacting those column families uses more CPU. The impact on CPU usage depends on the size of the flush or on the amount of data that must be decompressed and recompressed during compaction.
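The 50% guideline for the flusher and compaction thread counts reduces to simple arithmetic, sketched here as a small helper. The function name `max_threads_for_disks` is hypothetical. Rounding down and keeping a floor of one thread are assumptions made for this sketch, since the guideline itself only states the 50% ceiling.

```python
def max_threads_for_disks(disk_count):
    """Upper bound for a flusher or compaction thread count: half the disks.

    Rounds down, and never returns less than 1 (an assumption of this
    sketch, so that a single-disk host still gets one thread).
    """
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    return max(1, disk_count // 2)


if __name__ == "__main__":
    disks = 8  # for example, 8 SSDs available to HBase
    limit = max_threads_for_disks(disks)
    for prop in (
        "hbase.hstore.flusher.count",
        "hbase.regionserver.thread.compaction.large",
        "hbase.regionserver.thread.compaction.small",
    ):
        print(f"{prop}: at most {limit}")
```

With 8 disks the helper yields a ceiling of 4, matching the example given for hbase.hstore.flusher.count. The ceiling is an upper bound, not a recommended setting.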