Configuration details

Learn about the additional configuration options that apply to the HBase persistent BucketCache.

You can configure the persistent BucketCache using one of the following two options:

  • Use the retain assignment properties: This option forces the regions that were assigned to a restarted RegionServer to wait until that RegionServer completes initialization, so that they can be reassigned to the same server and leverage the persistent cache. The disadvantage is that some regions remain unavailable for a certain period, or until the RegionServers complete initialization.
  • Use the cache-aware load balancer: This option considers cache allocations when deciding on a new assignment plan. The disadvantage is that read requests on a minor portion of the dataset might see a temporary performance degradation, because some regions might still be moved to different RegionServers.

Consider your use case requirements when choosing between these options. The retain assignment properties option is more suitable if a consistent SLA is preferred over availability. If availability is critical and temporary performance deviations can be tolerated, the cache-aware load balancer option is preferred.

Using the retain assignment properties

Learn how to configure the retain assignment properties.

This configuration might delay region assignment until the given RegionServer reports itself as online to the master. During that time, a region in transition warning might be reported on the master UI or by the HBCK tool.

  1. Log in to the Cloudera Manager as an Administrator.
  2. Select the HBase service.
  3. Go to Configuration > Advanced > HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml.
  4. Set the values of hbase.master.scp.retain.assignment and hbase.master.scp.retain.assignment.force to true.
    How long the master region assignment background process keeps trying to open the region on the given RegionServer is determined by the number of retries set in the hbase.master.scp.retain.assignment.force.retries property (default value 600). Between retries, the process sleeps for an exponential factor of the wait interval, in milliseconds, defined in the hbase.master.scp.retain.assignment.force.wait-interval property (default value 50).
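
As an illustration, the properties named in the steps above could be entered in the hbase-site.xml safety valve as the following fragment. This is a minimal sketch, assuming the safety valve is edited in its XML form. The two retry-related properties are shown only with the default values stated above, so including them is optional and changes nothing unless you adjust the values.

```xml
<!-- Keep regions assigned to the RegionServer they were on before the restart -->
<property>
  <name>hbase.master.scp.retain.assignment</name>
  <value>true</value>
</property>
<property>
  <name>hbase.master.scp.retain.assignment.force</name>
  <value>true</value>
</property>

<!-- Optional: retry tuning, shown here with the documented defaults -->
<property>
  <name>hbase.master.scp.retain.assignment.force.retries</name>
  <value>600</value>
</property>
<property>
  <!-- Base wait interval in milliseconds -->
  <name>hbase.master.scp.retain.assignment.force.wait-interval</name>
  <value>50</value>
</property>
```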

Using the cache-aware load balancer

Learn how to configure the HBase cache-aware load balancer.

This option uses the HBase cache-aware balancer implementation that considers the cache allocation when defining a new region assignment. This avoids the assignment delays observed while using the retain assignment option.

  1. Log in to the Cloudera Manager as an Administrator.
  2. Select the HBase service.
  3. Go to Configuration > Advanced > HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml.
  4. Set the value of hbase.master.loadbalancer.class to org.apache.hadoop.hbase.master.balancer.CacheAwareLoadBalancer to enable the HBase cache-aware load balancer.
    For more information, see HBase cache-aware load balancer configuration.
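
As an illustration, the setting from the last step could be entered in the hbase-site.xml safety valve as the following fragment. This is a minimal sketch, assuming the safety valve is edited in its XML form. It uses only the property name and class name given above.

```xml
<!-- Replace the default balancer with the cache-aware implementation -->
<property>
  <name>hbase.master.loadbalancer.class</name>
  <value>org.apache.hadoop.hbase.master.balancer.CacheAwareLoadBalancer</value>
</property>
```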

Validating the expected behavior

Validate the BucketCache functionality using Cloudera Manager.

After you implement the persistent BucketCache functionality, restarting RegionServers should have little impact on the cache allocation.

  1. Log in to the Cloudera Manager as an Administrator.
  2. Validate the BucketCache functionality.
    In scenarios where the dataset is fully cached, the Cloudera Manager chart for the block cache, as shown in the following example, shows a flat line spanning the restart periods. This assumes that one of the additional configuration options, either Using the retain assignment properties or Using the cache-aware load balancer, is also applied.