Review the list of HBase issues that are resolved in Cloudera Runtime 7.3.1, its service packs and cumulative hotfixes.
Cloudera Runtime 7.3.1.500 SP3
There are no fixed issues in this release.
Cloudera Runtime 7.3.1.400 SP2
- CDPD-84435: The upgrade operation fails with a message
“Failed to decommission RegionServer.”
- 7.3.1.400
- This issue is fixed. Now, the HBase Master aborts when
it detects a WALSyncTimeoutException while making edits
to the MasterRegion.
Apache Jira: HBASE-28803
- CDPD-83544: Potential performance degradation may
occur when utilizing persistent cache, causing a restart involving full
cache recovery.
- 7.3.1.400
- This issue is fixed.
Apache Jira: HBASE-29326
- CDPD-81524: Add configurable throttling of region
moves in CacheAwareLoadBalancer
- 7.3.1.400
- This fix introduces region moving throttling for
LoadBalancer implementations. The throttling time is configurable by the
hbase.master.balancer.move.throttlingMillis
property, with a default value of 60000 milliseconds.
In this change, the
only balancer implementation applying throttling is the
CacheAwareLoadBalancer. All other balancers just inherit the noop
default provided within the LoadBalancer interface.
The
CacheAwareLoadBalancer throttling implementation performs throttling
only for regions moving to the target server with a region cached ratio
below the threshold configurable by
hbase.master.balancer.stochastic.throttling.cacheRatio
(80% by default).
Apache Jira: HBASE-29168
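The throttling rule described above can be sketched as follows. This is an illustrative Python model, not HBase code: the function name `move_delay_ms` and its arguments are hypothetical, and expressing the 80% threshold as the fraction 0.80 is an assumption. Only the two property names and their defaults come from the text above.

```python
# Illustrative model only; everything except the two property keys is hypothetical.
DEFAULTS = {
    "hbase.master.balancer.move.throttlingMillis": 60000,
    # Assumption: the 80% default is expressed as a fraction.
    "hbase.master.balancer.stochastic.throttling.cacheRatio": 0.80,
}


def move_delay_ms(cached_ratio_on_target, conf=None):
    """Return the throttling time for a region move, in milliseconds.

    A move is throttled only when the region's cached ratio on the
    target server is below the configured threshold.
    """
    settings = {**DEFAULTS, **(conf or {})}
    threshold = settings["hbase.master.balancer.stochastic.throttling.cacheRatio"]
    if cached_ratio_on_target < threshold:
        return settings["hbase.master.balancer.move.throttlingMillis"]
    return 0
```

In this model, a move to a server that already caches most of the region is not delayed, while a move to a server with a cold cache waits for the configured throttling time.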
- CDPD-81524: The ENCODED_DATA block type is not being
considered within BucketCache.notifyFileCachingComplete
- 7.3.1.400
- This fix addresses a defect in
BucketCache.notifyFileCachingComplete, wherein only blocks of the
DATA type were registered. When an encoding such as
FASTDIFF was employed, the data block type became
ENCODED_DATA, preventing it from being accounted for in
the internal cache metrics. This oversight subsequently affected the
cache-aware balancer after cache recovery following a crash or restart (with
persistent cache enabled), because the cached percentage of each region was not
calculated accurately.
Apache Jira: HBASE-29243
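The effect of the defect can be illustrated with a small Python sketch. It is a simplified model, not the BucketCache implementation: the function `count_data_blocks` and the string labels are hypothetical stand-ins for the block types named above.

```python
# Simplified model; the helper name and string labels are hypothetical.
DATA_BLOCK_TYPES = {"DATA", "ENCODED_DATA"}


def count_data_blocks(block_types, accepted=DATA_BLOCK_TYPES):
    """Count the blocks whose type is treated as a data block."""
    return sum(1 for block_type in block_types if block_type in accepted)
```

Passing `accepted={"DATA"}` mimics the earlier behavior, where blocks written with an encoding such as FASTDIFF were left out of the count.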
- CDPD-81524: Enable BlockCache implementations to
define dynamic properties
- 7.3.1.400
- This resolution introduces dynamic configurability for
the following properties related to free space management and block
prioritization:
- hbase.bucketcache.acceptfactor
- hbase.bucketcache.minfactor
- hbase.bucketcache.extrafreefactor
- hbase.bucketcache.single.factor
- hbase.bucketcache.multi.factor
- hbase.bucketcache.memory.factor
- hbase.bucketcache.queue.addition.waittime
- hbase.bucketcache.persist.intervalinmillis
- hbase.bucketcache.persistence.chunksize
Apache Jira: HBASE-29249
- CDPD-81524: Display hit ratio metrics by configurable,
granular periods
- 7.3.1.400
- This change introduces two additional properties:
- hbase.blockcache.stats.periods, which defines
the number of window periods;
- hbase.blockcache.stats.period.minute, which
defines the length of each of these periods (in minutes).
If hbase.blockcache.stats.periods is defined
and is greater than one, HBase creates a scheduled executor that rolls the
metrics calculation at the
hbase.blockcache.stats.period.minute rate. HBase then
calculates the hit ratio for each of the last periods (as
defined by hbase.blockcache.stats.periods),
accounting for only the hits and requests that occurred during the
interval of the given period (as defined by
hbase.blockcache.stats.period.minute).
Apache Jira: HBASE-29276
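The per-period calculation described above can be modeled roughly as shown below. This is a sketch under stated assumptions, not the HBase implementation: the class `PeriodicHitRatio` and its methods are hypothetical, and the scheduled executor is replaced by an explicit `roll()` call.

```python
from collections import deque


class PeriodicHitRatio:
    """Hypothetical model of hit ratios kept for the last N periods."""

    def __init__(self, periods):
        # Keeps only the most recent `periods` windows; older ones drop off.
        self.windows = deque(maxlen=periods)
        self.hits = 0
        self.requests = 0

    def record(self, hit):
        """Record one cache request and whether it was a hit."""
        self.requests += 1
        if hit:
            self.hits += 1

    def roll(self):
        """Close the current period; a scheduler would call this at a fixed rate."""
        self.windows.append((self.hits, self.requests))
        self.hits = 0
        self.requests = 0

    def ratios(self):
        """Hit ratio of each retained period, oldest first."""
        return [h / r if r else 0.0 for h, r in self.windows]
```

Each ratio counts only the hits and requests recorded between two consecutive rolls, so a burst of misses in one period does not affect the ratio reported for another.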
- CDPD-81524: Avoid adding new blocks during prefetch if
usage is greater than the accept factor
- 7.3.1.400
- Previously, when cache prefetch was enabled and cache
usage reached the configured acceptance factor, it resulted in a cycle of
frequent mass block evictions until the prefetch thread completed reading
the entire file. This process proved to be both costly and inefficient. An
initial attempt to mitigate this issue was proposed in HBASE-28176; however, that solution only
interrupted the prefetch thread after it had already attempted to cache the
current block being read, which could still trigger a mass eviction.
To
completely avert evictions triggered solely by the prefetch, this
modification evaluates the impact of incorporating the current block
into the cache before attempting to write it into the cache. This
verification is exclusively executed when caching from prefetch threads;
standard client reads and HFile writes persist in their attempt to cache
the associated block.
Apache Jira: HBASE-29288
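The check described above can be sketched in Python as follows. The function `should_cache` and its parameters are hypothetical, and the exact comparison (usage plus block size against capacity times the accept factor) is an assumption made for illustration rather than the HBase code.

```python
def should_cache(block_size, used_bytes, capacity_bytes, accept_factor, from_prefetch):
    """Decide whether to attempt caching a block (illustrative model).

    Only prefetch threads are subject to the check; client reads and
    HFile writes always attempt to cache the block.
    """
    if not from_prefetch:
        return True
    # Assumption: the block is skipped if adding it would push usage
    # past the accept factor.
    return used_bytes + block_size <= capacity_bytes * accept_factor
```

Because the check runs before the write, a prefetch thread in this model never pushes the cache past the accept factor and so never triggers an eviction on its own.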
Cloudera Runtime 7.3.1.300 SP1 CHF 1
There are no fixed issues in this release.
Cloudera Runtime 7.3.1.200 SP1
- CDPD-77399: HBase fails to register the servlet
metrics and throws ClassNotFoundException:
org.apache.hadoop.metrics.MetricsServlet
- 7.3.1.200
- This issue is fixed now. HBase does not warn about the
Hadoop 2-based metric servlet class on a Hadoop 3 deployment.
Apache Jira: HBASE-28315
Cloudera Runtime 7.3.1.100 CHF 1
There are no fixed issues in this release.
Cloudera Runtime 7.3.1
- CDPD-67520: JWT authentication expects [sub] claim in
the payload
- 7.3.1
- A JWT payload can have a custom claim for
Subject/Principal instead of the standard
sub claim.
You can set the
hbase.security.oauth.jwt.token.principal.claim
configuration property in Cloudera Manager under HBase
Service Advanced Configuration Snippet (Safety Valve) for
hbase-site.xml to define the custom
Subject/Principal claim.
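As a rough illustration of the behavior, the following Python sketch looks up the principal in an already decoded and verified JWT payload. The function `principal_from_payload` is hypothetical, and falling back to the sub claim when the property is unset is an assumption. Only the property name comes from the text above.

```python
PRINCIPAL_CLAIM_KEY = "hbase.security.oauth.jwt.token.principal.claim"


def principal_from_payload(payload, conf=None):
    """Return the principal from a decoded JWT payload (illustrative model).

    Assumption: when the property is not set, the standard "sub" claim is used.
    Token decoding and signature verification are out of scope here.
    """
    claim = (conf or {}).get(PRINCIPAL_CLAIM_KEY, "sub")
    return payload.get(claim)
```

For example, with the property set to a hypothetical claim name such as `upn`, the sketch returns the value of `upn` even if the payload also contains `sub`.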
- CDPD-66387: RegionServer should be aborted when
WAL.sync throws TimeoutIOException
- 7.3.1
- This fix adds logic for WAL.sync. If WAL.sync gets a timeout
exception, HBase wraps TimeoutIOException as a special
WALSyncTimeoutIOException. When the upper layer such as
HRegion.doMiniBatchMutate called by HRegion.batchMutation catches this
special exception, HBase aborts the region server.
Apache Jira:
HBASE-27230
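The wrap-and-abort flow described above can be sketched in Python. This is a simplified model, not the HBase code: the exception classes only mirror the names in the text, and the functions `wal_sync` and `batch_mutate` and the `abort` callback are hypothetical.

```python
class TimeoutIOException(IOError):
    """Stand-in for the generic timeout raised by the sync call."""


class WALSyncTimeoutIOException(IOError):
    """Stand-in for the special exception that signals a WAL sync timeout."""


def wal_sync(sync_fn):
    """Run the sync and wrap a timeout in the special exception type."""
    try:
        sync_fn()
    except TimeoutIOException as err:
        raise WALSyncTimeoutIOException("WAL sync timed out") from err


def batch_mutate(sync_fn, abort):
    """Upper layer: abort the server when the special exception is caught."""
    try:
        wal_sync(sync_fn)
    except WALSyncTimeoutIOException as err:
        abort(err)
        raise
```

Wrapping the timeout in a dedicated type lets the upper layer tell a WAL sync timeout apart from other timeouts and react to it specifically.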
- CDPD-65373: Make delay prefetch property dynamically
configurable
- 7.3.1
- This change allows you to dynamically configure
the hbase.hfile.prefetch.delay property using
Cloudera Manager. Update the value and refresh the HBase
service; the new value is then applied to the HBase service
automatically.
Apache Jira:
HBASE-28292
- CDPD-74494: JVM crashes intermittently on ARM64
machines
- 7.3.1
- JVM crashes were observed in HBase
services running on ARM64 architecture with JDK 17. The issue was traced to a
specific module that had a very large member function. The fix
refactors the module and splits the large function into
multiple smaller functions.
Apache Jira:
HBASE-28206
- CDPD-73117: Bucket cache utilization drops after
a rolling restart
- 7.3.1
- For a persistent bucket cache larger
than 1.3 TB, the corresponding backing-map information (information related
to the persistent cache) grows beyond 2 GB. However, 2 GB is the size limit of
the protobuf messages that are used to persist the
backing map information. If the message grew beyond 2 GB, the
backing map was only partially persisted, and after a restart the cache
appeared to be smaller.
With this fix, the backing map information is split
into smaller chunks with sizes below 2 GB. Now all information, even
beyond 2 GB, is persisted and can be retrieved after a rolling
restart.
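The chunking idea can be sketched in Python as shown below. This is an illustrative model, not the HBase persistence code: the function `chunk_entries` is hypothetical, and splitting by accumulated byte size is an assumption chosen to mirror the 2 GB limit described above.

```python
def chunk_entries(entries, max_chunk_bytes):
    """Yield lists of (key, size) entries whose total size stays within the limit.

    Illustrative only. A single entry larger than the limit still gets its
    own chunk, which this sketch does not try to split further.
    """
    chunk, chunk_bytes = [], 0
    for key, size in entries:
        # Start a new chunk when adding this entry would exceed the limit.
        if chunk and chunk_bytes + size > max_chunk_bytes:
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append((key, size))
        chunk_bytes += size
    if chunk:
        yield chunk
```

Each chunk can then be persisted as a separate message, so no single message approaches the size limit and the full backing map can be reassembled on restart.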
- OPSAPS-70946: The hbase-site.xml file does not contain
xinclude for the refreshable files
- 7.3.1
- HBase now supports generating
hbase-site.xml with
xinclude, which
is needed for the hbase-site-refreshable.xml file.
- OPSAPS-70908: Refresh cluster command fails during
ephemeral cache zero downtime upgrade
- 7.3.1
- Refreshing configurations from refreshable files
failed with an authentication error during the refresh command when Kerberos
was enabled:
hbase/hbase.sh
["refresh-regionserver","hbase.hfile.prefetch.delay","hbase.rs.cacheblocksonwrite",
"hbase.block.data.cacheonread","hbase.rs.evictblocksonclose"]
To fix this, RegionServerRefreshCommand now sets
SCM_KERBEROS_PRINCIPAL as the Kerberos principal in
the environment of the region server refresh process.
- OPSAPS-70866: Invalid HBase prefetch configurations
during rolling runtime upgrade
- 7.3.1
- The default values of
hbase_hfile_prefetch_delay and
hbase.block.data.cacheonread are reverted to 1000
ms and true, respectively.
- OPSAPS-70294: HBase must use load balancing for the
WEBHBASE Knox service
- 7.3.1
- For CDPD 7.3.0 and later, the WEBHBASE service
is configured for sticky load balancing instead of high availability in
Knox.
- OPSAPS-70035: HBase ZooKeeper client TLS toggle should
also control the daemon roles
- 7.3.1
- This issue is fixed. HBase ZooKeeper secure
client mode now affects all roles.
- OPSAPS-69983: Set ZooKeeper store types to HBase
service configuration
- 7.3.1
- HBase now automatically sets the ZooKeeper
truststore type based on ScmParams.
- OPSAPS-69805: HBase client configuration does not use
a secure port if Client TLS is enabled
- 7.3.1
- HBase only uses a secure ZooKeeper port in
client connections if enabled explicitly.
- OPSAPS-69757: Make HBase TLS connection to ZooKeeper
disabled by default
- 7.3.1
- The HBase TLS connection to ZooKeeper is
disabled by default because it breaks some use cases. Instead, HBase introduces a new
property to enable or disable it in client roles. The default value is
disabled.
- OPSAPS-57937: No alerts are generated when the HBase
process is in a hung state
- 7.3.1
- Previously, HBase Master monitoring (canary) showed a green
status even if the Master had not initialized yet. The fix adds extra checks
that query HBase to verify that it is up and running.
- OPSAPS-53851: ZooKeeper SSL/TLS support for HBase
- 7.3.1
- Cloudera Manager configures HBase for a secure
ZooKeeper connection if ZooKeeper TLS is enabled.
- CDPD-74725: HBase throws
org.apache.hbase.thirdparty.io.netty.util.ResourceLeakDetector
exception
- 7.3.1
- Direct memory buffer leaks in HBase, which could
lead to heap issues in the long run, are fixed.
Apache Jiras: HBASE-28890 and HBASE-28893
- CDPD-72120: Allow specifying a filter for the REST
multiget endpoint (addendum: add back SCAN_FILTER constant)
- 7.3.1
- HBase allows specifying a filter for the REST
multiget endpoint (addendum: add back SCAN_FILTER constant).
Apache Jira: HBASE-28518
- CDPD-71008: REST Java client library assumes stateless
servers
- 7.3.1
- This issue is fixed.
Apache Jira:
HBASE-28500
- CDPD-71007: hbase-rest client shading conflicts with
hbase-shaded-client in HBase 2.x
- 7.3.1
- This issue is fixed.
Apache Jira:
HBASE-28526
- CDPD-71006: Support non-SPNEGO authentication methods
and implement session handling in the REST Java client library
- 7.3.1
- This issue is fixed.
Apache Jira:
HBASE-28501
- CDPD-70493: MultiRowRangeFilter deserialization fails
in org.apache.hadoop.hbase.rest.model.ScannerModel
- 7.3.1
- This issue is fixed.
Apache Jira:
HBASE-28626
- CDPD-69335: Use a single GET call in the REST multiget
endpoint
- 7.3.1
- This issue is fixed.
Apache Jira:
HBASE-28523
- CDPD-68900: HBase properties need to be dynamically
configured
- 7.3.1
- The following configurations can now be changed dynamically:
- hbase.rs.evictblocksonclose
- hbase.rs.cacheblocksonwrite
- hbase.block.data.cacheonread
After changing the values of these configurations, restarting the region
servers is no longer required. These configurations help in getting
better throughput.
HBase reads the newly changed values in hbase-site.xml
and updates the values in the appropriate classes.
- CDPD-68550: BucketCache.notifyFileCachingCompleted
might incorrectly consider a file fully cached
- 7.3.1
- This issue is fixed.
Apache Jira:
HBASE-28458
- CDPD-68154: BucketCache.evictBlocksByHfileName does not
work after a cache recovery from a file
- 7.3.1
- This issue is fixed.
Apache Jira:
HBASE-28450
- CDPD-64046: BucketCache.blocksByHFile might leak on
allocationFailure or on input/output errors, leading to cache
leaks and extra heap usage
- 7.3.1
- This issue is fixed.
Apache Jira:
HBASE-28211
- CDPD-63765: Move the NavigableSet add operation to the
writer thread in BucketCache
- 7.3.1
- This fix addresses potential cache leaks and
extra memory usage.
Apache Jira:
HBASE-26305
- CDPD-62737: PrefetchExecutor must not run for files
from the CF levels that have disabled BLOCKCACHE
- 7.3.1
- This fix allows disabling the caching or
pre-caching of individual tables.
Apache Jira:
HBASE-28217
- CDPD-45890: Fix the miss count in one of the
CombinedBlockCache getBlock implementations
- 7.3.1
- This fix impacts the hit ratio chart's accuracy
in Cloudera Manager.
Apache Jira:
HBASE-28189