Fixed Issues in HBase

Review the list of HBase issues that are resolved in Cloudera Runtime 7.2.17.

CDPD-50683: HBase: Add idempotency support for fs.rename while using ABFS
HBaseAzureSemantics now supports resilient rename for files between source and destination on Azure Blob storage. However, HBase does not support any directory renaming and the behavior is same as ABFS.
Resilient rename is a workaround for RenamePathFile 500/OAuthServerTimeoutError and RenamePathFile 404/OAuthClientOtherError (Source File does not exist after a successfully rename) handled by the file system HBaseAzureSemantics. With this workaround, the region server must not crash during the commit stage of the flush operation for HFile.
CDPD-48344: Make the initial corePoolSize configurable for ChoreService
Add hbase.choreservice.initial.pool.size configuration property to set the initial number of threads for the ChoreService.
CDPD-48185: Hbase_mcc - Upgrade Scala to 2.12.5+/2.13.9+ due to CVE-2017-15288
The Scala version depends on the spark-scala version.
CDPD-46164: HBASE Tarball does not contain JWT related JARS
JWT related JARS added to the HBASE Tarball.
CDPD-46065: Job to sync schema differences between the HA clusters
A new tool SyncTableDescriptorsTool which synchronizes the table descriptors between two peers that uses the HBase multiple cluster client after one of the primary or the failover cluster disconnected from the two-way peer replication, especially, if the operator finds any table attributes that are different on the clusters.
To use this tool:
  1. Get the classpath including hbase-mcc.jar, HBase's jars, and hbase-site.xml from the source cluster.
  2. Use the following command:

    java -cp "hbase-mcc-0.2.0-SNAPSHOT.jar:/path/to/clusters/:/opt/cloudera/parcels/CDH/jars/*:`hbase classpath`" com.cloudera.hbase.mcc.SyncTableDescriptorsTool -t <table that needs to sync for table descriptor>

Apache Patch Information

  • HBASE-27565
HBase new features:
  • HBASE-27347: Port FileWatcher from ZK to autodetect keystore/truststore changes in TLS connections
  • HBASE-27346: Autodetect key/truststore file type from file extension
  • HBASE-27280 Add mutual authentication support to TLS (#4796)
  • HBASE-27565 Make the initial corePoolSize configurable for ChoreService (#4958)
  • HBASE-27506 Optionally disable sorting directories by size in CleanerChore (#4896)
HBase improvements:
  • HBASE-27713 Remove numRegions in Region Metrics (#5107)
  • HBASE-27726 Handling of ruby shell SyntaxError exceptions (#5147)
  • HBASE-26526 Introduce a timeout to shutdown of WAL (#3297)
  • HBASE-27684: add client metrics related to user region lock. (#5081) (#5133)
  • HBASE-27646 Should not use pread when prefetching in HFilePreadReader (#5063) (#5123)
  • HBASE-27710 ByteBuff ref counting is too expensive for on-heap buffers (#5115)
  • HBASE-27670 Improve FSUtils to directly obtain FSDataOutputStream (#5064)
  • HBASE-27458 Use ReadWriteLock for region scanner readpoint map (#5068)
  • HBASE-15242: add client side metrics for timeout and remote exceptions. (#5023) (#5054)
  • HBASE-21521 Expose master startup status via web UI (#4788) (#5021)
  • HBASE-27626 Suppress noisy logging in client.ConnectionImplementation (#5019)
  • HBASE-27551 Add config options to delay assignment to retain last region location (#4945)
  • HBASE-27556 Reuse Zookeeper session of Master in LogCleaner (#4946)
  • HBASE-27474 Evict blocks on split/merge; Avoid caching reference/hlinks if compaction is enabled (#4868)
HBase bugfixes:
  • HBASE-27820: HBase is not starting due to Jersey library conflicts wi… (#5210) (#5261)
  • HBASE-27752: Update the list of prefetched files upon region movement (#5194) (#5222)
  • HBASE-27810. Check if event processor is already shut down (#5212)
  • HBASE-27768 Race conditions in BlockingRpcConnection (#5154)
  • HBASE-27445 fix the result of DirectMemoryUtils#getDirectMemorySize (#4846)
  • HBASE-27778 Incorrect ReplicationSourceWALReader.totalBufferUsed may cause replication hang up (#5163)
  • HBASE-27704 Quotas can drastically overflow configured limit (#5153)
  • HBASE-27758 Inconsistent synchronization in MetricsUserSourceImpl (#5149)
  • HBASE-27750: Update the list of prefetched Hfiles upon block eviction (#5140)
  • HBASE-26866 Shutdown WAL may abort region server (#4254)
  • HBASE-27676 Scan handlers in the RPC executor should match at least one scan queues (#5074)
  • HBASE-27736 HFileSystem.getLocalFs is not used (#5125)
  • HBASE-27671 Client should not be able to restore/clone a snapshot after it has TTL expired it's TTL has expired (#5118)
  • HBASE-27651 hbase-daemon.sh foreground_start should propagate SIGHUP and SIGTERM
  • HBASE-27718 The regionStateNode only need remove once in regionOffline (#5106)
  • HBASE-27714 WALEntryStreamTestBase creates a new HBTU in startCluster method which causes all sub classes are testing default configurations (#5101)
  • HBASE-27686: Recovery of BucketCache and Prefetched data after RS Crash (#5080)
  • HBASE-27688 HFile splitting occurs during bulkload, the CREATE_TIME_TS of hfileinfo is 0 (#5097)
  • HBASE-27673 Fix mTLS client hostname verification (#5065)
  • HBASE-24781 Clean up peer metrics when disabling peer (#4997)
  • HBASE-27650 Merging empty regions corrupts meta cache (branch-2) (#5038)
  • HBASE-27668 PB's parseDelimitedFrom can successfully return when there are not enough bytes (#5059)
  • HBASE-27644 Should not return false when WALKey has no following KVs while reading WAL file (#5032)
  • HBASE-27661 Set size of systable queue in UT (#5053)
  • HBASE-27636 The "CREATE_TIME_TS" value of the hfile generated by the HFileOutputFormat2 class is 0 (#5034)
  • HBASE-27648 CopyOnWriteArrayMap does not honor contract of ConcurrentMap.putIfAbsent (#5031)
  • HBASE-27643 [JDK17] Add-opens java.util.concurrent (#5028)
  • HBASE-27619 Bulkload fails when trying to bulkload files with invalid names after HBASE-26707 (#5014)
  • HBASE-27590 Change Iterable to List in SnapshotFileCache (#4995)
  • HBASE-27621 Also clear the Dictionary when resetting when reading compressed WAL file (#5016)
  • HBASE-27602 Remove the impact of operating env on testHFileCleaning (#5003)
  • HBASE-27580 Reverse scan over rows with tags throw exceptions when using DataBlockEncoding (#5006)
  • HBASE-26967 FilterList with FuzzyRowFilter and SingleColumnValueFilter evaluated with operator MUST_PASS_ONE doesn't work as expected(#4820)
  • HBASE-27547 Close store file readers after region warmup (#4942)
  • HBASE-25516 [JDK17] reflective access Field.class.getDeclaredField("modifiers") not supported (#3443)
  • HBASE-27579 CatalogJanitor can cause data loss due to errors during cleanMergeRegion (#4986)
  • HBASE-27561 hbase.master.port is ignored in processing of hbase.masters (#4952)
  • HBASE-27529 Provide RS coproc ability to attach WAL extended attributes to mutations at replication sink (#4924)
  • HBASE-27560 fix consistencyCheck did not report the hole on last region (#4950)
  • HBASE-27540 add client side counter metrics for failed rpc calls (#4929)
  • HBASE-27524 Fix python requirements problem (#4918)
  • HBASE-27519 Another case for FNFE on StoreFileScanner after a flush followed by a compaction (#4923)
  • HBASE-27498: Added logic in ConnectionImplementation.getKeepAliveMasterService to avoid expensive rpc calls in synchronized block (#4889)
  • HBASE-27491 Do not clear cache on RejectedExecutionException (#4914)
  • HBASE-27487 Addendum remove unused imports
  • HBASE-27487: Slow meta can create pathological feedback loop with multigets (#4900)
  • HBASE-27494: Fix missing meta cache dropping exception metrics (#4902)
  • HBASE-27484 FNFE on StoreFileScanner after a flush followed by a compaction (#4882)
  • HBASE-27503 Support replace <FILE-PATH> in GC_OPTS for ZGC (#4892)
  • HBASE-26320 Implement a separate thread pool for the LogCleaner (#4895)
  • HBASE-27463 Reset sizeOfLogQueue when refresh replication source (#4863)
  • HBASE-27504 Remove duplicated config 'hbase.normalizer.merge.min_region_age.days' in hbase-default.xml (#4894)
  • HBASE-27464 In memory compaction 'COMPACT' may cause data corruption when adding cells large than maxAlloc(default 256k) size (#4881)
  • HBASE-27043 Let lock wait timeout to improve performance of SnapshotHFileCleaner (#4437)
  • HBASE-25899 Improve efficiency of SnapshotHFileCleaner (#3280)
  • HBASE-25363 Improve performance of HFileLinkCleaner by using ReadWriteLock instead of synchronize
  • HBASE-27379 fix numOpenConnections metric is one less than the actual (#4884)
  • HBASE-27001 The deleted variable cannot be printed out (#4883)
  • HBASE-27469 IllegalArgumentException is thrown by SnapshotScannerHDFSAclController when dropping a table (#4865)
  • HBASE-27414 Adjust hfilelink alternative paths order (#4847)
  • HBASE-27450 Update all our python scripts to use python3 (#4851)
  • HBASE-27440 fix table HistogramMetrics leak in table metrics map (#4838)
  • HBASE-27159 Emit source metrics for BlockCacheExpressHitPercent (#4830)
  • HBASE-27339 Improve sasl connection failure log message to include server (#4823)
  • HBASE-27401 Addendum remove unused imports
  • HBASE-27406 Make /prometheus endpoint accessible from HBase UI (#4833)
HBase connectors bugfixes:
  • HBASE-27630: hbase-spark bulkload stage directory limited to hdfs only (#108)
  • HBASE-27624 Cannot Specify Namespace via the hbase.table Option in Spark Connector (#107)
  • HBASE-27397 Spark-hbase support for 'startWith' predicate (#105)

Technical Service Bulletins

TSB 2023-667: HBase snapshot export failure can lead to data loss
For the latest update on this issue see the corresponding Knowledge article: TSB 2023-667: HBase snapshot export failure can lead to data loss.