Review the list of fixed issues for Kudu in Cloudera Runtime 7.1.7 SP2.
- CDPD-47068: Updated default value for --tablet_history_max_age_sec to avoid OOM for
kudu-master
- Fixed an issue with the kudu-master process consuming too much memory in the case of
very large clusters, clusters with many thousands of tables, or clusters with a huge
number of DDL operations per day (see the example below).
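A minimal sketch of overriding the flag in the kudu-master gflag file, in case the updated default needs tuning for a specific deployment; the value shown is a placeholder, since the new default value is not stated here.

```
# Placeholder value: choose a history retention period suitable for the cluster.
--tablet_history_max_age_sec=<seconds>
```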
- CDPD-46131: Fixed table creation with HMS Integration
- The issue manifested itself when Kudu HMS integration was enabled and a table was
created through a "stored as kudu table" query on Impala. Any subsequent query through
Hive failed with a ClassNotFoundError because the Kudu HMS client did not send Hive
some necessary fields in the create table request.
- CDPD-45355: Fixed multiple DNS related issues
- One of these issues involved addresses changing at runtime; it is fixed by refreshing
DNS entries when proxies hit a network error (see the sketch below). Another issue is
fixed by allowing outbound request buffers to be reused when retrying.
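The following is a minimal Java sketch, not Kudu's actual code, of the refresh-DNS-on-error pattern described above; the hostname and call interface are hypothetical.

```java
import java.io.IOException;
import java.net.InetAddress;

public class DnsRefreshRetry {
  interface Call { void run(InetAddress addr) throws IOException; }

  // On a network error, re-resolve the hostname instead of retrying
  // against the cached (possibly stale) address.
  static void callWithDnsRefresh(String host, InetAddress cached, Call call)
      throws IOException {
    try {
      call.run(cached);
    } catch (IOException e) {
      InetAddress refreshed = InetAddress.getByName(host);
      call.run(refreshed);
    }
  }
}
```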
- CDPD-44917: Fix UB in TxnSystemClient when adding max timeout to now
- Fixed an undefined behavior (UB) issue in TxnSystemClient by passing deadlines instead
of timeouts.
- This issue manifested itself when a maximum timeout value was added to the current
time, which could overflow (see the illustration below).
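The following Java snippet illustrates the failure mode and the fix; in C++ code signed overflow is undefined behavior, while in Java the addition merely wraps around. The names are illustrative, not Kudu's API.

```java
public class DeadlineVsTimeout {
  public static void main(String[] args) {
    long now = 1_000_000L;              // a stand-in for the current time in nanos
    long maxTimeout = Long.MAX_VALUE;   // a "maximum" timeout

    // Buggy pattern: now + maxTimeout overflows and wraps to a value in the
    // distant past, so every call looks like it has already expired.
    long brokenDeadline = now + maxTimeout;
    System.out.println("overflowed deadline lies in the past: " + (brokenDeadline < now));

    // Fixed pattern: compute the deadline once with saturation, then pass the
    // deadline itself down the call chain instead of a timeout.
    long deadline = maxTimeout > Long.MAX_VALUE - now ? Long.MAX_VALUE : now + maxTimeout;
    System.out.println("saturated deadline is valid: " + (deadline >= now));
  }
}
```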
- CDPD-44835: Fix thirdparty build issues on Ubuntu 21.10
- Fixed thirdparty build issues on Ubuntu 21.10.
- Multiple issues led to LLVM build failures, and new patch files were necessary to fix
them. One error was that the Linux kernel removed the Cyclades interface, which led to
an LLVM build failure.
- CDPD-44833: Fix a scan bug that reads repetitive rows
- Fixed a scanner bug that could return duplicate rows.
- The bug manifested itself when isFaultTolerant was set to true, because lastPrimaryKey
was not updated as part of the second scan request. In the common scenario where the
tablet server hosting the leader replica restarts, scanners would resume from the first
ScanResponse's lastPrimaryKey and return duplicate rows (see the sketch below).
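A conceptual Java sketch of the fix follows; the ScanResponse type here is a stand-in, not the Kudu client API.

```java
import java.util.Arrays;
import java.util.List;

public class FaultTolerantScanSketch {
  record ScanResponse(List<String> rows, String lastPrimaryKey) {}

  public static void main(String[] args) {
    List<ScanResponse> responses = Arrays.asList(
        new ScanResponse(Arrays.asList("a", "b"), "b"),
        new ScanResponse(Arrays.asList("c", "d"), "d"));

    String resumeKey = null;
    for (ScanResponse resp : responses) {
      // The fix: refresh the resume point on *every* response, not only the
      // first one, so a retry after a server restart resumes after "d"
      // rather than re-reading "c" and "d".
      resumeKey = resp.lastPrimaryKey();
    }
    System.out.println("resume scanning after key: " + resumeKey);
  }
}
```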
- CDPD-44826: Fix prefetching bug in Java scanner
- Fixed a prefetching bug in the Java scanner.
- The bug manifested itself when the scanner prefetched the next value too early and
overwrote the cached value. The fix is to cache the value in an atomic variable so the
data cannot be overwritten (see the sketch below).
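A minimal sketch of that approach, with hypothetical names rather than the actual scanner internals:

```java
import java.util.concurrent.atomic.AtomicReference;

public class PrefetchCacheSketch {
  private final AtomicReference<String> prefetched = new AtomicReference<>();

  // Called by the prefetcher; succeeds only if the slot is empty, so an
  // early prefetch can never overwrite a batch that hasn't been consumed.
  boolean offer(String batch) {
    return prefetched.compareAndSet(null, batch);
  }

  // Called by the consumer; empties the slot for the next prefetch.
  String take() {
    return prefetched.getAndSet(null);
  }

  public static void main(String[] args) {
    PrefetchCacheSketch cache = new PrefetchCacheSketch();
    System.out.println(cache.offer("batch-1")); // true: slot was empty
    System.out.println(cache.offer("batch-2")); // false: would overwrite batch-1
    System.out.println(cache.take());           // batch-1
    System.out.println(cache.offer("batch-2")); // true: slot is free again
  }
}
```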
- CDPD-44793: Java client does not properly update master locations cache
- Fixed a bug in the Kudu Java client where it could not invalidate the stale location
of a former leader master.
- The bug manifested itself when a master node became unreachable due to network issues
and the client did not receive an RST on its connection to that node. The client kept
trying to connect to the unreachable leader master and received no response until the
RPC timed out. Even when the master node became reachable again, the client would still
send RPCs through the old TCP connection and could not connect to the new leader
master. The only way out was to restart the client application (see the sketch below).
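A conceptual Java sketch of the fix follows; the cache shape and method names are hypothetical, not the Kudu Java client's internals.

```java
import java.util.concurrent.ConcurrentHashMap;

public class MasterLocationCacheSketch {
  // Maps a cluster ID to the cached address of its leader master.
  private final ConcurrentHashMap<String, String> leaderByCluster = new ConcurrentHashMap<>();

  void cacheLeader(String clusterId, String leaderAddress) {
    leaderByCluster.put(clusterId, leaderAddress);
  }

  String leaderFor(String clusterId) {
    return leaderByCluster.get(clusterId);
  }

  // Before the fix, only a connection reset dropped the entry; treating an
  // RPC timeout the same way forces the next RPC to re-discover the leader
  // instead of reusing a dead TCP connection indefinitely.
  void onRpcTimeout(String clusterId) {
    leaderByCluster.remove(clusterId);
  }
}
```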
- CDPD-44788: Stop sending DeleteTablet RPC to wrong tablet server
- Kudu master no longer retries the DeleteTablet RPC on a tablet server once the RPC has
been answered with WRONG_SERVER_UUID.
- CDPD-42695: Back-port range-aware kudu cluster rebalance tool into 7.1.7 SP2
- The kudu cluster rebalance CLI tool has been improved to detect and fix the hot-spotting
issue for particular tables.
- The earlier algorithm to place tablet replicas for a newly created table in the Kudu
catalog manager is prone to hot-spotting if the table is partitioned simultaneously by
range and hash. That is because the algorithm does not discriminate based on the
tablet's key range: all tablets of a table look the same to the algorithm, and that
could lead to hot-spotting if many tablet replicas from the same range (but different
hash buckets) are placed on the same tablet server.
- Prior to the introduction of range-aware rebalancing, the kudu cluster rebalance tool
could not detect and fix this hot-spotting either, because it also did not discriminate
tablet replicas based on the tablet's key range. So, even if the distribution of
replicas was ideally balanced, hot-spotting could persist for the reasons cited above
even after running the kudu cluster rebalance tool of prior versions (see the sketch
below).
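For illustration, the following sketch uses the Kudu Java client to create a table partitioned simultaneously by range and hash, the layout described above as prone to hot-spotting under the older placement algorithm. The master address, table name, schema, and partition bounds are placeholders.

```java
import java.util.Arrays;
import org.apache.kudu.ColumnSchema;
import org.apache.kudu.Schema;
import org.apache.kudu.Type;
import org.apache.kudu.client.CreateTableOptions;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.PartialRow;

public class RangeAndHashPartitionedTable {
  public static void main(String[] args) throws Exception {
    try (KuduClient client =
             new KuduClient.KuduClientBuilder("master-1:7051").build()) {
      Schema schema = new Schema(Arrays.asList(
          new ColumnSchema.ColumnSchemaBuilder("ts", Type.INT64).key(true).build(),
          new ColumnSchema.ColumnSchemaBuilder("val", Type.STRING).build()));

      CreateTableOptions options = new CreateTableOptions()
          .addHashPartitions(Arrays.asList("ts"), 8)   // 8 hash buckets per range
          .setRangePartitionColumns(Arrays.asList("ts"));

      PartialRow lower = schema.newPartialRow();
      lower.addLong("ts", 0L);
      PartialRow upper = schema.newPartialRow();
      upper.addLong("ts", 1_000_000L);
      options.addRangePartition(lower, upper);

      // The 8 tablets of this range cover the same key span in different hash
      // buckets; a placement policy that ignores key ranges may put many of
      // their replicas on one tablet server, creating a hot spot.
      client.createTable("hotspot_prone_table", schema, options);
    }
  }
}
```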