Apache Kudu Fixed Issues
This topic lists the issues fixed in all generally available (GA) versions of Apache Kudu.
Continue reading:
- Issues Fixed in Kudu for CDH 5.16.2
- Issues Fixed in Kudu 1.7.0 / CDH 5.16.1
- Issues Fixed in Kudu 1.7.0 / CDH 5.15.2
- Issues Fixed in Kudu 1.7.0 / CDH 5.15.1
- Issues Fixed in Kudu 1.7.0 / CDH 5.15.0
- Issues Fixed in Kudu for CDH 5.14.4
- Issues Fixed in Kudu 1.6.0 / CDH 5.14.2
- Issues Fixed in Kudu 1.6.0 / CDH 5.14.0
- Issues Fixed in Kudu 1.6.0 / CDH 5.13.3
- Issues Fixed in Kudu 1.6.0 / CDH 5.13.2
- Issues Fixed in Kudu 1.5.0 / CDH 5.13.1
- Issues Fixed in Kudu 1.5.0 / CDH 5.13.0
- Issues Fixed in Kudu 1.4.0 / CDH 5.12.2
- Issues Fixed in Kudu 1.4.0 / CDH 5.12.1
- Issues Fixed in Kudu 1.4.0 / CDH 5.12.0
- Issues Fixed in Kudu 1.3.0 / CDH 5.11.2
- Issues Fixed in Kudu 1.3.0 / CDH 5.11.1
- Issues Fixed in Kudu 1.3.0 / CDH 5.11.0
- Issues Fixed in Kudu 1.2.0 / CDH 5.10.2
- Issues Fixed in Kudu 1.2.0 / CDH 5.10.1
- Issues Fixed in Kudu 1.2.0 / CDH 5.10.0
Issues Fixed in Kudu for CDH 5.16.2
- KUDU-1678 - Fixed a crash caused by a race condition between altering tablet schemas and deleting tablet replicas.
- KUDU-2195 - You can now use the --cmeta_force_fsync flag to fsync Kudu’s consensus metadata more aggressively. Setting this flag to true may decrease Kudu’s performance, but improves its durability in the face of power failures and forced shutdowns. The durability issue that this flag addresses was much more likely to occur when Kudu was running on XFS.
- KUDU-2463 - Fixed an issue in which incorrect results would be occasionally returned in scans following a server restart.
- CDH-76920 - Fixed an issue where the Kudu CLI crashed when running the 'kudu cluster rebalance' sub-command on some platforms.
Issues Fixed in Kudu 1.7.0 / CDH 5.16.1
- KUDU-2260 - Fixed a rare issue where system failure could leave unexpected null bytes at the end of metadata files, causing Kudu to be unable to restart.
- KUDU-2364 - Fixed an issue where, after a tablet server was wiped and recreated with the same RPC address, ksck listed it twice, both times as healthy, even though only one instance existed.
- KUDU-2412 - The kudu-python client can now compile in environments where __int128 is not supported. This most commonly affected el6 environments.
- KUDU-2509 - Fixed an issue that might result in a crash of a tablet server in case of a WAL replay error while bootstrapping a tablet.
- KUDU-2580 - Fixed authentication token reacquisition in the C++ client.
- Fixed an issue that caused the kudu CLI tool to unexpectedly exit when the connection to the master or tserver was abruptly closed.
Issues Fixed in Kudu 1.7.0 / CDH 5.15.2
- KUDU-2463 - Fixed an issue in which incorrect results would be returned in scans following a server restart.
- KUDU-2509 - Fixed an issue that might result in a crash of a tablet server in case of a WAL replay error while bootstrapping a tablet.
- KUDU-2580 - Fixed authentication token reacquisition in the C++ client.
Issues Fixed in Kudu 1.7.0 / CDH 5.15.1
- KUDU-2367 - Fixed an issue where a permanently failed tablet replica was not properly identified, which could cause the tablet not to re-replicate in very small clusters.
- KUDU-2377 - Fixed an issue that caused Kudu servers to fail to start when RLIMIT_NPROC=-1.
- KUDU-2378 - Fixed unaligned loads of int128 from rows.
- KUDU-2379 - Fixed an issue that caused secure Spark jobs to fail.
- KUDU-2416 - Fixed PartialRow.setMin.
- KUDU-2443 - Fixed replica movement and replacement for RF=1.
- KUDU-2447 - Fixed the tablet server crash with the error, "NONE predicate can not be pushed into key".
- KUDU-2478 - Restored Python 2.6 compatibility.
- Added the ability to adjust scan timeouts in Spark.
- Increased the timeout to begin tablet copies, which improves Kudu's re-replication time when the cluster is busy.
- Fixed a NullPointerException thrown when calling ColumnSchema#toString on non-decimal types.
- Greatly improved the performance of many types of queries on tables from which many rows have been deleted.
- Fixed an issue that caused partition pruning to be too conservative for queries from the Java client that use Decimal predicates.
Issues Fixed in Kudu 1.7.0 / CDH 5.15.0
- KUDU-1613 - Fixed a scenario where the on-disk data of a tablet server was completely erased and a new tablet server was started on the same host. This issue could prevent tablet replicas previously hosted on the server from being evicted and re-replicated. Tablets now immediately evict replicas that respond with a different server UUID than expected.
- KUDU-1927 - Fixed a rare race condition when connecting to masters during their startup which might cause a client to get a response without a CA certificate and/or authentication token. This would cause the client to fail to authenticate with other servers in the cluster. The leader master now always sends a CA certificate and an authentication token (when applicable) to a Kudu client with a successful ConnectToMaster response.
- KUDU-2262 - The Kudu Java client now will retry a connection if no master is discovered as a leader, and the user has a valid authentication token. This avoids failure in recoverable cases when masters are in the process of the very first leader election after starting up.
- KUDU-2264 - The Java client will now automatically attempt to re-acquire Kerberos credentials from the ticket cache when the prior credentials are about to expire. This allows client instances to persist longer than the expiration time of a single Kerberos ticket so long as some other process renews the credentials in the ticket cache. Documentation on interacting with Kerberos authentication has been added to the Javadoc for the AsyncKuduClient class.
- KUDU-2265 - Follower masters are now able to verify authentication tokens even if they have never been a leader. Prior to this fix, if a follower master had never been a leader, clients would be unable to authenticate to that master, resulting in spurious error messages being logged.
- KUDU-2295 - Fixed a tablet server crash when a tablet replica is deleted during a scan.
- KUDU-2312 - The evaluation order of predicates in scans with multiple predicates has been made deterministic. Due to a bug, this was not necessarily the case previously. Predicates are applied in most to least selective order, with ties broken by column index. The evaluation order may change in the future, particularly when better column statistics are made available internally.
- KUDU-2331 - Previously, the kudu tablet change_config move_replica tool required all tablet servers in the cluster to be available when performing a move. This restriction has been relaxed: only the tablet server that will receive a replica of the tablet being moved and the hosts of the tablet’s existing replicas need to be available for the move to occur.
- KUDU-2343 - Fixed a bug in the Java client which prevented the client from locating the new leader master after a leader failover in the case that the previous leader either remained online or restarted quickly. This bug resulted in the client timing out operations with errors indicating that there was no leader master.
- KUDU-2259 - The Unix process username of the client is now included inside the exported security credentials, so that the effective username of clients who import credentials and subsequently use unauthenticated (SASL PLAIN) connections matches the client who exported the security credentials. For example, this is useful to let the Spark executors know which username to use if the Spark driver has no authentication token. This change only affects clusters with encryption disabled using --rpc-encryption=disabled.
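The deterministic ordering described for KUDU-2312 above (most to least selective, ties broken by column index) can be pictured with a small sort. This is only an illustrative sketch, not Kudu's implementation: the `Predicate` class, its `selectivity` field (assumed here to be an estimated fraction of rows that pass, so lower means more selective), and the `evaluation_order` helper are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    """Hypothetical stand-in for a scan predicate; not a Kudu API type."""
    column_index: int    # position of the column in the table schema
    selectivity: float   # assumed estimate: fraction of rows that pass


def evaluation_order(predicates):
    """Sort from most to least selective, breaking ties by column index."""
    return sorted(predicates, key=lambda p: (p.selectivity, p.column_index))


preds = [
    Predicate(column_index=3, selectivity=0.50),
    Predicate(column_index=2, selectivity=0.10),
    Predicate(column_index=0, selectivity=0.50),
]

# The result does not depend on the order in which predicates were supplied.
print([p.column_index for p in evaluation_order(preds)])  # prints [2, 0, 3]
```

Because the sort key is a complete tuple, two predicates with equal selectivity always come out in the same relative order, which is the property the fix guarantees.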
Issues Fixed in Kudu for CDH 5.14.4
- KUDU-2331 - The `kudu tablet change_config move_replica` tool no longer fails for an unavailable tablet server if the tablet server is not the source or destination of the move.
- Improved the speed of scans in cases where a large number of rows have previously been deleted from the table.
Issues Fixed in Kudu 1.6.0 / CDH 5.14.2
- KUDU-1613 - Fixed an issue where a reformatted server could prevent re-replication of the replicas it previously hosted.
- KUDU-2238 - Fixed an issue where large updates were not flushed to disk under memory pressure.
- KUDU-2251 - Fixed a crash due to an overflow bug that could occur when a tablet was updated very frequently.
- KUDU-2274 - Fixed an issue where copying over a tombstoned replica could produce a replica in an inconsistent state.
- KUDU-2343 - Fixed an issue where the Java client would fail to connect to the leader master after a leadership change.
Issues Fixed in Kudu 1.6.0 / CDH 5.14.0
- KUDU-1078 - Fixed an error message commonly found in tablet server logs indicating that operations were being read "from the future".
- KUDU-1411 - HybridTime timestamp propagation now works in the Java client when using scan tokens.
- KUDU-2044 - Tombstoned tablets no longer report metrics.
- KUDU-2173 - Fixed a bug in the C++ client which could cause tablets to be erroneously pruned, or skipped, during certain scans, resulting in fewer results than expected being returned from queries. The bug only affected tables whose range partition columns are a proper prefix of the primary key.
- KUDU-2188 - Published Kudu Java artifacts are now fully compatible with JRE 7 and JRE 8. There was previously a bug in the release process which made them compatible only with JRE 8.
- Fixed a typo in the list of default TLS ciphers used by Kudu servers. As a result, two additional cipher suites are now available:
- ECDHE-RSA-AES128-SHA256 TLSv1.2 Kx=ECDH Au=RSA Enc=AES(128) Mac=SHA256
- AES256-GCM-SHA384 TLSv1.2 Kx=RSA Au=RSA Enc=AESGCM(256) Mac=AEAD
- KUDU-2231 - Sparse column predicates no longer cause excessive data-block reads.
Issues Fixed in Kudu 1.6.0 / CDH 5.13.3
- KUDU-2238 - Fixed a bug where large updates were not flushed to disk, even when the server was under memory pressure.
- KUDU-2274 - Fixed a very rare bug where a tombstone being copied over could produce a tablet replica in an inconsistent state.
- KUDU-2343 - Fixed an issue where the Java client did not properly reconnect to the leader master when the old leader was still online.
Issues Fixed in Kudu 1.6.0 / CDH 5.13.2
Issues Fixed in Kudu 1.5.0 / CDH 5.13.1
- KUDU-1788 - Increase Raft RPC timeout to 30sec to avoid fruitless retries.
- KUDU-2130 - (part 2): more fixes for ITClientStress
- KUDU-2130 - java client: handle termination during negotiation edge case
- KUDU-2167 - fix C++ client crash due to bad assumption regarding scan data
- KUDU-2170 - Masters can start with duplicates specified
- KUDU-2173 - Partitions are incorrectly pruned when range-partitioned on a PK prefix
- KUDU-2188 - restore Java 7 compatibility to artifacts built with JDK8
- KUDU-2209 - HybridClock doesn't handle changes in STA_NANO flag
Issues Fixed in Kudu 1.5.0 / CDH 5.13.0
- The Java Kudu client now automatically requests new authentication tokens after expiration. As a result, long-lived Java clients are now supported. See KUDU-2013 for more details.
- Multiple Kerberos compatibility bugs have been fixed, including support for environments with disabled reverse DNS, FreeIPA compatibility, principal names including uppercase characters, and hosts without a FQDN.
- A bug in the binary prefix decoder which could cause a tablet server 'check' assertion crash has been fixed. The crash could only be triggered in very specific scenarios; see KUDU-2085 for additional details.
This is a complete list of upstream issues fixed in Kudu 1.5.0 / CDH 5.13.0. For the full list of fixed issues for all CDH components in CDH 5.13, see Upstream Issues Fixed in CDH 5.13.
- KUDU-871 - Allow tombstoned tablets to vote
- KUDU-1125 - (part 1) catalog_manager: try to avoid unnecessarily rewriting tablet info
- KUDU-1407 - reassign failed tablets
- KUDU-1442 - log number of open log block containers
- KUDU-1544 - Race in Java client's AsyncKuduSession.apply()
- KUDU-1726 - Avoid fsync-per-block in tablet copy
- KUDU-1755 - Part 1: Improve tablet on disk size metric
- KUDU-1811 - C++ client: use larger batches when fetching scan tokens
- KUDU-1863 - improve overall safety of graceful server shutdown
- KUDU-1865 - Avoid heap allocation for payload slices
- KUDU-1894 - fixed deadlock in client.Connection
- KUDU-1911 - improve missing required arg message
- KUDU-1929 - [rpc] Allow using encrypted private keys for TLS
- KUDU-1942 - Kerberos fails to log in on hostnames with capital letters
- KUDU-1943 - Add BlockTransaction to Block Manager
- KUDU-1952 - Remove round-robin for block placement
- KUDU-1955 - refuse to use world-readable keytabs
- KUDU-2004 - Undefined behavior in TlsSocket::Writev()
- KUDU-2013 - Support for long lived auth tokens in Java client
- KUDU-2032 - Kerberos authentication fails with rdns disabled in krb5.conf
- KUDU-2039 - Fix the table count in the /tables page of master webUI
- KUDU-2041 - Fix negotiation deadlock
- KUDU-2049 - Fix too-strict CHECK in RleIntBlockDecoder::SeekToPositionInBlock
- KUDU-2053 - Fix race in Java RequestTracker
- KUDU-2058 - Fix LocatedTablet string comparisons
- KUDU-2060 - Show primary keys in the master's table web UI page
- KUDU-2066 - Add experimental Gradle build support
- KUDU-2067 - Enable cfile checksumming by default
- KUDU-2072 - upgrade to cmake 3.9.0 breaks sles12sp0 cmake patch
- KUDU-2078 - Sink failure if batch size > session's flush buffer size
- KUDU-2083 - Decrement running maintenance ops on failed prepare
- KUDU-2085 - Fix crash when seeking past end of prefix-encoded blocks
- KUDU-2087 - Fix failure to map Kerberos principal to username with FreeIPA
- KUDU-2088 - Synchronizer may not go out of scope with outstanding references
- KUDU-2091 - Certificates with intermediate CA's do not work with Kudu
- KUDU-2101 - Include a table summary at the bottom
- KUDU-2102 - fix PosixRWFile::Sync to guarantee durability when used concurrently
- KUDU-2103 - [java] Canonicalize hostnames in client
- KUDU-2104 - Upgrade to Spark 2.2.0
- KUDU-2114 - Don't re-delete tombstoned replicas
- KUDU-2118 - Fully shut down TabletReplica on delete
- KUDU-2123 - Auto-vivify cmeta on tombstoned replicas if doesn't exist at startup
- KUDU-2131 - switch to LIFO log container retrieval
- KUDU-2138 - delete failed replicas in tablet report
- KUDU-2141 - master: Remove DCHECK when tablet report has no opid_index
Issues Fixed in Kudu 1.4.0 / CDH 5.12.2
Apache Kudu 1.4.0 / CDH 5.12.2 is a bug-fix release which includes the following fixes:
- KUDU-2209 - Fixed Kudu's handling of the STA_NANO flag, which caused spurious crashes due to one Kudu node thinking another node had a timestamp from the future.
- KUDU-1788 - Increased the Raft RPC timeout to 30 seconds to avoid unsuccessful retries.
- KUDU-2170 - Disallowed specifying the same address multiple times in Kudu's master list.
- KUDU-2083 - Fixed an issue where, in rare circumstances, Kudu reduced the number of maintenance ops it could run concurrently. The issue could cause the Kudu process to run out of memory and crash.
- KUDU-1942 - Now Kerberos can successfully log in on hostnames with capital letters.
- KUDU-2173 - Fixed an issue where scans with a predicate on a prefix of a range partition key could fail to return matching values.
- KUDU-2167 - Fixed an issue where the C++ client would crash when the server filtered out all data in a scan RPC.
- KUDU-2032 - Kerberos authentication no longer fails when rdns = false is configured in krb5.conf.
- Fixed a segmentation fault that could occur when running the ksck command against a master soon after the master was started.
Issues Fixed in Kudu 1.4.0 / CDH 5.12.1
Apache Kudu 1.4.0 / CDH 5.12.1 is a bug-fix release which includes the following fixes:
- KUDU-2085 - Fixed a bug that caused crashes when seeking past the end of prefix-encoded blocks.
- KUDU-2087 - Fixed an issue where Kudu would fail to start when Kerberos was enabled in FreeIPA-configured deployments.
- KUDU-2053 - Fixed a race condition in the Java RequestTracker.
- KUDU-2049 - Fixed an issue where scans on RLE-encoded integer columns would sometimes cause CHECK failures due to the CHECK condition being too strict.
- KUDU-2052 - Kudu now uses XFS_IOC_UNRESVSP64 ioctl to punch holes on xfs filesystems. This fixes an issue with slow startup times on xfs when hole punching was done via fallocate().
- KUDU-1956 - Kudu will no longer crash if faced with a race condition when selecting rowsets for compaction.
Issues Fixed in Kudu 1.4.0 / CDH 5.12.0
- KUDU-2020 - Fixed an issue where re-replication after a failure would proceed significantly slower than expected. This bug caused many tablets to be unnecessarily copied multiple times before successfully being considered re-replicated, resulting in significantly more network and IO bandwidth usage than expected. Mean time to recovery on clusters with large amounts of data is improved by up to 10x by this fix.
- KUDU-1982 - Fixed an issue where the Java client would call NetworkInterface.getByInetAddress very often, causing performance problems particularly on Windows where this function can be quite slow.
- KUDU-1755 - Improved the accuracy of the on_disk_size replica metrics to include the size consumed by bloom filters, primary key indexes, superblock metadata, and delta files. Note that because the size metric is now more accurate, the reported values are expected to increase after upgrading to Kudu 1.4.0. This does not indicate that replicas are using more space after the upgrade; rather, it is now accurately reporting the amount of space that has always been used.
- KUDU-1192 - Kudu servers will now periodically flush their log messages to disk even if no WARNING-level messages have been logged. This makes it easier to tail the logs to see progress output during normal startup.
Issues Fixed in Kudu 1.3.0 / CDH 5.11.2
Apache Kudu 1.3.0 / CDH 5.11.2 is a bug-fix release which includes the following fixes:
- KUDU-2053 - Fixed a race condition in the Java RequestTracker.
- KUDU-2049 - Fixed an issue where scans on RLE-encoded integer columns would sometimes cause CHECK failures due to the CHECK condition being too strict.
- KUDU-1963 - Fixed an issue where the Java client misdiagnosed an error and logged a NullPointerException when a connection was closed by the client while a negotiation was in progress.
- KUDU-1853 - Fixed an issue where data blocks could be orphaned after a failed tablet copy.
Issues Fixed in Kudu 1.3.0 / CDH 5.11.1
Apache Kudu 1.3.0 / CDH 5.11.1 is a bug-fix release which includes the following fixes:
- KUDU-1999 - Fixed an issue where the Kudu Spark connector would fail to kinit with the principal and keytab provided to a job.
- KUDU-1993 - Fixed a validation issue with grouped gflags.
- KUDU-1981 - Fixed an issue where Kudu server components would fail to start on machines with fully-qualified domain names longer than 64 characters when security was enabled. This was due to hard-coded restrictions in the OpenSSL library.
- KUDU-1607 - Fixed a case in which a tablet replica on a tablet server could retain blocks of data which prevented it from being fully deleted.
- KUDU-1933 - Fixed an issue in which a tablet server would crash and fail to restart after a single tablet received more than two billion write operations.
- KUDU-1964 - Fixed a performance degradation issue caused by OpenSSL locks under high concurrency.
Issues Fixed in Kudu 1.3.0 / CDH 5.11.0
- KUDU-1968 - Fixed an issue in which the tablet server would delete an incorrect set of data blocks after an aborted attempt to copy a tablet from another server. This would produce data loss in unrelated tablets.
- KUDU-1962 - Fixed a NullPointerException in the Java client in the case that the Kudu master is overloaded at the time the client requests location information. This could cause client applications to hang indefinitely regardless of configured timeouts.
- KUDU-1893 - Fixed a critical bug in which wrong results would be returned when evaluating predicates applied to columns added using the ALTER TABLE operation.
- KUDU-1905 - Fixed an issue where Kudu would crash after reinserts that resulted in an empty change list. This occurred in cases where the primary key was composed of all columns.
- KUDU-1899 - Fixed a crash that occurred after inserting a row with an empty string as the single-column primary key.
- KUDU-1904 - Fixed a potential crash when performing random reads against a column using RLE encoding and containing long runs of NULL values.
- KUDU-1856 - Fixed an issue in which disk space could be leaked by Kudu servers storing data on partitions using the XFS file system. Any leaked disk space will now be automatically recovered upon upgrade.
- KUDU-1888, KUDU-1906 - Fixed multiple issues in the Java client where operation callbacks would never be triggered, causing the client to hang.
Issues Fixed in Kudu 1.2.0 / CDH 5.10.2
Apache Kudu 1.2.x / CDH 5.10.2 includes the following fixed issues.
- KUDU-1933 - Fixed an issue that truncated the 64-bit log index in the OpId to 32 bits, causing overflow of the log index.
- KUDU-1607 - Fixed an issue where Kudu could not delete failed tablets using the DROP TABLE command.
- KUDU-1905 - Reinserts are now allowed on tables in which all columns are part of the primary key.
- KUDU-1893 - Fixed incorrect NULL results and ensured that predicates are evaluated for columns added after table creation.
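To see why truncating a 64-bit log index to 32 bits (KUDU-1933 above) leads to overflow, the following sketch mimics the truncation in Python. It assumes the truncated value is interpreted as a signed 32-bit integer, which is consistent with the "more than two billion write operations" threshold mentioned for the same issue in the CDH 5.11.1 notes, but the notes do not state this explicitly. The helper `truncate_to_int32` is a hypothetical name, not a Kudu function.

```python
def truncate_to_int32(value):
    """Keep the low 32 bits of value and reinterpret them as a signed 32-bit integer."""
    low = value & 0xFFFFFFFF
    # Values with the top bit set represent negative numbers in two's complement.
    return low - (1 << 32) if low >= (1 << 31) else low


last_safe_index = (1 << 31) - 1  # 2,147,483,647: largest index that survives truncation

print(truncate_to_int32(last_safe_index))      # prints 2147483647
print(truncate_to_int32(last_safe_index + 1))  # prints -2147483648
```

Under that assumption, the first index past the signed 32-bit maximum wraps around to a large negative number, so a component reading the truncated value would see the log index go backwards.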
Issues Fixed in Kudu 1.2.0 / CDH 5.10.1
Apache Kudu 1.2.x / CDH 5.10.1 includes the following fixed issues.
- KUDU-1904 - Fixed a bug where RLE columns with only NULL values would crash on scan.
- KUDU-1899 - Fixed an issue where tablet servers would crash after inserting an empty string primary key ("").
- KUDU-1851 - Fixed an issue with the Python client which would crash whenever a TableAlterer is instantiated directly.
- KUDU-1852 - KuduTableAlterer will no longer crash when given nullptr range bound arguments.
- KUDU-1821 - Improved warnings when the catalog manager starts.
Issues Fixed in Kudu 1.2.0 / CDH 5.10.0
See Issues resolved for Kudu 1.2.0 and Git changes between 1.1.x and 1.2.x.
- KUDU-1508 - Fixed a long-standing issue in which running Kudu on ext4 file systems could cause file system corruption. While this issue has been known to still manifest in certain rare cases, the corruption is harmless and can be repaired as part of a regular fsck. Switching from ext4 to xfs will also solve the problem.
- KUDU-1399 - Implemented an LRU cache for open files, which prevents running out of file descriptors on long-lived Kudu clusters. By default, Kudu will limit its file descriptor usage to half of its configured ulimit.
- Gerrit #5192 - Fixed an issue which caused data corruption and crashes in the case that a table had a non-composite (single-column) primary key, and that column was specified to use DICT_ENCODING or BITSHUFFLE encodings. If a table with an affected schema was written in previous versions of Kudu, the corruption will not be automatically repaired; users are encouraged to re-insert such tables after upgrading to Kudu 1.2 or later.
- Gerrit #5541 - Fixed a bug in the Spark KuduRDD implementation which could cause rows in the result set to be silently skipped in some cases.
- KUDU-1551 - Fixed an issue in which the tablet server would crash on restart in the case that it had previously crashed during the process of allocating a new WAL segment.
- KUDU-1764 - Fixed an issue where Kudu servers would leak approximately 16-32MB of disk space for every 10GB of data written to disk. After upgrading to Kudu 1.2 or later, any disk space leaked in previous versions will be automatically recovered on startup.
- KUDU-1750 - Fixed an issue where the API to drop a range partition would drop any partition with a matching lower _or_ upper bound, rather than any partition with matching lower _and_ upper bound.
- KUDU-1766 - Fixed an issue in the Java client where equality predicates which compared an integer column to its maximum possible value (e.g. Integer.MAX_VALUE) would return incorrect results.
- KUDU-1780 - Fixed the kudu-client Java artifact to properly shade classes in the com.google.thirdparty namespace. The lack of proper shading in prior releases could cause conflicts with certain versions of Google Guava.
- Gerrit #5327 - Fixed shading issues in the kudu-flume-sink Java artifact. The sink now expects that Hadoop dependencies are provided by Flume, and properly shades the Kudu client's dependencies.
- Fixed a few issues using the Python client library from Python 3.
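The difference between the old and new matching rules in KUDU-1750 above can be shown with a toy model. This is a sketch under stated assumptions, not the Kudu API: range partitions are modeled as plain `(lower, upper)` tuples, and `matches_buggy`, `matches_fixed`, and `drop_range` are hypothetical helpers. It also does not model how Kudu reports a drop request that matches no partition.

```python
def matches_buggy(partition, lower, upper):
    """Behavior before the fix, as described: a match on either bound is enough."""
    return partition[0] == lower or partition[1] == upper


def matches_fixed(partition, lower, upper):
    """Behavior after the fix: both bounds must match."""
    return partition[0] == lower and partition[1] == upper


def drop_range(partitions, lower, upper, matches):
    """Return the partitions that remain after dropping every matching one."""
    return [p for p in partitions if not matches(p, lower, upper)]


partitions = [(0, 100), (100, 200), (200, 300)]

# Request to drop the (non-existent) range [100, 300).
print(drop_range(partitions, 100, 300, matches_buggy))  # prints [(0, 100)]
print(drop_range(partitions, 100, 300, matches_fixed))  # prints [(0, 100), (100, 200), (200, 300)]
```

In this model the old rule removes two unrelated partitions because each shares one bound with the request, while the corrected rule leaves the table untouched.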