What's New in Apache Kudu
This topic lists new features for Apache Kudu in this release of Cloudera Runtime.
Multiple tablet ids in 'local_replica delete'
The 'local_replica delete' tool now allows multiple tablet identifiers to be
specified and processed at once. This helps reduce overall latency, because
opening the tablet server’s metadata takes significant time.
Adding --ignore_nonexistent for 'local_replica delete'
The --ignore_nonexistent flag was added for the 'local_replica
delete' tool. This makes real-world scripting scenarios easier when
cleaning up particular tablet replicas from tablet servers.
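A cleanup script might assemble the command as in the sketch below. It assumes the tool is invoked as 'kudu local_replica delete' and that multiple tablet identifiers are passed as a single comma-separated argument. The helper name build_delete_command and the tablet identifiers are hypothetical, and any file-system flags a particular deployment needs are left out.

```python
from typing import List, Sequence


def build_delete_command(tablet_ids: Sequence[str],
                         ignore_nonexistent: bool = True) -> List[str]:
    """Build an argv list for one 'local_replica delete' invocation.

    Assumption: multiple tablet identifiers are passed as a single
    comma-separated argument. This helper is hypothetical and only
    builds the command; it does not run it.
    """
    if not tablet_ids:
        raise ValueError("at least one tablet identifier is required")
    argv = ["kudu", "local_replica", "delete", ",".join(tablet_ids)]
    if ignore_nonexistent:
        # Lets the script continue when a replica is already gone.
        argv.append("--ignore_nonexistent")
    return argv


if __name__ == "__main__":
    # Placeholder tablet identifiers, for illustration only.
    print(" ".join(build_delete_command(["tablet-a", "tablet-b"])))
```

Batching the identifiers into one invocation is what lets the tool open the tablet server's metadata once instead of once per tablet.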
KuduContext tracks operations per table
KuduContext can now track operation counts per table. A new MapAccumulator tracks these metrics in a single accumulator per operation type.
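The idea of one map-valued accumulator per operation type can be sketched as follows. This is a plain Python stand-in, not the kudu-spark class: the method names add, merge, and value and the table names are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict


class MapAccumulator:
    """Hypothetical stand-in: a map of table name -> operation count."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def add(self, table: str, count: int = 1) -> None:
        # Record `count` operations against one table.
        self._counts[table] += count

    def merge(self, other: "MapAccumulator") -> None:
        # Combine partial results by summing the counts key by key.
        self._counts.update(other._counts)

    def value(self) -> Dict[str, int]:
        return dict(self._counts)


if __name__ == "__main__":
    # One accumulator per operation type, each holding per-table counts.
    inserts = MapAccumulator()
    inserts.add("metrics", 3)
    inserts.add("events", 1)

    partial = MapAccumulator()  # e.g. counts gathered by another task
    partial.add("metrics", 2)
    inserts.merge(partial)

    print(inserts.value())
```

Keeping the per-table counts inside a single accumulator avoids registering a separate accumulator for every table and operation type.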
Support columnar row format in Java client
The setRowDataFormat() method is added to
AsyncKuduScanner. The Java client now supports the columnar RPC format.
The format can be set through the
setRowDataFormat() method on the
Check range predicate first while evaluating Bloom filter predicate
Range predicates can be specified along with Bloom filter predicates for the same column. Checking the range predicate first and exiting early when the column value is out of bounds is cheaper than computing the hash and then looking up the value in the Bloom filter.
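The ordering can be illustrated with a toy Bloom filter. This is a minimal sketch and not Kudu's implementation: the BloomFilter class, the matches function, the SHA-256 based hashing, and the inclusive-lower, exclusive-upper bounds are all assumptions chosen for the example.

```python
import hashlib
from typing import Iterator, Optional


class BloomFilter:
    """Tiny Bloom filter for illustration; not Kudu's implementation."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity
        self.lookups = 0  # number of membership lookups performed

    def _positions(self, value: int) -> Iterator[int]:
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: int) -> None:
        for pos in self._positions(value):
            self.bits[pos] = 1

    def may_contain(self, value: int) -> bool:
        self.lookups += 1
        return all(self.bits[pos] for pos in self._positions(value))


def matches(value: int, lower: Optional[int], upper: Optional[int],
            bloom: BloomFilter) -> bool:
    """Evaluate a range predicate and a Bloom filter predicate together.

    Assumed bounds: lower is inclusive, upper is exclusive, and None
    means unbounded. The cheap comparisons run first, so out-of-range
    values never reach the hashing step.
    """
    if lower is not None and value < lower:
        return False
    if upper is not None and value >= upper:
        return False
    return bloom.may_contain(value)


if __name__ == "__main__":
    bloom = BloomFilter()
    bloom.add(5)
    bloom.add(50)
    print(matches(5, 0, 10, bloom), matches(50, 0, 10, bloom), bloom.lookups)
```

In the usage at the bottom, the value 50 is in the filter but outside the range, so it is rejected by the comparison alone and only one Bloom filter lookup is performed.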
Arenas for RPC request and response
The RPC server side now allocates a protobuf Arena for each request. The RPC request and
response are allocated from the Arena, ensuring that any sub-messages, strings, repeated
fields, and so on, use that Arena for allocation as well. Everything is deleted en masse
when the InboundCall object (which owns the Arena) is destroyed.
New metadata to avoid master when using scan tokens
New metadata is added to the scan token so that it contains all of the metadata
required to construct a KuduTable and open a scanner in the clients. This means the
GetTableLocations RPC calls to the
master are no longer required when using the scan token.
TabletMetadataPB and authorization
token fields were added as optional fields on the token. Additionally, a
projected_column_idx field was added that can be used in place of
projected_columns. This significantly reduces the size of the scan
token by not duplicating the
ColumnSchemaPB that is already in the
Adding the table metadata to the scan token is enabled by default. However, it can be
disabled in rare cases where more resiliency to column renaming is desired. It can be
disabled in the kudu-spark integration using the
RaftConsensus::DumpStatusHtml() does not block Raft consensus activity
kudu::consensus::RaftConsensus::CheckLeadershipAndBindTerm() needs to take
the lock to check the term and the Raft role. When many RPCs come in for the same tablet,
the contention can hog service threads and cause queue overflows on busy systems. With this
change, RaftConsensus::DumpStatusHtml() no longer blocks Raft
consensus activity and is not blocked by it either.