What's New in Apache Kudu

Learn about the new features of Kudu in Cloudera Runtime 7.2.15.

New tool to remove a dead tablet server
The new kudu tserver unregister tool removes a dead tablet server from the cluster without restarting the masters. For more information, see KUDU-2915.
New column adding tool
The new kudu table add_column tool adds columns to existing tables. For more information, see KUDU-3339.
Tracking startup progress
You can now track the startup progress of a Kudu server on the /startup page of the web UI. New metrics also track the overall server startup progress, the processing of the log block containers, and the starting of the tablets. For more information, see KUDU-1959.

Improvements

  • KUDU-3240: The client-side connection negotiation timeout is now configurable in the Java client.
  • KUDU-3328: The rebalancer tool no longer moves replicas to tablet servers that are in maintenance mode.
  • KUDU-3340: It is now possible to disable compaction on a particular table.
  • KUDU-3342: The kudu remote_replica list CLI tool now displays the data state and last status for a tablet replica.
  • KUDU-3344: The Kudu master now cleans up metadata for deleted tables.
  • The table entity is now accessible from KuduWriteOperation in the C++ client, which makes errors easier to understand on the client side. For details, see KUDU-2623.
  • The Tables page generated by the Kudu embedded web server now supports pagination and search.
  • The LZ4 library, which Kudu uses to compress various data on disk, has been upgraded to version 1.9.3 to benefit from improved performance. For more information, see https://github.com/lz4/lz4/releases/tag/v1.9.3
  • Fsync is now called on each modification of metadata files when Kudu data directories are backed by XFS. The newly introduced --cmeta_fsync_override_on_xfs flag can be used to control this behavior.
  • The log4j package used by the ranger-client plugin has been upgraded to version 2.17.1.
  • Run intra-location rebalancing in parallel: Intra-location rebalancing now runs concurrently at different locations for location-aware Kudu clusters. Because a location-aware Kudu cluster automatically consists of non-intersecting groups of tablet servers, replicas within each location can be moved independently. Compared with running sequentially, running intra-location rebalancing concurrently at every location can shorten the runtime of the rebalancer tool by a factor of up to N, where N is the number of locations defined in the Kudu cluster.
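
The "up to N times" bound for parallel intra-location rebalancing can be illustrated with a small back-of-the-envelope model. This is only a sketch under simplifying assumptions, not how the rebalancer measures or schedules its work: the function name and the per-location runtimes are hypothetical, the sequential runtime is assumed to be the sum of the per-location runtimes, and the concurrent runtime is assumed to be that of the slowest location.

```python
def rebalance_runtimes(per_location_seconds):
    """Return (sequential, concurrent, speedup) for hypothetical
    per-location rebalancing runtimes, given in seconds.

    Assumptions (not taken from Kudu itself):
      * sequential runtime is the sum over all locations
      * concurrent runtime is bounded by the slowest location
    """
    if not per_location_seconds:
        raise ValueError("need at least one location")
    sequential = sum(per_location_seconds)
    concurrent = max(per_location_seconds)
    # Guard against a cluster where no location needs any work.
    speedup = sequential / concurrent if concurrent > 0 else 1.0
    return sequential, concurrent, speedup


# Three evenly loaded locations: the speedup reaches N = 3.
balanced = rebalance_runtimes([120.0, 120.0, 120.0])

# One location dominates: the speedup stays well below N = 3.
skewed = rebalance_runtimes([300.0, 30.0, 30.0])

print(balanced)
print(skewed)
```

Under this model the speedup equals N only when every location needs the same amount of rebalancing work. When one location dominates, the gain is correspondingly smaller, which is why the improvement is stated as "up to" N.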