Schema design limitations

Kudu currently has some known limitations that may factor into schema design.

Primary key

  • The primary key cannot be changed after the table is created. You must drop and recreate a table to select a new primary key.

  • The columns which make up the primary key must be listed first in the schema.

  • The primary key of a row cannot be modified using the UPDATE functionality. To modify a row’s primary key, the row must be deleted and re-inserted with the modified key. Such a modification is non-atomic.

  • Columns with DOUBLE, FLOAT, or BOOL types are not allowed as part of a primary key definition. Additionally, all columns that are part of a primary key definition must be NOT NULL.

  • Auto-generated primary keys are not supported.

  • Cells making up a composite primary key are limited to a total of 16KB after internal composite-key encoding is done by Kudu.

Cells

No individual cell may be larger than 64KB before encoding or compression. The cells making up a composite key are limited to a total of 16KB after the internal composite-key encoding done by Kudu. Inserting rows not conforming to these limitations will result in errors being returned to the client.

Columns

  • By default, Kudu tables can have a maximum of 300 columns. We recommend schema designs that use fewer columns for best performance.

  • CHAR, DATE, and complex types such as ARRAY, MAP, and STRUCT are not supported.

  • Type, nullability, and type attributes (i.e. precision and scale of DECIMAL, length of VARCHAR) of the existing columns cannot be changed by altering the table.

  • Dropping a column does not immediately reclaim space. Compaction must run first.

  • Kudu does not allow the type of a column to be altered after the table is created.

Tables

  • Tables must have an odd number of replicas, with a maximum of 7.

  • Replication factor (set at table creation time) cannot be changed.

  • There is no way to run compaction manually, but dropping a table will reclaim the space immediately.

  • Kudu does not allow you to change how a table is partitioned after creation, with the exception of adding or dropping range partitions.

  • Partitions cannot be split or merged after table creation.

Other usage limitations

  • Secondary indexes are not supported.

  • Multi-row transactions are not supported.

  • Relational features, such as foreign keys, are not supported.

  • Identifiers such as column and table names are restricted to be valid UTF-8 strings. Additionally, a maximum length of 256 characters is enforced.

    If you are using Apache Impala to query Kudu tables, refer to the section on Impala integration limitations as well.

  • Deleted row disk space cannot be reclaimed. The disk space occupied by a deleted row is only reclaimable via compaction, and only when the deletion's age exceeds the "tablet history maximum age" which is controlled by the --tablet_history_max_age_sec flag. Currently, Kudu only schedules compactions in order to improve read/write performance. A tablet will never be compacted purely to reclaim disk space. As such, range partitioning should be used when it is expected that large swaths of rows will be discarded. With range partitioning, individual partitions may be dropped to discard data and reclaim disk space. See KUDU-1625 for more details.