Manage one-dimensional arrays in Kudu
Kudu supports one-dimensional arrays of scalar types, such as integers, strings, booleans, and binaries, directly within columns. This feature enables more expressive schema designs and improves query performance by removing the need to denormalize data or manage child tables with complex joins.
Key benefits
- Previously, array data, such as device telemetry, required splitting into numerous columns by index to bypass Kudu’s default 300-column limit. You can now store arrays natively in a single column, which simplifies the schema and avoids complex denormalization.
- You do not need to reassemble split data during read operations because applications can use native array types directly.
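As a rough illustration of the difference, the following Python sketch models a row as a plain dictionary. It does not use the Kudu client API; the column names (`reading_0`, `readings`) and the helper functions are hypothetical and exist only to contrast the two layouts.

```python
from typing import Dict, List


def split_by_index(device_id: str, readings: List[int]) -> Dict[str, object]:
    """Older layout: one column per array index (reading_0, reading_1, ...)."""
    row: Dict[str, object] = {"device_id": device_id}
    for i, value in enumerate(readings):
        row[f"reading_{i}"] = value
    return row


def reassemble(row: Dict[str, object]) -> List[int]:
    """Read side of the older layout: rebuild the array from indexed columns."""
    indexed = [
        (int(name.split("_")[1]), value)
        for name, value in row.items()
        if name.startswith("reading_")
    ]
    # Sort numerically by index so reading_10 follows reading_9.
    return [value for _, value in sorted(indexed)]


def native_array(device_id: str, readings: List[int]) -> Dict[str, object]:
    """Array layout: the whole array lives in a single column."""
    return {"device_id": device_id, "readings": list(readings)}
```

With the array layout, the row always has two columns regardless of array length, and the application reads `row["readings"]` directly instead of calling a reassembly step.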
System limits
- Hard limit: Each array cell is restricted to a maximum of 65,535 elements due to storage encoding constraints.
- Configurable limits: Administrators can restrict the maximum number of array elements in a column cell by using the --array_cell_max_elem_num flag. This flag is set to 1024 by default. Setting this limit mitigates performance issues and enforces storage policies. If a client application submits an operation containing an array cell that exceeds this limit, the Kudu tablet server returns an error to the client.
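An application may want to check array sizes before submitting an operation, so that oversized cells are caught early rather than rejected by the tablet server. The following Python sketch shows one way to do that under the limits described above. It is not part of the Kudu client API; the function and constant names are hypothetical, and it assumes the effective limit is the smaller of the configured value and the hard limit.

```python
from typing import Sequence

HARD_MAX_ELEMENTS = 65_535     # hard limit per array cell, as stated above
DEFAULT_MAX_ELEMENTS = 1024    # default value of --array_cell_max_elem_num


def effective_limit(configured_max: int = DEFAULT_MAX_ELEMENTS) -> int:
    """Assumed effective limit: the configured value, capped by the hard limit."""
    return min(configured_max, HARD_MAX_ELEMENTS)


def check_array_cell(
    values: Sequence[object],
    configured_max: int = DEFAULT_MAX_ELEMENTS,
) -> None:
    """Raise ValueError if the array would exceed the effective limit."""
    limit = effective_limit(configured_max)
    if len(values) > limit:
        raise ValueError(
            f"array cell has {len(values)} elements; limit is {limit}"
        )
```

The `configured_max` argument should mirror whatever value the administrator has set on the tablet servers. The server-side check remains authoritative, so this client-side check only serves to fail fast.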
