Impala integration limitations

This section lists the limitations that apply when integrating Kudu with Impala, followed by the Impala keywords that are not supported when creating Kudu tables.

  • When creating a Kudu table, the CREATE TABLE statement must include the primary key columns before other columns, in primary key order.

  • Impala cannot update values in primary key columns.

  • Impala cannot create Kudu tables with VARCHAR or nested-typed columns.

  • A Kudu table whose name contains uppercase or non-ASCII characters must be assigned an alternate name when it is used as an external table in Impala.

  • A Kudu table that has a column name containing uppercase or non-ASCII characters cannot be used as an external table in Impala. To work around this issue, rename the affected columns in Kudu.

  • Predicates that use != or LIKE are not pushed down to Kudu; the Impala scan node evaluates them instead. Queries that rely on these predicates may run slower than queries whose predicates are pushed down.

  • Updates, inserts, and deletes performed through Impala are non-transactional. If a query fails partway through, its partial effects are not rolled back.

  • The maximum parallelism of a single query is limited to the number of tablets in a table. For good analytic performance, aim for 10 or more tablets per host or use large tables.
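
As a rough illustration of several of the points above, the following Impala SQL sketch assumes a release that supports the STORED AS KUDU syntax. The names metrics, host, ts, value, my_kudu_table, and MyKuduTable are hypothetical and are used only for this example.

```sql
-- Hypothetical table: the primary key columns (host, ts) are listed first,
-- in the same order as in the PRIMARY KEY clause, ahead of any other column.
CREATE TABLE metrics (
  host  STRING,
  ts    BIGINT,
  value DOUBLE,
  PRIMARY KEY (host, ts)
)
PARTITION BY HASH (host) PARTITIONS 16  -- PARTITION BY, not PARTITIONED BY
STORED AS KUDU;

-- Hypothetical existing Kudu table named 'MyKuduTable': because the name
-- contains uppercase characters, the external table gets an alternate,
-- lowercase name in Impala.
CREATE EXTERNAL TABLE my_kudu_table
STORED AS KUDU
TBLPROPERTIES ('kudu.table_name' = 'MyKuduTable');

-- An equality predicate is the kind of predicate that can be pushed to Kudu.
SELECT COUNT(*) FROM metrics WHERE host = 'web01';

-- A LIKE predicate is evaluated by the Impala scan node after rows are read,
-- so this query may be slower on a large table.
SELECT COUNT(*) FROM metrics WHERE host LIKE 'web%';
```

The partition count of 16 is an arbitrary choice for the sketch. In practice it should reflect the guidance above on the number of tablets per host.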

Impala keywords not supported for creating Kudu tables

  • PARTITIONED
  • LOCATION
  • ROW FORMAT