Using Apache Impala with Apache KuduPDF version

Known issues and limitations

Here are the limitations when integrating Kudu with Impala, and list of Impala keywords that are not supported for creating Kudu tables.

  • When creating a Kudu table, the CREATE TABLE statement must include the primary key columns before other columns, in primary key order.

  • Impala cannot update values in primary key columns.

  • Impala cannot create Kudu tables with VARCHAR or nested-typed columns.

  • Kudu tables with a name containing upper case or non-ASCII characters must be assigned an alternate name when used as an external table in Impala.

  • Kudu tables with a column name containing upper case or non-ASCII characters cannot be used as an external table in Impala. Columns can be renamed in Kudu to work around this issue.

  • != and LIKE predicates are not pushed to Kudu, and instead will be evaluated by the Impala scan node. This may decrease performance relative to other types of predicates.

  • Updates, inserts, and deletes using Impala are non-transactional. If a query fails part of the way through, its partial effects will not be rolled back.

  • The maximum parallelism of a single query is limited to the number of tablets in a table. For good analytic performance, aim for 10 or more tablets per host or use large tables.

Impala keywords not supported for creating Kudu tables

  • PARTITIONED
  • LOCATION
  • ROWFORMAT