Impala integration limitations
Here are the limitations when integrating Kudu with Impala, and list of Impala keywords that are not supported for creating Kudu tables.
-
When creating a Kudu table, the
CREATE TABLE
statement must include the primary key columns before other columns, in primary key order. -
Impala cannot update values in primary key columns.
-
Impala cannot create Kudu tables with
VARCHAR
or nested-typed columns. -
Kudu tables with a name containing upper case or non-ASCII characters must be assigned an alternate name when used as an external table in Impala.
-
Kudu tables with a column name containing upper case or non-ASCII characters cannot be used as an external table in Impala. Columns can be renamed in Kudu to work around this issue.
-
!=
andLIKE
predicates are not pushed to Kudu, and instead will be evaluated by the Impala scan node. This may decrease performance relative to other types of predicates. -
Updates, inserts, and deletes using Impala are non-transactional. If a query fails part of the way through, its partial effects will not be rolled back.
-
The maximum parallelism of a single query is limited to the number of tablets in a table. For good analytic performance, aim for 10 or more tablets per host or use large tables.
Impala keywords not supported for creating Kudu tables
-
PARTITIONED
-
LOCATION
-
ROWFORMAT