Impala DDL for Kudu
You can use the Impala CREATE TABLE
and
ALTER TABLE
statements to create and fine-tune the
characteristics of Kudu tables. Impala supports specific features and
properties that only apply to Kudu tables.
Non-unique Primary Keys for Kudu Tables
Kudu now allows a user to create a non-unique primary key for a table when creating a table. The data engine handles this by appending a system generated auto-incrementing column to the non-unique primary key columns. This is done to guarantee the uniqueness of the primary key. This auto-incrementing column is named as 'auto_incrementing_id' with bigint type and this column is only system generated and cannot be explicitly created by the user. This auto_incrementing_id column is unique across a partition/tablet i.e. every partition/tablet would have this column starting from one and incrementing monotonically. The assignment to this column during insertion is automatic.
Create a Kudu Table with a non-unique PRIMARY KEY
The following example shows creating a table with a non-unique PRIMARY KEY.
CREATE TABLE kudu_tbl1
(
id INT NON UNIQUE PRIMARY KEY,
name STRING
)
PARTITION BY HASH (id) PARTITIONS 3 STORED as KUDU;
The effective PRIMARY KEY in the above case will be {id, auto_increment_id}
Verify the PRIMARY KEY is non-unique
You can now check the PRIMARY KEY created is non-unique by running the following DESCRIBE command. A new property "key_unique" shows if the primary key is unique. System generated column "auto_incrementing_id" is shown in the output for the table as a non-unique primary key.
describe kudu_tbl1
+----------------------+--------+---------+-------------+------------+----------+---------------+---------------+---------------------+------------+
| name | type | comment | primary_key | key_unique | nullable | default_value | encoding | compression | block_size |
+----------------------+--------+---------+-------------+------------+----------+---------------+---------------+---------------------+------------+
| id | int | | true | false | false | | AUTO_ENCODING | DEFAULT_COMPRESSION | 0 |
| auto_incrementing_id | bigint | | true | false | false | | AUTO_ENCODING | DEFAULT_COMPRESSION | 0 |
| name | string | | false | | true | | AUTO_ENCODING | DEFAULT_COMPRESSION | 0 |
+----------------------+--------+---------+-------------+------------+----------+---------------+---------------+---------------------+------------+
Fetched 3 row(s) in 4.72s
Query Auto Incrementing Column
When you query a table using the SELECT statement, it will not display the system generated auto incrementing column unless the column is explicitly specified in the select list.
Create a Kudu table without a PRIMARY KEY attribute
You can create a Kudu table without specifying a PRIMARY KEY or a PARTITION KEY since they are optional, however you cannot create a Kudu table without specifying both PRIMARY KEY and PARTITION KEY. If you do not specify the primary key attribute, the partition key columns can be promoted as a non-unique primary key. This is possible only if those columns are the beginning columns of the table.
In the following example, 'a' and 'b' will be promoted as a non-unique primary key, 'auto_incrementing_id' column will be added by Kudu engine. 'a', 'b' and 'auto_incrementing_id' form the effective unique composite primary key.
CREATE TABLE auto_table
(
a BIGINT,
b STRING,
)
PARTITION BY HASH(a, b) PARTITIONS 2 STORED AS KUDU;
The effective primary key in this case would be {a, b, auto_incrementing_id}
Limitations
- UPSERT operation is not supported for Kudu tables with non-unique primary key. If you run an UPSERT statement for a Kudu table with a non-unique primary key it will fail with an error.
- Since the auto generated key for each row will be assigned after the row’s data is generated and after the row lands in the tablet, you cannot use this column in the partition key.
PRIMARY KEY Attribute
To create a Kudu table, use the STORED AS KUDU
clause in the
CREATE TABLE
statement.
The column list in a CREATE TABLE
statement can include the following
attributes, which only apply to Kudu tables:
[NON UNIQUE] PRIMARY KEY
| [NOT] NULL
| ENCODING codec
| COMPRESSION algorithm
| DEFAULT constant_expression
| BLOCK_SIZE number
The primary key for a Kudu table is a column, or set of columns, that uniquely identifies every row. The primary key value also is used as the natural sort order for the values from the table.
The notion of primary key only applies to Kudu tables. Every Kudu table requires a primary key.
Because all of the primary key columns must have non-null values, specifying a column in the PRIMARY KEY or NON-UNIQUE PRIMARY KEY clause implicitly adds the NOT NULL attribute to that column.
The primary key columns must be the first ones specified in the
CREATE TABLE
statement.
When the primary key is a single column, you can specify the
PRIMARY KEY
attribute either inline in a single
column definition, or as a separate clause at the end of the column
list.
For example:
CREATE TABLE pk_inline
(
col1 BIGINT PRIMARY KEY,
col2 STRING,
col3 BOOLEAN
) PARTITION BY HASH(col1) PARTITIONS 2 STORED AS KUDU;
CREATE TABLE pk_at_end
(
col1 BIGINT,
col2 STRING,
col3 BOOLEAN,
PRIMARY KEY (col1)
) PARTITION BY HASH(col1) PARTITIONS 2 STORED AS KUDU;
CREATE TABLE pk_inline
(
col1 BIGINT [NON UNIQUE] PRIMARY KEY,
col2 STRING,
col3 BOOLEAN
) PARTITION BY HASH(col1) PARTITIONS 2 STORED AS KUDU;
CREATE TABLE pk_at_end
(
col1 BIGINT,
col2 STRING,
col3 BOOLEAN,
[NON UNIQUE] PRIMARY KEY (col1)
) PARTITION BY HASH(col1) PARTITIONS 2 STORED AS KUDU;
The contents of the primary key columns cannot be changed by an
UPDATE
or UPSERT
statement.
Including too many columns in the primary key (more than 5 or 6) can
reduce the performance of write operations. Therefore, pick the most
selective and most frequently tested non-null columns for the primary
key specification. If a column must always have a value, but that value
might change later, leave it out of the primary key and use a
NOT NULL
clause for that column instead.
NULL | NOT NULL Attribute
For Kudu tables, you can specify which columns can contain nulls or
not. This constraint offers an extra level of consistency enforcement
for Kudu tables. If an application requires a field to always be
specified, include a NOT NULL
clause in the
corresponding column definition, and Kudu prevents rows from being
inserted with a NULL
in that column.
For example, a table containing geographic information might require
the latitude and longitude coordinates to always be specified. Other
attributes might be allowed to be NULL
. For example, a
location might not have a designated place name, its altitude might be
unimportant, and its population might be initially unknown, to be filled
in later.
For non-Kudu tables, Impala allows any column to contain
NULL
values, because it is not practical to enforce a
not null
constraint on HDFS data files that could be prepared
using external tools and ETL processes.
CREATE TABLE required_columns
(
id BIGINT PRIMARY KEY,
latitude DOUBLE NOT NULL,
longitude DOUBLE NOT NULL,
place_name STRING,
altitude DOUBLE,
population BIGINT
) PARTITION BY HASH(id) PARTITIONS 2 STORED AS KUDU;
During performance optimization, Kudu can use the knowledge that nulls
are not allowed to skip certain checks on each input row, speeding up
queries and join operations. Therefore, specify NOT
NULL
constraints when appropriate.
The NULL
clause is the default condition for all
columns that are not part of the primary key. You can omit it, or
specify it to clarify that you have made a conscious design decision to
allow nulls in a column.
Because primary key columns cannot contain any NULL
values, the NOT NULL
clause is not required for the
primary key columns, but you might still specify it to make your code
self-describing.
DEFAULT Attribute
You can specify a default value for columns in Kudu tables. The default value can be any constant expression, for example, a combination of literal values, arithmetic and string operations. It cannot contain references to columns or non-deterministic function calls.
The following example shows different kinds of expressions for the
DEFAULT
clause. The requirement to use a constant
value means that you can fill in a placeholder value such as
NULL
, empty string, 0, -1, 'N/A'
and
so on, but you cannot reference functions or column names. Therefore,
you cannot use DEFAULT
to do things such as
automatically making an uppercase copy of a string value, storing
Boolean values based on tests of other columns, or add or subtract one
from another column representing a sequence number.
CREATE TABLE default_vals
(
id BIGINT PRIMARY KEY,
name STRING NOT NULL DEFAULT 'unknown',
address STRING DEFAULT upper('no fixed address'),
age INT DEFAULT -1,
earthling BOOLEAN DEFAULT TRUE,
planet_of_origin STRING DEFAULT 'Earth',
optional_col STRING DEFAULT NULL
) PARTITION BY HASH(id) PARTITIONS 2 STORED AS KUDU;
ENCODING Attribute
Each column in a Kudu table can optionally use an encoding, a low-overhead form of compression that reduces the size on disk, then requires additional CPU cycles to reconstruct the original values during queries. Typically, highly compressible data benefits from the reduced I/O to read the data back from disk.
-
AUTO_ENCODING
: Use the default encoding based on the column type, which are bitshuffle for the numeric type columns and dictionary for the string type columns. -
PLAIN_ENCODING
: Leave the value in its original binary format. -
RLE
: Compress repeated values (when sorted in primary key order) by including a count. -
DICT_ENCODING
: When the number of different string values is low, replace the original string with a numeric ID. -
BIT_SHUFFLE
: Rearrange the bits of the values to efficiently compress sequences of values that are identical or vary only slightly based on primary key order. The resulting encoded data is also compressed with LZ4. -
PREFIX_ENCODING
: Compress common prefixes in string values; mainly for use internally within Kudu.
COMPRESSION Attribute
You can specify a compression algorithm to use for each column in a
Kudu table. This attribute imposes more CPU overhead when retrieving the
values than the ENCODING
attribute does. Therefore, use
it primarily for columns with long strings that do not benefit much from
the less-expensive ENCODING
attribute.
The choices for COMPRESSION
are LZ4
,
SNAPPY
, and ZLIB
.
BLOCK_SIZE Attribute
Although Kudu does not use HDFS files internally, and thus is not
affected by the HDFS block size, it does have an underlying unit of I/O
called the block size. The BLOCK_SIZE
attribute lets you set the block size for any column.
The block size attribute is a relatively advanced feature. This is an unsupported feature and is considered experimental.
Kudu Replication Factor
TBLPROPERTIES
in the
CREATE TABLE
statement as below where
n is the replication factor you want to use:
TBLPROPERTIES ('kudu.num_tablet_replicas' = 'n')
The number of replicas for a Kudu table must be odd.
Altering the kudu.num_tablet_replicas
property after
table creation currently has no effect.
How Impala Handles Kudu Metadata
Much of the metadata for Kudu tables is handled by the underlying storage layer. Kudu tables have less reliance on the Metastore database, and require less metadata caching on the Impala side. For example, information about partitions in Kudu tables is managed by Kudu, and Impala does not cache any block locality metadata for Kudu tables.
The REFRESH
and INVALIDATE
METADATA
statements are needed less frequently for Kudu
tables than for HDFS-backed tables. Neither statement is needed when
data is added to, removed, or updated in a Kudu table, even if the
changes are made directly to Kudu through a client program using the
Kudu API.
Run REFRESH table_name
or INVALIDATE METADATA table_name
for a Kudu table only after making a change to the Kudu table schema,
such as adding or dropping a column.
Because Kudu manages the metadata for its own tables separately from
the Metastore database, there is a table name stored in the Metastore
database for Impala to use, and a table name on the Kudu side, and these
names can be modified independently through ALTER TABLE
statements.
To avoid potential name conflicts, the prefix
impala::
and the Impala database name are encoded
into the underlying Kudu table name.