Multi-row transactions for querying Kudu tables

When you use Impala to query Kudu tables, you can insert multiple rows into a Kudu table in a single transaction. This broader transactional support between Kudu and Impala is available to you at a query level and at a session level.

Using multi-row transaction capability

To enable this multi-row transaction capability, you must create a mapping between the Impala and Kudu tables if you want to use Impala to query Kudu tables. Kudu provides the Impala query to map to an existing Kudu table in the web UI. See the links provided under Related Information for mapping information.

Using this multi-row transaction capability, you will benefit from having broader transactional support in Kudu and Impala. You can control this multi-row transaction feature by using the following query option. You may set this option at per-query or per-session level.

set ENABLE_KUDU_TRANSACTION=true

The following example shows how to insert three rows into a table in a single transaction.

Example:

  1. Create table kudu-test-tbl-1.
    create table kudu-test-tbl-1 (a int primary key, b string) partition by hash(a) partitions 8 stored as kudu;
  2. Enable the multi-row transaction feature at the query level.
    set ENABLE_KUDU_TRANSACTION=true;
  3. Insert three rows into the newly created table in a single transaction.
    insert into {0} values (0, 'a'), (1, 'b'), (2, 'c');
  4. Verify the number of rows of this table.
    select count(*) from kudu-test-tbl-1;

When you enable this option, each impala statement is executed as part of a new transaction. If the statement is executed successfully then the Impala Coordinator commits the transaction. If there is an error returned by Kudu then Impala aborts the transaction.

This applies to the following statements:

  • INSERT
  • CREATE TABLE AS SELECT

Advantages of using this capability

You can now easily build and manage Kudu applications, especially when Impala is used to interact with the data in the Kudu table. This capability provides the ability to:

  • Atomically do a bulk ingest.
  • Automatically insert rows into multiple tables.