Upserting a row

The UPSERT command acts as a combination of the INSERT and UPDATE statements.

For each row processed by the UPSERT statement:
  • If another row already exists with the same set of primary key values, the other columns are updated to match the values from the row being 'UPSERTed'.
  • If there is no row with the same set of primary key values, the row is created, the same as if the INSERT statement was used.

UPSERT Example

The following example demonstrates how the UPSERT statement works. We start by creating two tables, foo1 and foo2.
CREATE TABLE foo1 (
 id INT PRIMARY KEY,
 col1 STRING, 
 col2 STRING
) 
PARTITION BY HASH(id) PARTITIONS 3 
STORED AS KUDU;
CREATE TABLE foo2 (
 id INT PRIMARY KEY,
 col1 STRING, 
 col2 STRING
) 
PARTITION BY HASH(id) PARTITIONS 3 
STORED AS KUDU;
Populate foo1 and foo2 using the following INSERT statements. For foo2, we leave column col2 with NULL values to be upserted later:
INSERT INTO foo1 VALUES (1, "hi", "alice");
INSERT INTO foo2 select id, col1, NULL from foo1;
The contents of foo2 will be:
SELECT * FROM foo2;
...
+----+------+------+
| id | col1 | col2 |
+----+------+------+
| 1  | hi   | NULL |
+----+------+------+
Fetched 1 row(s) in 0.15s
Now use the UPSERT command to now replace the NULL values in foo2 with the actual values from foo1.
UPSERT INTO foo2 (id, col2) select id, col2 from foo1;
SELECT * FROM foo2;
...
+----+------+-------+
| id | col1 | col2  |
+----+------+-------+
| 1  | hi   | alice |
+----+------+-------+
Fetched 1 row(s) in 0.15s