Unified Analytics supports anti join queries to find the rows in a table that do not
match rows in another table. The performance of queries involving NULL filters, "not exists"
clauses, and "not in" clauses" are improved using the anti join optimization.
In this task, run an anti join query. For example:
SELECT * FROM left_table t1 LEFT JOIN right_table t2 ON t1.id = t2.id WHERE t2.id IS NULL;
-
In the Data Warehouse service, click Overview in the
left navigation pane.
-
In Overview, i Virtual Warehouses, select your
Virtual Warehouse, click , and select Edit.
-
In Configurations, select hive-site, add the
hive.auto.convert.anti.join property, and set it to true.
-
Open Hue, and create two tables, one having more rows than the other.
create table tt1 (order_num bigint, price decimal(5, 2));
create table tt2 (order_num bigint);
-
Insert several, say five, rows of values (order number, price) in the first
table and fewer, say 2, rows of values (order number only) in the second
table.
insert into tt1 values (42, 300.50), (141, 250.25), (150, 85.00), (150, 500.44), (166, 125.65);
insert into tt2 values (150), (166);
-
Take a look at the content of the first table.
tt1.order_num tt1.price
42 300.50
141 250.25
150 85.00
150 500.44
166 125.65
-
Take a look at the content of the second table.
-
Perform an equality left join on the order numbers in the tables.
select * from tt1 t1 LEFT JOIN tt2 t2 ON t1.order_num=t2.order_num;
t1.order_num t1.price t2.order_num
42 300.50 NULL
141 250.25 NULL
150 85.00 150
150 500.44 150
166 125.65 166
-
Add a null filter to the join query to anti join the tables.
select * from tt1 t1 LEFT JOIN tt2 t2 ON t1.order_num=t2.order_num where t2.order_num is NULL;
The anti join returns the rows having NULL in the price
column.
t1.order_num t1.price t2.order_num
42 300.50 NULL
141 250.25 NULL