Using anti joins

Unified Analytics supports anti join queries to find the rows in a table that do not match rows in another table. The performance of queries involving NULL filters, "not exists" clauses, and "not in" clauses" are improved using the anti join optimization.

In this task, run an anti join query. For example:
SELECT * FROM left_table t1 LEFT JOIN right_table t2 ON t1.id = t2.id WHERE t2.id IS NULL; 
  1. In the Data Warehouse service, click Overview in the left navigation pane.
  2. In Overview, i Virtual Warehouses, select your Virtual Warehouse, click , and select Edit.
  3. In Configurations, select hive-site, add the hive.auto.convert.anti.join property, and set it to true.
  4. Open Hue, and create two tables, one having more rows than the other.
    create table tt1 (order_num bigint, price decimal(5, 2));
    
    create table tt2 (order_num bigint);
  5. Insert several, say five, rows of values (order number, price) in the first table and fewer, say 2, rows of values (order number only) in the second table.
    insert into tt1 values (42, 300.50), (141, 250.25), (150, 85.00), (150, 500.44), (166, 125.65);
    
    insert into tt2 values (150), (166);
  6. Take a look at the content of the first table.
    select * from tt1;
    tt1.order_num	tt1.price
    42	            300.50
    141	            250.25
    150	            85.00
    150	            500.44
    166	            125.65
  7. Take a look at the content of the second table.
    select * from tt2;
    tt2.order_num
    150
    166
  8. Perform an equality left join on the order numbers in the tables.
    select * from tt1 t1 LEFT JOIN tt2 t2 ON t1.order_num=t2.order_num;
                    
    t1.order_num	t1.price	t2.order_num
    42	          300.50	   NULL
    141	         250.25	   NULL
    150	         85.00	    150
    150	         500.44	   150
    166	         125.65	   166
  9. Add a null filter to the join query to anti join the tables.
    select * from tt1 t1 LEFT JOIN tt2 t2 ON t1.order_num=t2.order_num where t2.order_num is NULL;
    The anti join returns the rows having NULL in the price column.
    t1.order_num	t1.price	t2.order_num
    42	          300.50	  NULL
    141	         250.25	  NULL