Using Apache HivePDF version

Creating the tables and view

You see how to create simple tables, insert the data, and join the tables using a materialized view. You run the query, and the optimizer takes advantage of the precomputation performed by the materialized view to speed response time.

  1. Create two ACID tables:
    CREATE TABLE emps (
    empid INT,
    deptno INT,
    name VARCHAR(256),
    salary FLOAT,
    hire_date TIMESTAMP);
                            
    CREATE TABLE depts (
    deptno INT,
    deptname VARCHAR(256),
    locationid INT);
  2. Insert some data into the tables for example purposes:
    INSERT INTO TABLE emps VALUES (10001,101,'jane doe',250000,'2018-01-10');
    INSERT INTO TABLE emps VALUES (10002,100,'somporn klailee',210000,'2017-12-25');
    INSERT INTO TABLE emps VALUES (10003,200,'jeiranan thongnopneua',175000,'2018-05-05');
                            
    INSERT INTO TABLE depts VALUES (100,'HR',10);
    INSERT INTO TABLE depts VALUES (101,'Eng',11);
    INSERT INTO TABLE depts VALUES (200,'Sup',20);               
    Tables must be ACID (managed) tables.
  3. Create a materialized view to join the tables:
    CREATE MATERIALIZED VIEW mv1
    AS SELECT empid, deptname, hire_date
    FROM emps JOIN depts
    ON (emps.deptno = depts.deptno)
    WHERE hire_date >= '2017-01-01';
  4. Run a query that takes advantage of the precomputation performed by the materialized view:
    SELECT empid, deptname
    FROM emps
    JOIN depts
    ON (emps.deptno = depts.deptno)
    WHERE hire_date >= '2017-01-01'
    AND hire_date <= '2019-01-01';
    Output is:
    +--------+-----------+
    | empid  | deptname  |
    +--------+-----------+
    | 10003  | Sup       |
    | 10002  | HR        |
    | 10001  | Eng       |
    +--------+-----------+