Using Apache Hive
Also available as:
PDF

Create a materialized view and store it in Druid

You can create a materialized view and store it in an external system, such as Druid, which supports JSON queries, very efficient timeseries queries, and groupBy queries.

  • Hive is running as a service in the cluster.
  • Druid is running as a service in the cluster.
  • You created a transactional table named src that has timestamp, dimension, and metric columns: __time TIMESTAMP, page STRING, user STRING, c_added INT, and c_removed INT columns.

In this task, you include the STORED BY clause followed by the Druid storage handler. The storage handler integrates Hive and Druid for saving the materialized view in Druid.

  1. Execute a Hive query to set the location of the Druid broker using a DNS name or IP address and port 8082, the default broker text listening port.
    SET hive.druid.broker.address.default=10.10.20.30:8082;
  2. Create a materialized view store the view in Druid.
    CREATE MATERIALIZED VIEW druid_mv
    STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
    AS SELECT __time, page, user, c_added, c_removed
    FROM src;