Accessing data using Apache Druid

Setting up and using Apache Druid

After reviewing the hardware recommendations and software requirements below, you can add the Apache Druid (incubating) service to an HDP 3.x cluster.

Recommendations:

Assign the Overlord, Coordinator, and Router to one or more master nodes of size AWS m3.xlarge or equivalent: 4 vCPUs, 15 GB RAM, 80 GB SSD storage.
Co-locate the Historical and MiddleManager on nodes separate from the Overlord, Coordinator, Router, and Broker, using nodes of size AWS r3.2xlarge or equivalent: 8 vCPUs, 61 GB RAM, 160 GB SSD storage.
Do not co-locate LLAP daemons and Historical components.
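
The co-location recommendations above can be checked against a planned layout before you install anything. The following is a minimal sketch, not part of Ambari or Druid: the plan format (a mapping of node name to the set of components on it), the node names, and the `check_placement` helper are all assumptions made for illustration.

```python
# Hypothetical pre-install check of a component placement plan.
# The plan format and node names are illustrative assumptions,
# not an Ambari or Druid API.

MASTER_ROLES = {"Overlord", "Coordinator", "Router", "Broker"}
WORKER_ROLES = {"Historical", "MiddleManager"}


def check_placement(plan):
    """Return a list of warnings for a {node_name: set_of_components} plan."""
    warnings = []
    for node, components in sorted(plan.items()):
        workers = components & WORKER_ROLES
        masters = components & MASTER_ROLES
        # Historical and MiddleManager should not share a node with
        # the Overlord, Coordinator, Router, or Broker.
        if workers and masters:
            warnings.append(
                f"{node}: {sorted(workers)} share a node with {sorted(masters)}"
            )
        # LLAP daemons and Historical components should not be co-located.
        if "Historical" in components and "LLAP" in components:
            warnings.append(f"{node}: Historical is co-located with LLAP daemons")
    return warnings


plan = {
    "master1": {"Overlord", "Coordinator", "Router"},
    "worker1": {"Historical", "MiddleManager"},
    "worker2": {"Historical", "LLAP"},
}

for warning in check_placement(plan):
    print(warning)
```

With this sample plan, only `worker2` is flagged, because it places a Historical component alongside LLAP daemons. The check covers placement only; it does not verify node sizing.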

Software Requirements:

A MySQL or Postgres database for storing metadata in a production cluster
You can use the default Derby database installed and configured by Ambari if you are using a single-node cluster for development.
Ambari 2.7.0 or later
Database connector set up in Ambari
HDP 3.0 or later
ZooKeeper
HDFS or Amazon S3
YARN and MapReduce2