Druid and Hive tuning

As an administrator, you can set hive.druid.* properties to improve the performance of Hive queries on Druid.

Performance-related hive.druid properties

If Hive and Druid are installed with Ambari, the properties are set and tuned for your cluster automatically. However, you can fine-tune some properties if you detect performance problems with the applications that run the queries. The following table lists some of the properties that Hive uses to work with Druid. As an HDP administrator, you can troubleshoot and customize a Hive-Druid integration using these properties.


Property	Description
hive.druid.indexer.segments.granularity	Granularity of the segments created by the Druid storage handler.
hive.druid.indexer.partition.size.max	Maximum number of records per segment partition.
hive.druid.indexer.memory.rownum.max	Maximum number of records in memory while storing data in Druid.
hive.druid.broker.address.default	Address of the Druid broker node. When Hive queries Druid, this address must be declared.
hive.druid.coordinator.address.default	Address of the Druid coordinator node. It is used to check the load status of newly created segments.
hive.druid.select.threshold	When a SELECT query is split, this is the maximum number of rows that Druid attempts to retrieve.
hive.druid.http.numConnection	Number of connections used by the HTTP client.
hive.druid.http.read.timeout	Read timeout period for the HTTP client in ISO 8601 duration format. For example, `P2W`, `P3M`, `PT1H30M`, and `PT0.750S` are possible values.
hive.druid.sleep.time	Sleep time between retries in ISO 8601 duration format.
hive.druid.basePersistDirectory	Local temporary directory used to persist intermediate indexing state.
hive.druid.storage.storageDirectory	Deep storage location of Druid.
hive.druid.metadata.base	Default prefix for metadata table names.
hive.druid.metadata.db.type	Metadata database type. The only valid values are "mysql" and "postgresql".
hive.druid.metadata.uri	URI to connect to the database.
hive.druid.working.directory	Default HDFS working directory used to store some intermediate metadata.
hive.druid.maxTries	Maximum number of retries to connect to Druid before throwing an exception.
hive.druid.bitmap.type	Encoding algorithm used to encode the bitmaps.

If you installed both Hive and Druid with Ambari, change only the hive.druid.* properties listed above when you troubleshoot performance issues. Leave all other hive.druid.* properties at the values that Ambari set.
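
As a minimal sketch, some of the properties in the table above could be overridden for a single Hive session with `SET` statements. This assumes that your HiveServer2 configuration allows these properties to be changed at session level. If it does not, change them in the Hive configuration through Ambari instead. The values shown are illustrative assumptions, not recommendations, so tune them against your own cluster and workload.

```sql
-- Illustrative values only (assumptions); adjust for your cluster.

-- Granularity of the segments created by the Druid storage handler.
SET hive.druid.indexer.segments.granularity=DAY;

-- Maximum number of records per segment partition.
SET hive.druid.indexer.partition.size.max=5000000;

-- Maximum number of records held in memory while storing data in Druid.
SET hive.druid.indexer.memory.rownum.max=75000;

-- HTTP client settings. Timeouts are ISO 8601 durations.
SET hive.druid.http.numConnection=20;
SET hive.druid.http.read.timeout=PT1M;
SET hive.druid.sleep.time=PT10S;

-- Maximum number of retries before an exception is thrown.
SET hive.druid.maxTries=5;

-- Print the value currently in effect for one property.
SET hive.druid.http.read.timeout;
```

Running `SET` with only a property name, as in the last statement, prints the current value, which is a quick way to confirm what Ambari configured before you change anything.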