Druid and Hive tuning
As an administrator, you can set hive.druid properties to improve Druid-Hive performance.
Performance-related hive.druid properties
If Hive and Druid are installed with Ambari, the properties are set and tuned for your cluster automatically. However, you can fine-tune some properties if you detect performance problems with applications that are running the queries. The following list includes some of the Druid properties that can be used by Hive. As an HDP administrator, you can troubleshoot and customize a Hive-Druid integration using these properties.
Property | Description |
---|---|
hive.druid.indexer.segments.granularity | Granularity of the segments created by the Druid storage handler. |
hive.druid.indexer.partition.size.max | Maximum number of records per segment partition. |
hive.druid.indexer.memory.rownum.max | Maximum number of records in memory while storing data in Druid. |
hive.druid.broker.address.default | Address of the Druid broker node. When Hive queries Druid, this address must be declared. |
hive.druid.coordinator.address.default | Address of the Druid coordinator node. It is used to check the load status of newly created segments. |
hive.druid.select.threshold | When a SELECT query is split, this is the maximum number of rows that Druid attempts to retrieve. |
hive.druid.http.numConnection | Number of connections used by the HTTP client. |
hive.druid.http.read.timeout | Read timeout period for the HTTP client in ISO8601 format. For example, P2W, P3M, PT1H30M, PT0.750S are possible values. |
hive.druid.sleep.time | Sleep time between retries in ISO8601 format. |
hive.druid.basePersistDirectory | Local temporary directory used to persist intermediate indexing state. |
hive.druid.storage.storageDirectory | Deep storage location of Druid. |
hive.druid.metadata.base | Default prefix for metadata table names. |
hive.druid.metadata.db.type | Metadata database type. The only valid values are "mysql" and "postgresql". |
hive.druid.metadata.uri | URI to connect to the database. |
hive.druid.working.directory | Default HDFS working directory used to store some intermediate metadata. |
hive.druid.maxTries | Maximum number of retries to connect to Druid before throwing an exception. |
hive.druid.bitmap.type | Encoding algorithm used to encode the bitmaps. |
If you installed both Hive and Druid with Ambari, do not change any hive.druid.* properties other than those listed above when you troubleshoot performance issues.
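As a minimal sketch of how a few of the properties in the table might be overridden for a single Hive session, the following statements use the standard Hive `SET` command. The values are illustrative assumptions, not tuning recommendations, and the sketch assumes that your deployment does not restrict these properties from being changed at the session level.

```sql
-- Illustrative values only; choose values based on your own cluster and workload.

-- Allow the HTTP client more time to read a response from Druid (ISO8601 duration).
SET hive.druid.http.read.timeout=PT5M;

-- Use more HTTP client connections.
SET hive.druid.http.numConnection=40;

-- Retry the connection to Druid more times before throwing an exception.
SET hive.druid.maxTries=10;

-- Display the current value of a property to confirm the change.
SET hive.druid.http.read.timeout;
```

Session-level settings apply only to the current connection. To make a change permanent, set the property in the Hive configuration instead.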