Statistics
Tip: Gather both column and table statistics for best query performance.
Column and table statistics must be calculated for optimal Hive performance because they are critical for estimating predicate selectivity and the cost of the plan. Without table statistics, the Hive cost-based optimizer (CBO) does not function. Certain advanced rewrites also require column statistics.
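As a minimal sketch of gathering both kinds of statistics, the following HiveQL uses the standard `ANALYZE TABLE` statement. The table name `sales`, the partition column `sale_date`, and the column `amount` are hypothetical and stand in for your own schema.

```sql
-- Hypothetical table: sales, partitioned by sale_date, with a column named amount.

-- Table-level statistics (row count, file count, data size) for one partition.
ANALYZE TABLE sales PARTITION (sale_date='2016-01-01') COMPUTE STATISTICS;

-- Column-level statistics (for example distinct values, nulls, min and max)
-- for the same partition.
ANALYZE TABLE sales PARTITION (sale_date='2016-01-01') COMPUTE STATISTICS FOR COLUMNS;

-- Inspect what was recorded: table-level parameters first, then one column.
DESCRIBE FORMATTED sales;
DESCRIBE FORMATTED sales amount;
```

For an unpartitioned table, omit the `PARTITION` clause. Statistics describe the data at the time they were computed, so rerunning the statements after large loads is usually worthwhile.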
Ensure that the configuration properties in the following table are set to true to improve the performance of queries that generate statistics. You can set the properties using Ambari or by customizing the hive-site.xml file.
Configuration Parameter | Setting to Enable Statistics | Description |
---|---|---|
| true | Instructs Hive to collect column-level statistics. |
| true | Instructs Hive to use statistics when generating query plans. |