Enabling Cost-Based SQL Optimization

Hortonworks recommends that administrators always enable CBO. Set and verify the following configuration parameters in the hive-site.xml file to enable cost-based optimization of SQL queries:

hive.cbo.enable

Enables cost-based query optimization.



hive.stats.autogather

Enables automated gathering of table-level statistics for newly created tables and table partitions, such as tables created with the INSERT OVERWRITE statement. The parameter does not produce column-level statistics, such as those used by CBO. If it is disabled, administrators must generate these table-level statistics manually with the ANALYZE TABLE statement.
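As a minimal sketch, the two settings described above might appear in hive-site.xml as follows. The property names hive.cbo.enable and hive.stats.autogather are assumed here to be the parameters that these two descriptions refer to; verify them against your Hive version before applying the change.

```xml
<!-- Fragment for hive-site.xml; place inside the existing <configuration> element. -->
<!-- Assumption: these are the two properties described above. -->
<property>
  <name>hive.cbo.enable</name>
  <value>true</value>
</property>
<property>
  <name>hive.stats.autogather</name>
  <value>true</value>
</property>
```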


The following configuration properties are not specific to CBO, but setting them to true will also improve the performance of queries that generate statistics:

hive.stats.fetch.column.stats

Instructs Hive to fetch column-level statistics from the metastore when it plans a query.


hive.compute.query.using.stats

Instructs Hive to use statistics when generating query plans.
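The two statistics properties can be set in the same file. The fragment below is a sketch only, showing both values set to true as suggested above; it assumes the properties are not already defined elsewhere in hive-site.xml.

```xml
<!-- Fragment for hive-site.xml; place inside the existing <configuration> element. -->
<property>
  <name>hive.stats.fetch.column.stats</name>
  <value>true</value>
</property>
<property>
  <name>hive.compute.query.using.stats</name>
  <value>true</value>
</property>
```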



