Set up the cost-based optimizer and statistics
You can use the cost-based optimizer (CBO) and statistics to generate efficient query execution plans, which can improve performance. The CBO cannot function without column statistics, so you must generate them.
In this task, you enable and configure the cost-based optimizer (CBO) and configure Hive to gather column statistics as well as table statistics for evaluating query performance. Column and table statistics are critical for estimating predicate selectivity and the cost of the plan. Certain advanced rewrites require column statistics.
You check and set properties that do the following:
- Control collection of table-level statistics.
- Control collection of column-level statistics.
- Instruct Hive to use statistics when generating query plans.
You can manually generate the table-level statistics for newly created tables and table partitions using the ANALYZE TABLE statement.
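As a minimal sketch of the ANALYZE TABLE statement, the following HiveQL gathers table-level statistics for a table and for a single partition, then column-level statistics. The table names `customers` and `sales`, the partition column `sale_date`, and the partition value are hypothetical placeholders, not names from this task.

```sql
-- Hypothetical tables: customers (unpartitioned) and sales (partitioned by sale_date).

-- Gather table-level statistics, such as row count and data size.
ANALYZE TABLE customers COMPUTE STATISTICS;

-- Gather table-level statistics for one partition of a partitioned table.
ANALYZE TABLE sales PARTITION (sale_date='2024-01-01') COMPUTE STATISTICS;

-- Gather column-level statistics for all columns of the table.
ANALYZE TABLE customers COMPUTE STATISTICS FOR COLUMNS;
```

Scanning large tables to compute statistics can take a while, so it may be worth running these statements per partition as new data arrives rather than over the whole table.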
- The following components are running:
  - Hive Metastore
  - Hive clients
- Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
- Select the Hive service, for example, HIVE_ON_TEZ-1.
On the Configuration tab, search for
Search for and enable
- Select Restart from the options menu of the service.
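After the restart, one way to check the result from a Hive client session is sketched below. It assumes `hive.cbo.enable` is the CBO switch in your deployment (it is the stock Apache Hive property name, but this page does not name it), and it reuses a hypothetical table `customers` with a hypothetical column `customer_id`.

```sql
-- Print the current value of the CBO switch for this session
-- (property name assumed; this page does not name it).
SET hive.cbo.enable;

-- Show table metadata, including any table-level statistics that were gathered.
DESCRIBE FORMATTED customers;

-- Show column-level statistics for a single (hypothetical) column.
DESCRIBE FORMATTED customers customer_id;
```

If the statistics fields are empty, the table probably has not been analyzed yet, or automatic gathering was not in effect when the data was written.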