Accessing Apache Hive Table Statistics in CDH
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
Hive statistics include the number of rows in a table or partition and histograms of columns of interest. The cost functions of the query optimizer use these statistics to generate efficient query plans.
analyze table <table name> compute statistics;
analyze table <table name> compute statistics for columns <all columns of a table>;
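As a minimal sketch, assuming a hypothetical table named sales with a column named amount (neither name comes from this document), the statements below gather statistics and then display them. DESCRIBE FORMATTED typically lists table-level values such as numRows and totalSize among the table parameters. The column-level form of DESCRIBE FORMATTED is assumed to be available in your Hive version, and the exact output varies by release.

```sql
-- Hypothetical table and column names (sales, amount); substitute your own.

-- Gather table-level statistics such as row count and total size.
ANALYZE TABLE sales COMPUTE STATISTICS;

-- Gather column-level statistics for a single column.
ANALYZE TABLE sales COMPUTE STATISTICS FOR COLUMNS amount;

-- Show table-level statistics (for example numRows) in the table parameters.
DESCRIBE FORMATTED sales;

-- Show column-level statistics for the amount column.
DESCRIBE FORMATTED sales amount;
```

For a partitioned table, a PARTITION clause can be added after the table name in the ANALYZE statement to limit the computation to specific partitions.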