Data Access
Also available as:
PDF
loading table of contents...

Generating Statistics

Use the ANALYZE TABLE command to generate statistics for tables and columns. Use the optional NoScan clause to improve query performance by preventing a scan of files on HDFS. This option gathers only the following statistics:

  • Number of files

  • Size of files in bytes

ANALYZE TABLE tablename [PARTITION(partcol1[=val1], partcol2[=val2], ...)] COMPUTE STATISTICS [NoScan];

The following example views statistics for all partitions in the employees table. The query also uses the NoScan clause to improve performance:

ANALYZE TABLE employees PARTITION (dt) COMPUTE STATISTICS [NoScan];

Generating Column-level Statistics:

Use the following syntax to generate statistics for columns in the employees table:

ANALYZE TABLE tablename [PARTITION(partcol1[1=val1], partcol2[=val2], ...)] COMPUTE STATISTICS FOR COLUMNS [NoScan]; 

The following example generates statistics for all columns in the employees table:

ANALYZE TABLE employees PARTITION (dt) COMPUTE STATISTICS FOR COLUMNS; 
[Tip]Tip

See Using the Cost-Based Optimizer for Optimal Performance for more information and recommended settings.