Apache Impala Reference
Performance Considerations
Performance Best Practices
Query Join Performance
Table and Column Statistics
Generating Table and Column Statistics
Runtime Filtering
Partitioning
Partition Pruning for Queries
HDFS Caching
HDFS Block Skew
Understanding Performance using EXPLAIN Plan
Understanding Performance using SUMMARY Report
Understanding Performance using Query Profile
Scalability Considerations
Scaling Limits and Guidelines
Dedicated Coordinator
Hadoop File Formats Support
Using Text Data Files
Using Parquet Data Files
Using ORC Data Files
Using Avro Data Files
Using RCFile Data Files
Using SequenceFile Data Files
Storage Systems Support
Impala with HDFS
Impala with Kudu
Configuring for Kudu Tables
Impala DDL for Kudu
Partitioning for Kudu Tables
Impala DML for Kudu Tables
Impala with HBase
Impala with Azure Data Lake Store (ADLS)
Impala with Amazon S3
Specifying Impala Credentials to Access S3
Ports Used by Impala
Migration Guide
Modifying Impala Startup Options
Setting up Data Cache for Remote Reads
Managing Metadata in Impala
On-demand Metadata
Transactions