Impala workload management table maintenance

Understand the maintenance requirements for the sys.impala_query_live and sys.impala_query_log tables.

The sys.impala_query_live and sys.impala_query_log tables have different maintenance requirements. Meeting them keeps queries against these tables efficient.
  • sys.impala_query_live:
    • No maintenance is required because it resides entirely in memory.
  • sys.impala_query_log:
    • As an Iceberg table, it requires periodic maintenance, such as:
      • Computing statistics.
      • Optimizing the table structure.
      • Performing snapshot expiration or cleanup.
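
The three tasks listed above map to Impala statements roughly as follows. This is a minimal sketch, not a recommended schedule: the 7-day snapshot retention window and the order of the statements are illustrative assumptions, so adjust both to your own workload.

```sql
-- Optimize the table structure: compact small data files.
-- The 128 MB threshold is the value used in this topic.
OPTIMIZE TABLE sys.impala_query_log (FILE_SIZE_THRESHOLD_MB=128);

-- Expire old snapshots. The 7-day retention window is an assumption
-- chosen for illustration only.
ALTER TABLE sys.impala_query_log EXECUTE expire_snapshots(now() - interval 7 days);

-- Compute statistics so the planner sees the current table contents.
COMPUTE STATS sys.impala_query_log;
```

Running the statistics step last means the statistics describe the table as it exists after compaction.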

Because every Impala workload is different, Impala performs no automatic maintenance on the sys.impala_query_log table. Schedule the maintenance tasks according to the needs of your workload.

To optimize the Impala query log, run the statement OPTIMIZE TABLE sys.impala_query_log (FILE_SIZE_THRESHOLD_MB=128). Cloudera recommends testing this statement in a development or test environment first to evaluate its impact on your workload. For best results, run it during periods of low cluster activity.
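
One way to evaluate the impact in a test environment is to compare the table's file layout and snapshot history before and after the statement runs. The sketch below assumes that SHOW FILES and DESCRIBE HISTORY are available for this Iceberg table in your Impala version. The comparison itself is a suggestion, not a Cloudera-documented procedure.

```sql
-- Before: note the number and size of data files and the latest snapshot.
SHOW FILES IN sys.impala_query_log;
DESCRIBE HISTORY sys.impala_query_log;

-- Run the optimization with the threshold used in this topic.
OPTIMIZE TABLE sys.impala_query_log (FILE_SIZE_THRESHOLD_MB=128);

-- After: compare the file layout and confirm that a new snapshot was created.
SHOW FILES IN sys.impala_query_log;
DESCRIBE HISTORY sys.impala_query_log;
```

Noting how long the OPTIMIZE statement takes on a table of realistic size also helps when choosing a maintenance window.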