Hive query history service
The query history service in Hive provides a scalable solution for storing and querying historical query information in a structured and performant manner, enabling long-term analysis and monitoring.
Cloudera Data Warehouse lets you enable logging of Hive queries on an existing Virtual Warehouse or while creating a new Hive Virtual Warehouse. The query history service in Hive is a feature that stores a long-term record of finished queries and their associated metrics. It is designed to support auditing, debugging, and performance monitoring at scale by persisting historical query data in a modern table format.
Hive offers several other ways to inspect queries:
- Hive history .txt files.
- Protobuf logging hook.
- Live queries on the HiveServer2 Web UI.
- SHOW PROCESSLIST command.
- In-development query history service.
While these options allow inspection of active or recent queries, none provide a scalable solution for storing and querying historical query information in a structured and performant manner.
The query history service addresses this limitation by persisting structured records for completed queries, enabling long-term analysis using standard SQL.
Purpose of the query history table
The query history table records details about each completed query, including:
- Submitting user
- Query runtime
- Tables accessed
- Errors
- Additional metadata fields
This information is stored in a structured format using the Iceberg table format, allowing efficient querying and future integration with tools such as Apache Hue or custom dashboards.
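As a rough illustration of the kind of analysis that structured records make possible, the following sketch runs a per-user summary in standard SQL. It uses Python's built-in sqlite3 module as a stand-in for Hive, and the table name `query_history_demo` and all column names are hypothetical; they are not taken from the actual query history table.

```python
import sqlite3

# Stand-in database: sqlite3 replaces Hive here purely for illustration.
# The table and column names below are hypothetical, not the real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE query_history_demo (
        end_user        TEXT,     -- submitting user
        runtime_ms      INTEGER,  -- query runtime
        tables_accessed TEXT,     -- tables accessed
        error           TEXT      -- NULL when the query succeeded
    )
    """
)
conn.executemany(
    "INSERT INTO query_history_demo VALUES (?, ?, ?, ?)",
    [
        ("alice", 1200, "sales", None),
        ("alice", 5400, "sales,customers", None),
        ("bob", 300, "inventory", "SemanticException"),
    ],
)

# Per-user summary: query count, average runtime, and number of failed queries.
SUMMARY_SQL = """
    SELECT end_user,
           COUNT(*)        AS queries,
           AVG(runtime_ms) AS avg_runtime_ms,
           SUM(CASE WHEN error IS NOT NULL THEN 1 ELSE 0 END) AS failed
    FROM query_history_demo
    GROUP BY end_user
    ORDER BY end_user
"""
summary = conn.execute(SUMMARY_SQL).fetchall()
for row in summary:
    print(row)
```

A similar aggregate could presumably be issued against the real history table once its actual name and columns are known for your deployment.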
Scope and limitations
The query history service runs as part of HiveServer2 and writes query data to an Iceberg table in batches, using configurable memory buffering and flushing strategies. However, it is not intended for real-time query inspection, query debugging or recommendations, or for creating user interfaces or visualizations.
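The batching behavior described above can be sketched in a few lines. This is a minimal illustration of memory buffering with size-, memory-, and age-based flushing, not the HiveServer2 implementation; the class name `BufferedHistoryWriter`, its parameters, and the thresholds are all hypothetical.

```python
import time
from typing import Callable, Dict, List

Record = Dict[str, object]


class BufferedHistoryWriter:
    """Hypothetical sketch: buffer records in memory and write them in batches."""

    def __init__(
        self,
        sink: Callable[[List[Record]], None],
        max_records: int = 200,
        max_bytes: int = 1_000_000,
        max_age_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._sink = sink              # receives one batch per flush
        self._max_records = max_records
        self._max_bytes = max_bytes
        self._max_age = max_age_seconds
        self._clock = clock
        self._buffer: List[Record] = []
        self._bytes = 0
        self._last_flush = clock()

    def add(self, record: Record) -> None:
        """Buffer one record and flush if any threshold has been reached."""
        self._buffer.append(record)
        # Rough size estimate; a real service would measure memory differently.
        self._bytes += len(repr(record))
        if self._should_flush():
            self.flush()

    def _should_flush(self) -> bool:
        # Age is only checked when a record arrives; there is no background timer.
        return (
            len(self._buffer) >= self._max_records
            or self._bytes >= self._max_bytes
            or self._clock() - self._last_flush >= self._max_age
        )

    def flush(self) -> None:
        """Write all buffered records as a single batch and reset the counters."""
        if self._buffer:
            self._sink(list(self._buffer))
            self._buffer.clear()
        self._bytes = 0
        self._last_flush = self._clock()


# Usage: collect batches in a list instead of writing to a table.
batches: List[List[Record]] = []
writer = BufferedHistoryWriter(sink=batches.append, max_records=2)
for query_id in ("q1", "q2", "q3"):
    writer.add({"query_id": query_id})
writer.flush()  # write whatever is still buffered, for example at shutdown
print([len(batch) for batch in batches])
```

Because records sit in memory until a threshold is reached, a finished query may not appear in the table immediately, which is consistent with the service not being intended for real-time inspection.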