Ways to clean up old queries from the Query Processor tables
Learn how to schedule a query clean-up and how to use the API to manually clean up queries from the following Query Processor tables: vertex_info, dag_details, dag_info, query_details, hive_query, tez_app_info.
Scheduling query clean-up
- hue.query-processor.event-pipeline.cleanup.cron.expression
- hue.query-processor.event-pipeline.cleanup-interval-secs
“hue.query-processor.event-pipeline.cleanup.cron.expression” : “0 0 2 * * ?”
“hue.query-processor.event-pipeline.cleanup-interval-secs” : “2592000”
Manually cleaning up queries using an API
The ability to clean up queries manually in addition to the scheduled clean-up routines is useful when you have a high load of queries in a particular week that are hogging resources that you must free up. The API also runs a VACUUM command on the Query Processor table to reclaim storage that is occupied by dead tuples.
You can send an API request using tools such as cURL or Postman.
API format: [***QUERY-PROCESSOR-ADDRESS***]/api/admin/cleanup/[***EPOCH-TIME***]
- [***QUERY-PROCESSOR-ADDRESS***] is the query processor host address
- [***EPOCH-TIME***] is the Unix epoch time in seconds
Queries that were run before the specified epoch time are purged.
curl "http://machine1.company.com:30700/api/admin/cleanup/1670006742"