Cleaning up old data to improve performance
Some tables in Hue retain data indefinitely, which can slow performance or cause the application to crash. Hue does not automatically clean up data from these tables. You can configure Hue to retain the data for a specific number of days and then schedule a cron job to clean up these tables at regular intervals.
- Upgrade times out
- Performance is slower than expected
- Long time to log in to Hue
- SQL query shows a large number of documents in tables
- Hue crashes while trying to access saved documents
select count(*) from desktop_document;
select count(*) from desktop_document2;
select count(*) from beeswax_session;
select count(*) from beeswax_savedquery;
select count(*) from beeswax_queryhistory;
select count(*) from oozie_job;
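The count queries above can also be generated from a list of table names, which makes it easier to rerun the same check before and after a cleanup. The following is a minimal sketch, not part of Hue: the variable `HUE_TABLES` and the function `print_count_queries` are hypothetical names introduced here, and the table list is copied from the queries above.

```shell
# Tables named in the count queries above.
# HUE_TABLES and print_count_queries are illustrative names, not Hue tooling.
HUE_TABLES="desktop_document desktop_document2 beeswax_session beeswax_savedquery beeswax_queryhistory oozie_job"

# Print one "select count(*)" statement per table, one statement per line.
print_count_queries() {
    for table in $HUE_TABLES; do
        printf 'select count(*) from %s;\n' "$table"
    done
}

print_count_queries
```

The output is plain SQL, so it can be pasted or piped into whichever SQL client matches the database that backs your Hue instance.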
SSH into an active Hue instance.
Change to the Hue home directory:
Run the following command as the root user:
DESKTOP_DEBUG=True ./build/env/bin/hue desktop_document_cleanup --keep-days x --cm-managed
The --keep-days property specifies the number of days for which Hue retains data in the backend database.
For example:
DESKTOP_DEBUG=True ./build/env/bin/hue desktop_document_cleanup --keep-days 30 --cm-managed
In this case, Hue will retain data for 30 days.
The logs are displayed on the console because DESKTOP_DEBUG is set to True. Alternatively, you can view the logs from the following location:
/var/log/hue/desktop_document_cleanup.log
The first run can typically take around 1 minute per 1000 entries in each table.
Check whether the table size has decreased by running a query as follows:
select count(*) from desktop_document;
If the desktop_document_cleanup command has run successfully, the table size should decrease.
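The introduction mentions scheduling a cron job, and the steps above give a rough runtime figure for the first run. Both can be sketched with a few shell helpers. This is an illustration under stated assumptions, not an official Hue script: the function names are hypothetical, `/path/to/hue/home` is a placeholder for your actual Hue home directory, and the weekly Sunday 02:00 schedule is an arbitrary choice. The cleanup command itself is copied unchanged from the steps above.

```shell
# Hypothetical helpers for planning a scheduled cleanup; names are illustrative.

# Print the cleanup command from the steps above for a given retention period.
cleanup_command() {
    keep_days=$1
    case "$keep_days" in
        # Reject empty input, non-digits, zero, and leading zeros.
        ''|*[!0-9]*|0*) echo "keep-days must be a positive integer" >&2; return 1 ;;
    esac
    printf 'DESKTOP_DEBUG=True ./build/env/bin/hue desktop_document_cleanup --keep-days %s --cm-managed\n' "$keep_days"
}

# Rough first-run estimate for one table, using the figure of about
# 1 minute per 1000 entries and rounding up to whole minutes.
estimate_cleanup_minutes() {
    rows=$1
    case "$rows" in
        # Accept only plain decimal integers without leading zeros.
        ''|*[!0-9]*|0?*) echo "row count must be a non-negative integer" >&2; return 1 ;;
    esac
    echo $(( (rows + 999) / 1000 ))
}

# Print a crontab line that changes to the Hue home directory and runs the
# cleanup every Sunday at 02:00 (assumed schedule; adjust as needed).
crontab_entry() {
    hue_home=$1
    cmd=$(cleanup_command "$2") || return 1
    printf '0 2 * * 0 cd %s && %s\n' "$hue_home" "$cmd"
}

# Example: a table with 250000 entries and a 30-day retention period.
estimate_cleanup_minutes 250000
crontab_entry /path/to/hue/home 30
```

Because the steps above run the command as the root user, the printed line would most likely belong in root's crontab. Review it and replace the placeholder path before installing it.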