Scheduling tasks to clean up Cloudera Data Explorer (Hue) documents

If the task server is enabled, you can clean up data from the backend Data Explorer tables directly from the Data Explorer web interface. You no longer need to run the document cleanup shell commands or queries for this purpose.

  1. Log in to the Data Explorer web interface as a superuser.
  2. Click Administer Server from the left assist panel and then click the Task Server tab.
  3. Click Schedule Task.
  4. On the Schedule Task modal, select document cleanup from the dropdown menu.
    In the keep-days field, specify the number of days for which you want to retain the data in the Data Explorer tables. For example, if you specify 30, data older than 30 days is removed from the tables when the task runs.
  5. Click Submit.
After you submit the task, you can monitor its logs by clicking the job ID. By default, only INFO-level logs are displayed. For troubleshooting, you can change the log level to DEBUG from the Hue Advanced Configuration Snippet.
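
To illustrate how a keep-days value can be read, the following is a minimal Python sketch. It assumes that the cleanup task treats any record last modified before "now minus keep-days" as expired. The function names (cleanup_cutoff, split_by_retention) and the sample records are hypothetical and are not part of Hue or the task server. The sketch only models the retention arithmetic and does not touch any Data Explorer tables.

```python
from datetime import datetime, timedelta


def cleanup_cutoff(now, keep_days):
    """Return the timestamp before which records are treated as expired."""
    if keep_days < 0:
        raise ValueError("keep_days must be non-negative")
    return now - timedelta(days=keep_days)


def split_by_retention(records, now, keep_days):
    """Split (name, last_modified) pairs into kept and expired name lists."""
    cutoff = cleanup_cutoff(now, keep_days)
    kept = [name for name, modified in records if modified >= cutoff]
    expired = [name for name, modified in records if modified < cutoff]
    return kept, expired


# Hypothetical sample data: one recent record and one old record.
now = datetime(2024, 6, 30)
records = [
    ("recent_query", datetime(2024, 6, 20)),
    ("old_query", datetime(2024, 4, 1)),
]

kept, expired = split_by_retention(records, now, keep_days=30)
print("kept:", kept)
print("expired:", expired)
```

Under this assumption, a keep-days value of 30 retains the record from June 20 and marks the record from April 1 as expired. A larger value retains more history at the cost of larger backend tables.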