Scheduling tasks to clean up Cloudera Data Explorer (Hue) tmp directory
Data Explorer writes temporary files to a temporary block
space that is also used by other services, such as Hive and Impala. If the tmp space is full, the
Data Explorer pod is evicted, which disrupts operations.
Set up a cleanup threshold in the Data Explorer task server to prevent
pod eviction.
Files older than 60 minutes are cleaned up from the /tmp
directory. This is a configurable parameter. The default cleanup threshold is 90%.
Log in to the Data Explorer web interface as a
superuser.
Click Administer Server from the left assist panel and
then click the Task Server tab.
Click Schedule Task.
On the Schedule Task modal, select tmp clean
up from the dropdown menu.
In the threshold for clean up field, specify the
threshold value as a percentage.
The cleanup job is triggered when disk usage reaches the specified
threshold value. For example, if you specify 70, the temporary disk space is
cleaned up when it is 70% full.
Click Submit.
To monitor the logs, click the job ID. By
default, only INFO-level logs are displayed. For troubleshooting, you can change this to DEBUG-level from the
Hue Advanced Configuration Snippet.
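
The behavior described in this topic (clean up only when usage reaches the threshold, and remove only files older than 60 minutes) can be illustrated with a minimal Python sketch. This is not the task server's actual implementation. The function names `usage_percent` and `clean_tmp` and their parameters are hypothetical and chosen for illustration. Only the 90% default threshold, the 60-minute file age, and the /tmp directory come from the text above.

```python
import os
import shutil
import time


def usage_percent(path):
    """Return how full the filesystem that holds `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def clean_tmp(tmp_dir="/tmp", threshold_pct=90.0, max_age_minutes=60,
              current_usage_pct=None, now=None):
    """Remove old files from tmp_dir once disk usage reaches threshold_pct.

    Hypothetical helper: `current_usage_pct` and `now` can be passed in
    explicitly, otherwise they are measured. Returns the removed paths.
    """
    if current_usage_pct is None:
        current_usage_pct = usage_percent(tmp_dir)

    # Below the threshold: nothing to do.
    if current_usage_pct < threshold_pct:
        return []

    now = time.time() if now is None else now
    cutoff = now - max_age_minutes * 60

    removed = []
    for root, _dirs, files in os.walk(tmp_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                # Only files last modified before the cutoff are removed.
                if os.lstat(path).st_mtime < cutoff:
                    os.remove(path)
                    removed.append(path)
            except OSError:
                # The file vanished or is not removable; skip it.
                continue
    return removed
```

In this sketch, calling `clean_tmp(threshold_pct=70)` corresponds to specifying 70 in the threshold field: files are removed only when the temporary space is at least 70% full, and files modified within the last 60 minutes are left in place.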