## Job cleanup

Job cleanup is designed to address a number of issues which may surface in cloud storage:

* Slow performance when deleting directories.
* Timeouts when deleting very deep and wide directory trees.
* General resilience to cleanup issues escalating to job failures.
| Option | Meaning | Default Value |
|--------|---------|---------------|
| `mapreduce.fileoutputcommitter.cleanup.skipped` | Skip cleanup of `_temporary` directory | `false` |
| `mapreduce.fileoutputcommitter.cleanup-failures.ignored` | Ignore errors during cleanup | `false` |
| `mapreduce.manifest.committer.cleanup.parallel.delete` | Delete task attempt directories in parallel | `true` |
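As an illustration, a job which wants cleanup failures tolerated rather than escalated could set the relevant option in its configuration file; this is a sketch of a standard Hadoop configuration property, not a recommended default:

```xml
<!-- Example only: tolerate cleanup failures instead of failing the job. -->
<property>
  <name>mapreduce.fileoutputcommitter.cleanup-failures.ignored</name>
  <value>true</value>
</property>
```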

The algorithm is:

```
if `mapreduce.fileoutputcommitter.cleanup.skipped`:
  return
if `mapreduce.manifest.committer.cleanup.parallel.delete`:
  attempt parallel delete of task directories; catch any exception
if not `mapreduce.fileoutputcommitter.cleanup.skipped`:
  delete(`_temporary`); catch any exception
if caught-exception and not `mapreduce.fileoutputcommitter.cleanup-failures.ignored`:
  throw caught-exception
```
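The control flow above can be sketched in Java. This is an illustrative model, not the committer's actual implementation: the class, method, and parameter names are invented, and `parallelStream` stands in for the committer's thread pool.

```java
import java.io.IOException;
import java.util.List;
import java.util.function.Consumer;

/** Illustrative sketch of the cleanup decision logic; names are hypothetical. */
public class CleanupSketch {

  /**
   * Mirror the pseudocode: optionally skip cleanup entirely, optionally
   * delete task attempt directories in parallel first, then delete the
   * whole _temporary directory, and only rethrow a caught failure when
   * failures are not being ignored.
   */
  public static void cleanup(boolean skipCleanup,
                             boolean parallelDelete,
                             boolean ignoreFailures,
                             List<String> taskAttemptDirs,
                             Consumer<String> delete) throws IOException {
    if (skipCleanup) {
      return;                            // cleanup disabled entirely
    }
    IOException caught = null;
    if (parallelDelete) {
      try {
        // stand-in for the committer's parallel task-directory delete
        taskAttemptDirs.parallelStream().forEach(delete);
      } catch (RuntimeException e) {
        caught = new IOException(e);     // remember the failure, keep going
      }
    }
    try {
      delete.accept("_temporary");       // base delete of the whole tree
    } catch (RuntimeException e) {
      caught = new IOException(e);
    }
    if (caught != null && !ignoreFailures) {
      throw caught;                      // surface the failure to the job
    }
  }
}
```

Note how a failure in the parallel phase does not abort the base delete of `_temporary`; the exception is only rethrown at the end, and only when failures are not ignored.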

The goal is to perform a fast, scalable delete, and to throw a meaningful exception if that does not work.

When working with ABFS and GCS, these settings should normally be left alone. If errors do surface during cleanup, enabling the option to ignore failures ensures that the job still completes. Disabling cleanup avoids the overhead of the operation entirely, but requires a workflow or a manual operation to clean up all `_temporary` directories on a regular basis.