Local Space Requirements for Copying to S3
When copying files to S3 using the S3A connector, DistCp copies each file to the local
temp directory before the final upload, so you need as much space on your disk as your
largest file. The location of this intermediate directory is set in the property
fs.s3a.buffer.dir; if needed, you can change that to a location where you have more space.
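As a minimal sketch, the buffer directory could be redirected with an entry like the following in core-site.xml. The path /mnt/large-disk/s3a is a hypothetical example, not a recommended location; substitute a directory on a volume with enough free space.

```xml
<!-- Goes inside the <configuration> element of core-site.xml. -->
<property>
  <name>fs.s3a.buffer.dir</name>
  <!-- Hypothetical path: pick a local directory with room for your largest file. -->
  <value>/mnt/large-disk/s3a</value>
</property>
```

The same property can usually be passed for a single job on the command line as a generic option, for example -Dfs.s3a.buffer.dir=/mnt/large-disk/s3a, assuming the tool accepts Hadoop -D options.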
When working with S3, you can reduce the amount of disk space needed by switching to the S3A fast upload mechanism, which needs only enough disk space to store blocks of data that have not yet been uploaded; it can also buffer those blocks in memory instead of on disk. You can limit the requirements even further by reducing the thread pool size.
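The following core-site.xml fragment is a sketch of how fast upload might be enabled and tuned. It assumes a Hadoop release in which fast upload is still optional (in later releases it is always on and fs.s3a.fast.upload is ignored). The values shown are illustrative assumptions, not tuned recommendations.

```xml
<!-- Goes inside the <configuration> element of core-site.xml. -->
<property>
  <name>fs.s3a.fast.upload</name>
  <value>true</value>
</property>
<property>
  <!-- "disk" buffers blocks under fs.s3a.buffer.dir;
       "array" and "bytebuffer" buffer them in memory instead. -->
  <name>fs.s3a.fast.upload.buffer</name>
  <value>disk</value>
</property>
<property>
  <!-- Size in bytes of each buffered block (64 MB here, an illustrative value). -->
  <name>fs.s3a.multipart.size</name>
  <value>67108864</value>
</property>
<property>
  <!-- Blocks a single output stream may have queued or uploading at once (illustrative). -->
  <name>fs.s3a.fast.upload.active.blocks</name>
  <value>2</value>
</property>
<property>
  <!-- Upload thread pool size; a smaller pool means fewer blocks in flight (illustrative). -->
  <name>fs.s3a.threads.max</name>
  <value>5</value>
</property>
```

Smaller blocks, fewer active blocks per stream, and a smaller thread pool all lower the amount of data held locally at any moment, at the cost of upload throughput. In-memory buffering removes the disk requirement but moves the same pressure onto the JVM heap or off-heap memory.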