Cloud Data Access

Local Space Requirements for Copying to S3

When copying files to S3 using the S3A connector, DistCp first writes each file to a local temporary directory and then uploads it, so the local disk needs at least as much free space as the largest file being copied. The location of this intermediate directory is set by the fs.s3a.buffer.dir property; if needed, you can point it to a location with more free space.
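As a rough pre-flight check, the space requirement described above can be sketched in Python. This is only an illustration under stated assumptions: the function name enough_buffer_space is hypothetical, the file sizes are assumed to have been collected beforehand (for example from a directory listing), and buffer_dir is assumed to be the local path that fs.s3a.buffer.dir points to.

```python
import shutil


def enough_buffer_space(file_sizes_bytes, buffer_dir):
    """Return True if buffer_dir has room for the largest file to be copied.

    Hypothetical helper: file_sizes_bytes is an iterable of sizes in bytes,
    gathered by the caller; buffer_dir is assumed to be the local directory
    configured in fs.s3a.buffer.dir.
    """
    # The largest single file determines the minimum free space needed.
    largest = max(file_sizes_bytes, default=0)
    # shutil.disk_usage reports total, used, and free bytes for the
    # filesystem that contains buffer_dir.
    free = shutil.disk_usage(buffer_dir).free
    return free >= largest
```

The check covers a single file at a time; a node that buffers several files concurrently may need more headroom than this estimate suggests.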

When working with S3, you can reduce the amount of disk space needed by switching to the S3A fast upload mechanism, which needs only enough disk space to hold the blocks of data that have not yet been uploaded, or can buffer those blocks in memory instead. You can reduce the requirements further by decreasing the thread pool size.
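The sketch below shows one way these settings might be passed to DistCp as -D options. It assumes the property names documented for the Hadoop S3A connector (fs.s3a.fast.upload, fs.s3a.fast.upload.buffer, fs.s3a.threads.max); which of them are honored depends on your Hadoop version, so verify them against your release. The function name distcp_command and all values shown are illustrative, not recommendations.

```python
def distcp_command(src, dst, buffer_dir=None, buffer="disk", max_threads=None):
    """Build a 'hadoop distcp' argument list with S3A upload settings.

    Hypothetical helper. Property names are taken from the Hadoop S3A
    documentation; confirm they apply to the Hadoop version in use.
    """
    props = {
        # Turn on the fast upload mechanism.
        "fs.s3a.fast.upload": "true",
        # Where pending blocks are held: "disk", or an in-memory option
        # such as "array" or "bytebuffer".
        "fs.s3a.fast.upload.buffer": buffer,
    }
    if buffer_dir is not None:
        # Local directory used for buffering data before upload.
        props["fs.s3a.buffer.dir"] = buffer_dir
    if max_threads is not None:
        # A smaller thread pool means fewer blocks buffered at once.
        props["fs.s3a.threads.max"] = str(max_threads)

    cmd = ["hadoop", "distcp"]
    for key, value in props.items():
        cmd += ["-D", f"{key}={value}"]
    cmd += [src, dst]
    return cmd
```

For example, distcp_command("hdfs:///data", "s3a://example-bucket/data", buffer_dir="/mnt/tmp", max_threads=5) returns an argument list that could be handed to subprocess.run; the paths and bucket name here are placeholders.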