Using S3 as a Safe and Fast Destination of Work
Amazon S3 is an eventually consistent filesystem, which makes directory listings unreliable. It also
lacks a native rename() operation, so a rename is emulated by copying and then deleting objects, which
makes committing work very slow. To address these issues, the S3A connector has two features:
S3Guard: For consistent directory listings.
S3A Committers: For high-performance committing of the output of Spark queries to S3.
Without these, using S3 as a destination of work is slow and potentially unsafe.
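
As a rough illustration of how the two features above might be switched on for a Spark job, the sketch below collects the relevant properties into a dictionary and renders them as spark-submit arguments. The property names are the ones commonly documented for the Hadoop S3A connector and Spark's cloud integration module; the helper function names, the choice of the "directory" committer, and the use of the DynamoDB metadata store are assumptions for illustration. Whether these settings apply depends on the Hadoop and Spark versions in use and on the cloud integration classes being on the classpath.

```python
# Sketch only: helper names are hypothetical; values are illustrative assumptions.

def s3a_safe_output_conf(committer="directory", use_s3guard=True):
    """Return Spark properties that select an S3A committer and, optionally, S3Guard."""
    conf = {
        # Which S3A committer to use (assumed here: the "directory" staging committer).
        "spark.hadoop.fs.s3a.committer.name": committer,
        # Route Spark's commit protocol through the Hadoop path output committers.
        "spark.sql.sources.commitProtocolClass":
            "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol",
        # Parquet output needs a binding committer to pick up the S3A committer.
        "spark.sql.parquet.output.committer.class":
            "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter",
    }
    if use_s3guard:
        # S3Guard keeps listing metadata in DynamoDB for consistent directory listings.
        conf["spark.hadoop.fs.s3a.metadatastore.impl"] = (
            "org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore"
        )
    return conf


def as_spark_submit_args(conf):
    """Render the properties as a flat list of --conf key=value arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args
```

The resulting list can be appended to a spark-submit command line, or the dictionary entries can be copied into spark-defaults.conf.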