Accessing Cloud Data

Using Unique Filenames to Avoid File Update Inconsistency

When a dataset is updated and a new file overwrites an existing file of the same name, S3's eventual consistency on file updates means that a subsequent read may still return the old data.

To avoid this, give each newly created file a unique name.

The property fs.s3a.committer.staging.unique-filenames enables this.

<property>
  <name>fs.s3a.committer.staging.unique-filenames</name>
  <value>true</value>
</property>

It is set to true by default; you only need to disable it if you explicitly want newly created files to have the same names as their predecessors.
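
If the job output must reuse the names of earlier files, the option can be switched off. The following is a minimal sketch, assuming the property is set in the same configuration file as the example above. With this setting, overwritten files are again exposed to the stale-read behavior described earlier.

<property>
  <name>fs.s3a.committer.staging.unique-filenames</name>
  <value>false</value>
</property>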