Cloud Data Access
Also available as:
PDF
loading table of contents...

Errors Related to Visible S3A Inconsistency

S3 is an eventually consistent object store. That is, it is not a filesystem. It offers read-after-create consistency, which means that a newly created file is immediately visible. Except, there is a small quirk: a negative GET may be cached, such that even if an object is immediately created, the fact that there "wasn't" an object is still remembered.

That means the following sequence on its own will be consistent:

touch(path) -> getFileStatus(path)

But this sequence may be inconsistent:

getFileStatus(path) -> touch(path) -> getFileStatus(path)

A common source of visible inconsistencies is that the S3 metadata database — the part of S3 which serves list requests — is updated asynchronously. Newly added or deleted files may not be visible in the index, even though direct operations on the object (HEAD and GET) succeed.

In S3A, that means that the getFileStatus() and open() operations are more likely to be consistent with the state of the object store than any directory list operations (listStatus(), listFiles(), listLocatedStatus(), listStatusIterator()).

The following errors may be related to eventual consistency of S3.

FileNotFoundException, Even Though the File Was Just Written

This can be a sign of consistency problems. It may also surface if there is some asynchronous file write operation still in progress in the client: the operation has returned, but the write has not yet completed. While the S3A client code does block during the close() operation, we suspect that asynchronous writes may be taking place somewhere in the stack; This could explain why parallel tests fail more often than serialized tests.

File Not Found in a Directory Listing, Even Though getFileStatus() Finds It

File was not found in a directory listing, even though getFileStatus() finds it — or a deleted file is found in listing, even though getFileStatus() reports that it is not there.

This is a visible sign of updates to the metadata server, which is lagging behind the state of the underlying filesystem.

File Not Visible/Saved

The files in an object store are not visible until the write has been completed. In-progress writes are simply saved to a local file/cached in RAM and only uploaded at the end of a write operation. If a process terminated unexpectedly, or failed to call the close() method on an output stream, the pending data will have been lost.

File flush() and hflush() Calls Do Not Save Data to S3A

Again, this is due to the fact that the data is cached locally until the close() operation. The S3A filesystem cannot be used as a store of data if it is required that the data is persisted durably after everyflush()/hflush() call. This includes resilient logging, HBase-style journaling and the like. The standard strategy here is to save to HDFS and then copy to S3.