Reads (scans)
On a leader change, a READ_AT_SNAPSHOT scan at a snapshot whose timestamp is beyond the last
write may yield non-repeatable reads (see KUDU-1188).
Recommendation
If repeatable snapshot reads are a requirement, use READ_AT_SNAPSHOT with a timestamp that is
slightly in the past (ideally 2 to 5 seconds). This circumvents the anomaly described above.
Even once the anomaly has been addressed, back-dating the timestamp will still make scans
faster, since they are unlikely to block.
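As a minimal sketch of the back-dating arithmetic, the snippet below computes a snapshot timestamp a few seconds in the past. It assumes the client expects the snapshot timestamp in microseconds since the Unix epoch; the helper name `backdated_snapshot_micros` and its 3-second default are hypothetical, chosen only for illustration. The result would then be passed to whichever snapshot-timestamp setter the client API in use provides.

```python
import time

MICROS_PER_SECOND = 1_000_000


def backdated_snapshot_micros(now_micros, delay_seconds=3.0):
    """Return a snapshot timestamp `delay_seconds` before `now_micros`.

    Hypothetical helper: assumes timestamps are expressed in microseconds
    since the Unix epoch. The 3-second default sits inside the 2 to 5
    second window recommended above.
    """
    if delay_seconds < 0:
        raise ValueError("delay_seconds must not be negative")
    return now_micros - int(delay_seconds * MICROS_PER_SECOND)


# Current wall-clock time in microseconds, then back-date it.
now_micros = time.time_ns() // 1_000
snapshot_micros = backdated_snapshot_micros(now_micros)
```

The delay is kept as a parameter so that it can be tuned within the recommended window without touching the call sites.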
Impala scans are currently performed as READ_LATEST
and have no consistency guarantees.
In AUTO_BACKGROUND_FLUSH
mode, or when using "async" flushing mechanisms, writes applied to a single client session may
get reordered due to the concurrency of flushing the data to the server. This is particularly
noticeable if a single row is quickly updated with different values in succession. This
phenomenon affects all client API implementations. Workarounds are described in the respective
API documentation for FlushMode or AsyncKuduSession. See KUDU-1767.
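The effect of such reordering on a single row can be illustrated with a toy model. This is not Kudu code and does not use any client API: it simply assumes each write travels in its own batch and that the server applies batches in whatever order they arrive. The function name `final_value` and the sample values are hypothetical.

```python
def final_value(writes, arrival_order):
    """Return the value a row holds after all batches have been applied.

    writes: values in the order the client applied them, one batch each.
    arrival_order: indices into `writes`, in the order the batches are
    processed on the server side of this toy model.
    """
    row = None
    for idx in arrival_order:
        row = writes[idx]  # whichever batch arrives last determines the row
    return row


writes = ["v1", "v2"]  # the client applies v1 first, then v2

in_order = final_value(writes, [0, 1])   # batches arrive as applied
reordered = final_value(writes, [1, 0])  # second batch overtakes the first
```

In the reordered case the row ends up holding the older value `v1`, which is why quick successive updates to the same row make the problem most visible.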