CSE-KMS: Amazon S3-KMS managed encryption keys
Amazon S3 Client Side Encryption (S3-CSE) encrypts data on the client side and then transmits it to S3 storage. On reads, the same encrypted data is transmitted back to the client and decrypted there.
Introduction
S3-CSE uses AmazonS3EncryptionClientV2.java as the Amazon S3 client. Encryption and decryption are done by the AWS SDK. Currently, only the CSE-KMS method of client-side encryption is supported.
Previously, client-side encryption was unavailable because the AWS S3 client pads uploaded objects with a 16-byte footer. This meant that files were shorter when read than when listed through any of the list API calls or getFileStatus(). This broke many applications, including anything seeking near the end of a file to read a footer, as ORC and Parquet do.
There is now a workaround: compensate for the footer in listings when CSE is enabled.
- When listing files and directories, 16 bytes are subtracted from the length of all non-empty objects (greater than or equal to 16 bytes).
- Directory markers MAY be longer than 0 bytes.
As a result, file lengths listed through the S3A client with S3-CSE enabled are shorter than the lengths reported by other clients, including S3A clients where S3-CSE has not been enabled.
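As a rough illustration of the listing compensation described above, the following sketch subtracts the 16-byte padding from any object of at least 16 bytes and leaves shorter objects unchanged. The class, constant, and method names are hypothetical and chosen for this example; they are not taken from the S3A source.

```java
// Hypothetical helper for illustration only; not part of the S3A API.
final class CseLengthSketch {
    /** Padding the encryption client adds to each uploaded object, in bytes. */
    static final long CSE_PADDING_LENGTH = 16;

    /**
     * Returns the length to report in a listing for an object whose
     * stored (encrypted) size is storedLength bytes.
     */
    static long listedLength(long storedLength) {
        // Only objects at least as long as the padding are adjusted;
        // shorter objects (for example, empty ones) are reported as-is.
        if (storedLength >= CSE_PADDING_LENGTH) {
            return storedLength - CSE_PADDING_LENGTH;
        }
        return storedLength;
    }

    public static void main(String[] args) {
        System.out.println(listedLength(1040)); // prints 1024
        System.out.println(listedLength(0));    // prints 0
    }
}
```

Under this assumption, an object stored as 1040 bytes is listed as 1024 bytes, which matches what a reader sees after decryption.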
Features
- Supports client-side encryption with keys managed in AWS KMS.
- Encryption settings are propagated into jobs through any issued delegation tokens.
- Encryption information is stored as headers in the uploaded object.
Limitations
- Performance will be reduced, because all encryption and decryption is now done on the client.
- Writing files may be slower, as only a single block can be encrypted and uploaded at a time.
- Multipart Uploader API is disabled.
- S3 Select is not supported.
- Multipart uploads are serial, and partSize must be a multiple of 16 bytes.
- The maximum message size that can be encrypted under this mode is 2^36-32 bytes, or approximately 64 GiB, because of the security limits on AES/GCM recommended by NIST.
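The two numeric limits above can be checked with simple arithmetic. The sketch below is a minimal illustration, assuming hypothetical class and method names that do not come from the S3A or AWS SDK code: it tests whether a part size is a multiple of 16 bytes and whether a length fits within the 2^36-32 byte limit.

```java
// Hypothetical helper for illustration only; not part of the S3A or AWS SDK API.
final class CseLimitsSketch {
    /** Part sizes must be a multiple of this many bytes (the AES block size). */
    static final long PART_SIZE_MULTIPLE = 16;

    /** Maximum message size under AES/GCM: 2^36 - 32 bytes. */
    static final long MAX_MESSAGE_BYTES = (1L << 36) - 32;

    /** True if partSize is positive and a multiple of 16 bytes. */
    static boolean isValidPartSize(long partSize) {
        return partSize > 0 && partSize % PART_SIZE_MULTIPLE == 0;
    }

    /** True if a message of the given length is within the size limit. */
    static boolean fitsInOneMessage(long length) {
        return length >= 0 && length <= MAX_MESSAGE_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(MAX_MESSAGE_BYTES);                 // prints 68719476704
        System.out.println(isValidPartSize(8L * 1024 * 1024)); // prints true
        System.out.println(isValidPartSize(100));              // prints false
    }
}
```

An 8 MiB part size passes the check, while 100 bytes does not, since 100 is not divisible by 16.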