Configuring Per-Bucket Settings
You can specify bucket-specific configuration values which override the common configuration values.
This allows for:
Different authentication mechanisms and credentials on different buckets
Different encryption policies on different buckets
Different S3 endpoints for different buckets. This is essential when working with S3 regions that support only the "V4 authentication API", in which case callers must always declare the region explicitly.
All fs.s3a options other than a small set of unmodifiable values (currently fs.s3a.impl) can be set on a per-bucket basis.
To set a bucket-specific option:
Add a new configuration, replacing the fs.s3a. prefix on an option with fs.s3a.bucket.BUCKETNAME., where BUCKETNAME is the name of the bucket. For example, if you are configuring the access key for a bucket called "nightly", use the fs.s3a.bucket.nightly.access.key property name instead of fs.s3a.access.key.
When connecting to a bucket, all options explicitly set for that bucket override the base fs.s3a. values, but they are not picked up by other buckets.
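The same naming rule covers the endpoint and encryption cases listed earlier. The fragment below is a minimal sketch rather than a recommended setup: the bucket names "frankfurt-data" and "secure-logs" are hypothetical, and it assumes the first bucket lives in the EU (Frankfurt) region, which accepts only V4-signed requests, while the second should use SSE-S3 encryption. The underlying options are fs.s3a.endpoint and fs.s3a.server-side-encryption-algorithm.

```xml
<!-- Hypothetical bucket "frankfurt-data": point it at the region-specific endpoint. -->
<property>
  <name>fs.s3a.bucket.frankfurt-data.endpoint</name>
  <value>s3.eu-central-1.amazonaws.com</value>
</property>

<!-- Hypothetical bucket "secure-logs": enable SSE-S3 encryption for this bucket only. -->
<property>
  <name>fs.s3a.bucket.secure-logs.server-side-encryption-algorithm</name>
  <value>AES256</value>
</property>
```

Buckets without such entries would continue to use whatever fs.s3a.endpoint and fs.s3a.server-side-encryption-algorithm are set to in the base configuration.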
Example
You may have a base configuration to use the IAM role information available when deployed in Amazon EC2:
<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.SharedInstanceProfileCredentialsProvider</value>
</property>
This will be the default authentication mechanism for S3A buckets.
A bucket s3a://nightly/ used for nightly data uses a session key, so its bucket-specific configuration is:
<property>
  <name>fs.s3a.bucket.nightly.access.key</name>
  <value>AKAACCES-SKEY-2</value>
</property>
<property>
  <name>fs.s3a.bucket.nightly.secret.key</name>
  <value>SESSION-SECRET-KEY</value>
</property>
<property>
  <name>fs.s3a.bucket.nightly.session.token</name>
  <value>SHORT-LIVED-SESSION-TOKEN</value>
</property>
<property>
  <name>fs.s3a.bucket.nightly.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider</value>
</property>
Finally, the public s3a://landsat-pds/ bucket can be accessed anonymously, so its bucket-specific configuration is:
<property>
  <name>fs.s3a.bucket.landsat-pds.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider</value>
</property>
For all other buckets, the base configuration is used.
Related Links
Configuring Per-Bucket Settings to Access Data Around the World