Configuring Per-Bucket Settings

You can specify bucket-specific configuration values which override the common configuration values.

  1. Authentication mechanisms and credentials
  2. Encryption settings
  3. The specific S3 endpoints to send HTTP requests to. This is essential when working with S3 regions which only support the "V4 authentication API", in case of which callers must always declare the explicit region.

All fs.s3a options other than a small set of unmodifiable values (currently fs.s3a.impl) can be set on a per-bucket basis.

To set a bucket-specific option:
  1. Add a new configuration, replacing the fs.s3a. prefix on an option with fs.s3a.bucket.BUCKETNAME, where BUCKETNAME is the name of the bucket.

    For example, if you are configuring access key for a bucket called "nightly", instead of using fs.s3a.access.key property name, use fs.s3a.bucket.nightly.access.key.

  2. When connecting to a bucket, all options explicitly set for that bucket will override the base fs.s3a. values, but they will not be picked up by other buckets.

Example

You may have a base configuration to use the IAM role information available when deployed in Amazon EC2 VM or an AWS container service.

<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider</value>
</property>

This will be the default authentication mechanism for S3A buckets.

A bucket s3a://nightly/ used for nightly data uses a session key, so its bucket-specific configuration is:
<property>
  <name>fs.s3a.bucket.nightly.access.key</name>
  <value>AKAACCES-SKEY-2</value>
</property>

<property>
  <name>fs.s3a.bucket.nightly.secret.key</name>
  <value>SESSION-SECRET-KEY</value>
</property>

<property>
  <name>fs.s3a.bucket.nightly.session.token</name>
  <value>SHORT-LIVED-SESSION-TOKEN</value>
</property>

<property>
  <name>fs.s3a.bucket.nightly.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider</value>
</property>
Finally, the public s3a://landsat-pds/ bucket could be accessed anonymously, so its bucket-specific configuration is:
<property>
   <name>fs.s3a.bucket.landsat-pds.aws.credentials.provider</name>
   <value>org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider</value>
</property>

For all other buckets, the base configuration is used.