Authentication Failures
You may encounter the following S3 authentication issues.
Authentication Failure Due to Signature Mismatch
If Hadoop cannot authenticate with the S3 service endpoint, the client retries a number of times before eventually failing. When it finally gives up, it will report a message about signature mismatch:
com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we calculated does not match the signature you provided. Check your key and signing method. (Service: AmazonS3; StatusCode: 403; ErrorCode: SignatureDoesNotMatch,
The likely cause is that you either have the wrong credentials for one of the current authentication mechanisms, or the credentials were not readable on the host attempting to read or write the S3 bucket.
Enabling debug logging for the package org.apache.hadoop.fs.s3a can help provide more information.
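As a minimal sketch, assuming the cluster is configured through a log4j 1.x style log4j.properties file (its location varies by installation), debug logging for this package could be enabled with a line such as:

```properties
# Assumption: log4j 1.x properties syntax; add to the log4j.properties used by the client process.
log4j.logger.org.apache.hadoop.fs.s3a=DEBUG
```

Debug output can include request details, so treat the resulting logs as sensitive.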
The standard first step is to try the command line tools with the same credentials, through a command such as:
hadoop fs -ls s3a://my-bucket/
Note the trailing "/" here; without that the shell thinks you are trying to list your home directory under the bucket, which will only exist if explicitly created.
Attempting to list a bucket using inline credentials is a means of verifying that the key and secret can access a bucket:
hadoop fs -ls s3a://key:secret@my-bucket/
Do escape any + or / symbols in the secret, as discussed below. Never share the URL or logs generated using it, and never use such an inline authentication mechanism in production.
Finally, if you set the AWS environment variables, you can take advantage of S3A's support of environment-variable authentication by attempting the same ls operation; that is, unset the fs.s3a secrets and rely on the environment variables.
Make sure that the name of the bucket is the correct one. That is, check the URL.
Make sure the property names are correct. For S3A, they are fs.s3a.access.key and fs.s3a.secret.key. You cannot just copy the S3N properties and replace s3n with s3a.
Make sure that the properties are visible to the process attempting to talk to the object store. Placing them in core-site.xml is the standard mechanism.
If using session authentication, the session may have expired. Generate a new session token and secret.
If using environment variable-based authentication, make sure that the relevant variables are set in the environment in which the process is running.
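Several of the checks above can be scripted. The following is a minimal sketch, not part of Hadoop: the function names are hypothetical, and the environment variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the standard AWS SDK names, which the text above does not spell out. It only inspects the XML text and environment it is given, so it cannot prove that a running Hadoop process sees the same values.

```python
import os
import xml.etree.ElementTree as ET

# Property names taken from the checklist above.
S3A_KEYS = ("fs.s3a.access.key", "fs.s3a.secret.key")
# Assumption: the standard AWS SDK environment variable names.
ENV_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def configured_properties(core_site_xml):
    """Return the set of property names defined in a core-site.xml document."""
    root = ET.fromstring(core_site_xml)
    names = set()
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            names.add(name.strip())
    return names


def missing_s3a_keys(core_site_xml):
    """List the S3A credential properties that the document does not define."""
    defined = configured_properties(core_site_xml)
    return [key for key in S3A_KEYS if key not in defined]


def missing_env_keys(environ=None):
    """List the AWS credential variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [key for key in ENV_KEYS if not env.get(key)]
```

For example, passing the contents of core-site.xml to missing_s3a_keys() returns the S3A credential properties that are absent, and calling missing_env_keys() with no arguments checks the environment of the current process.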
There are a couple of system configuration problems (JVM version, system clock) that you should check.
Authentication Failure Due to Clock Skew
A timestamp is used when signing requests to S3, so as to defend against replay attacks. If the system clock is too far behind or ahead of Amazon's, requests will be rejected.
This can surface as read requests being allowed while operations that write to the bucket are denied.
Solution: Check the system clock.
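One way to check the clock is to compare it with the Date header that any HTTP response from the S3 endpoint carries. The sketch below assumes you have already captured such a header value, for example from debug logs. The function names are hypothetical, and the 15 minute limit is an assumption based on commonly documented S3 behavior, not something stated in this section.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Assumption: S3 tolerates roughly 15 minutes of skew; this section does not state the limit.
MAX_SKEW_SECONDS = 15 * 60


def clock_skew_seconds(date_header, local_now=None):
    """Return local time minus server time, in seconds (positive if the local clock is ahead)."""
    server_time = parsedate_to_datetime(date_header)
    now = local_now if local_now is not None else datetime.now(timezone.utc)
    return (now - server_time).total_seconds()


def is_too_skewed(date_header, local_now=None):
    """Return True if the absolute skew exceeds the assumed limit."""
    return abs(clock_skew_seconds(date_header, local_now)) > MAX_SKEW_SECONDS
```

If the skew is large, synchronize the host clock (for example with NTP) and retry the operation.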
Authentication Failure When Using URLs with Embedded Secrets
If you are using the strongly discouraged mechanism of including the AWS key and secret in a URL, make sure that both "+" and "/" symbols are encoded in the URL. As many AWS secrets include these characters, encoding problems are not uncommon.
Use this table for conversion:
Symbol | Encoded Value |
---|---|
+ | %2B |
/ | %2F |
For example, a URL for an S3 bucket with AWS ID user1 and secret a+b/c will be represented as
s3a://user1:a%2Bb%2Fc@bucket
You only need to use this technique when placing secrets in the URL.
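The conversion in the table can be done with a standard percent-encoding routine rather than by hand. This is a minimal sketch for debugging only: the helper name s3a_url_with_secrets is hypothetical, and the resulting URL should never be shared or used in production.

```python
from urllib.parse import quote


def s3a_url_with_secrets(access_key, secret_key, bucket):
    """Build an s3a:// URL with an inline key and a percent-encoded secret."""
    # safe="" makes quote() encode "/" as well as "+"; by default "/" is left alone.
    encoded_secret = quote(secret_key, safe="")
    return f"s3a://{access_key}:{encoded_secret}@{bucket}"


# The example from the text above: ID user1, secret a+b/c.
print(s3a_url_with_secrets("user1", "a+b/c", "bucket"))
# prints s3a://user1:a%2Bb%2Fc@bucket
```

Letters and digits are left unchanged, so for a secret made of letters, digits, "+" and "/" the only characters rewritten are the two listed in the table.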
Authentication Failures When Running on Java 8u60+
A change in the Java 8 JVM broke some of the toString() string generation of Joda Time 2.8.0, which stopped the Amazon S3 client from being able to generate authentication headers suitable for validation by S3.
Solution: Make sure that the version of Joda Time is 2.8.1 or later, or use a new version of Java 8.
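To find out which Joda Time version is on the classpath, you can scan the classpath entries for the jar name. The sketch below assumes the jar follows the usual joda-time-VERSION.jar naming and that the classpath string lists jars explicitly. The function names are hypothetical, and wildcard entries such as lib/* would need to be expanded first.

```python
import re

MIN_JODA = (2, 8, 1)  # first fixed release, per the solution above
# Assumption: jars are named joda-time-<version>.jar, as in the Maven artifact.
_JODA_JAR = re.compile(r"joda-time-(\d+(?:\.\d+)*)\.jar$")


def joda_versions(classpath, sep=":"):
    """Return the Joda Time versions found in a classpath string, as integer tuples."""
    versions = []
    for entry in classpath.split(sep):
        match = _JODA_JAR.search(entry)
        if match:
            versions.append(tuple(int(part) for part in match.group(1).split(".")))
    return versions


def outdated_joda(classpath, sep=":"):
    """Return the versions older than 2.8.1 found in the classpath."""
    return [version for version in joda_versions(classpath, sep) if version < MIN_JODA]
```

An empty result from outdated_joda() only means that no old jar was visible by name in the string provided; it does not rule out a copy bundled inside another jar.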