Referencing S3 in the URLs

Regardless of which Hadoop ecosystem application you are using, you can access data stored in Amazon S3 with a URL that starts with the s3a:// prefix, followed by the bucket name and the path to the file or directory.

The URL structure is:

s3a://<bucket>/<dir>/<file>

For example, to access a file called "mytestfile" in a directory called "mytestdir", which is stored in a bucket called "mytestbucket", the URL is:

s3a://mytestbucket/mytestdir/mytestfile
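
To make the URL structure concrete, the following is a minimal sketch that splits an s3a:// URL into its bucket and path parts using POSIX shell parameter expansion. The variable names (url, rest, bucket, path) are illustrative and not part of any Hadoop tool; the sketch assumes the URL contains at least one slash after the bucket name.

```shell
#!/bin/sh
# Split an s3a:// URL into bucket and path (illustrative variable names).
url="s3a://mytestbucket/mytestdir/mytestfile"

rest="${url#s3a://}"   # strip the scheme prefix
bucket="${rest%%/*}"   # everything before the first slash
path="${rest#*/}"      # everything after the first slash

echo "bucket: $bucket"
echo "path:   $path"
```

Everything between the s3a:// prefix and the first slash is treated as the bucket name; the remainder is the path to the file or directory within that bucket.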

The following FileSystem shell commands demonstrate access to a bucket named mytestbucket:

hadoop fs -ls s3a://mytestbucket/

hadoop fs -mkdir s3a://mytestbucket/testDir

hadoop fs -put testFile s3a://mytestbucket/testFile

hadoop fs -cat s3a://mytestbucket/testFile
test file content
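
The commands above can be combined into a small round-trip check. The following is a sketch only: the helper names s3a_url and round_trip and the BUCKET variable are hypothetical, and it assumes the hadoop command is on the PATH, S3 credentials are already configured, and a local file named testFile exists. Note that -put fails if the destination already exists.

```shell
#!/bin/sh
# Round-trip sketch; BUCKET and the helper names are placeholders.
BUCKET="mytestbucket"

# Build an s3a:// URL for a path inside the bucket.
s3a_url() {
    printf 's3a://%s/%s\n' "$BUCKET" "$1"
}

# Upload a local file, confirm it exists in S3, and print it back.
round_trip() {
    local_file="$1"
    remote="$(s3a_url "$(basename "$local_file")")"
    hadoop fs -put "$local_file" "$remote" || return 1
    hadoop fs -test -e "$remote" || return 1   # exit status 0 if the path exists
    hadoop fs -cat "$remote"
}

# Only touch the bucket when the hadoop CLI and the local file are present.
if command -v hadoop >/dev/null 2>&1 && [ -f testFile ]; then
    round_trip testFile
fi
```

Keeping the URL construction in one helper avoids repeating the s3a:// prefix and bucket name in every command.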

