Referencing S3 in the URLs
Regardless of which Hadoop ecosystem application you are using, you can access data stored in Amazon S3 with a URL that starts with the s3a:// prefix, followed by the bucket name and the path to the file or directory.
The URL structure is:
s3a://<bucket>/<dir>/<file>
For example, to access a file called "mytestfile" in a directory called "mytestdir", which is stored in a bucket called "mytestbucket", the URL is:
s3a://mytestbucket/mytestdir/mytestfile
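The URL structure above can be sketched as a small shell helper. This is only an illustration: the function name s3a_url is hypothetical (it is not part of Hadoop), and it assumes the bucket name and path are already valid and need no escaping.

```shell
# Hypothetical helper (not part of Hadoop): builds an s3a:// URL
# from a bucket name and an optional path inside the bucket.
s3a_url() {
  bucket="$1"
  path="${2:-}"      # the path is optional; default to empty
  path="${path#/}"   # drop one leading slash, if present, to avoid "//"
  printf 's3a://%s/%s\n' "$bucket" "$path"
}

# Prints the example URL used in this section.
s3a_url mytestbucket mytestdir/mytestfile
```

The result can be passed to any command that accepts a Hadoop path, for example hadoop fs -ls "$(s3a_url mytestbucket mytestdir)".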
The following FileSystem shell commands demonstrate access to a bucket named mytestbucket:
hadoop fs -ls s3a://mytestbucket/
hadoop fs -mkdir s3a://mytestbucket/testDir
hadoop fs -put testFile s3a://mytestbucket/testFile
hadoop fs -cat s3a://mytestbucket/testFile
test file content