Accessing Cloud Data

Running FS Shell Commands

Many of the standard Hadoop FileSystem shell commands that interact with HDFS can also be used to interact with cloud object stores. They are useful for several purposes, including confirming that authentication with your cloud service works, debugging, browsing files and creating directories (as an alternative to the cloud service-specific tools), and other management operations.
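
For example, a quick way to verify that your credentials are configured correctly is to list the root of a bucket (using the "bucket1" bucket from the examples below); a failure here usually points to an authentication or permissions problem:

# List the bucket root to confirm that authentication works
hadoop fs -ls s3a://bucket1/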

When running the commands, provide a fully qualified URL. The commands use the following syntax:

hadoop fs -<operation> URL

where <operation> indicates a particular action to be performed against a directory or a file.

For example, the following command lists all files in a directory called "dir1", which resides in an Amazon S3 bucket called "bucket1":

hadoop fs -ls s3a://bucket1/dir1
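
The same syntax applies to the other cloud storage connectors; only the URL scheme and authority change. For example, with the WASB connector for Azure Blob Storage (the container and account names below are placeholders):

hadoop fs -ls wasb://container1@account1.blob.core.windows.net/dir1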

Examples

Create directories and create or copy files into them:

# Create a directory
hadoop fs -mkdir s3a://bucket1/datasets

# Upload a file from the cluster filesystem
hadoop fs -put /datasets/example.orc s3a://bucket1/datasets/

# Touch a file
hadoop fs -touchz s3a://bucket1/datasets/touched
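
To confirm that the files landed where you expect, list the new directory:

# List the contents of the directory
hadoop fs -ls s3a://bucket1/datasets/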

Download and view objects:

# Copy a directory to the local filesystem (the destination defaults to the current working directory)
hadoop fs -copyToLocal s3a://bucket1/datasets/

# Copy a file from the object store to the local filesystem
hadoop fs -get s3a://bucket1/hello.txt /examples

# Print the object
hadoop fs -cat s3a://bucket1/hello.txt

# Print the object, decompressing it if necessary
hadoop fs -text s3a://bucket1/hello.txt

# Download log files into a local file
hadoop fs -getmerge s3a://bucket1/logs\* log.txt
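
Delete objects when they are no longer needed (deletion behaves differently on object stores than on HDFS; see "Deleting Files on Cloud Object Stores" in the related links):

# Delete a single object
hadoop fs -rm s3a://bucket1/datasets/example.orc

# Delete a directory and its contents
hadoop fs -rm -r s3a://bucket1/datasets/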

Related Links

Commands That May Be Slower with Cloud Object Storage

Unsupported Filesystem Operations

Deleting Files on Cloud Object Stores

Overwriting Objects on Amazon S3

Timestamps on Cloud Object Stores