Accessing Cloud Data

Working with Local Stores

A foundational step to getting good performance is working with stores close to the Hadoop cluster, where "close" is measured in network terms.

Performance is best when the Azure containers and S3 buckets you use are in the same cloud region as the in-cloud cluster. For example, if your cluster is in AWS North Virginia ("US East"), you will achieve the best performance if your S3 bucket is in the same region.
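As a rough illustration of checking that a bucket is local, the sketch below compares a bucket's reported location with the cluster's region. It assumes the location value has the form returned by the S3 GetBucketLocation API, where an empty value stands for us-east-1 and the legacy value "EU" stands for eu-west-1. The function names are hypothetical and are not part of Hadoop or any AWS SDK.

```python
from typing import Optional

# Legacy values that S3 GetBucketLocation can report in LocationConstraint.
# An empty/None value means the bucket lives in us-east-1.
_LEGACY_LOCATIONS = {None: "us-east-1", "": "us-east-1", "EU": "eu-west-1"}


def normalize_bucket_region(location_constraint: Optional[str]) -> str:
    """Map an S3 LocationConstraint value to a standard region name.

    Hypothetical helper for illustration only.
    """
    return _LEGACY_LOCATIONS.get(location_constraint, location_constraint)


def is_local_bucket(location_constraint: Optional[str], cluster_region: str) -> bool:
    """Return True when the bucket is in the same region as the cluster."""
    bucket_region = normalize_bucket_region(location_constraint)
    return bucket_region == cluster_region.strip().lower()


if __name__ == "__main__":
    # A cluster in AWS North Virginia and a bucket whose location is reported as empty.
    print(is_local_bucket(None, "us-east-1"))
    # The same cluster and a bucket in Oregon: reads would cross regions.
    print(is_local_bucket("us-west-2", "us-east-1"))
```

In practice the location value could be fetched with an AWS client, for example the `LocationConstraint` field returned by boto3's `get_bucket_location`. That call is left out here so the sketch stays runnable without credentials.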

In addition to improving performance, working with local buckets avoids the data transfer charges that cloud providers typically bill for reading data across regions.