
Improving Performance for S3A

You can consider various options for improving performance when working with data stored in Amazon S3.

The bandwidth between the Cloudera cluster and Amazon S3 is the upper limit on how fast data can be copied into S3. The farther the Cloudera cluster is from the Amazon S3 installation, or the narrower the network connection, the longer the operation takes. Even a Cloudera cluster deployed within Amazon's own infrastructure may encounter network delays from throttled VM network connections.
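
As a rough illustration of the bandwidth ceiling, the following sketch estimates the minimum time an upload can take on a given link. The function name and the figures used (1 TB of data, a 1 Gbit/s link) are hypothetical and chosen only for the arithmetic. Real transfers will be slower because of protocol overhead, throttling, and latency.

```python
def min_transfer_seconds(size_bytes: int, bandwidth_bits_per_sec: float) -> float:
    """Lower bound on transfer time: payload size in bits divided by link rate."""
    if bandwidth_bits_per_sec <= 0:
        raise ValueError("bandwidth must be positive")
    return size_bytes * 8 / bandwidth_bits_per_sec


# Hypothetical figures: 1 TB (10**12 bytes) over a 1 Gbit/s link.
seconds = min_transfer_seconds(10**12, 1e9)
print(f"{seconds / 3600:.1f} hours")  # 8000 s, printed as "2.2 hours"
```

No amount of client-side tuning can bring the transfer below this bound. Tuning only helps the upload get closer to it.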

Network bandwidth limits notwithstanding, several options can be used to tune the performance of an upload: