Connect to external Amazon S3 buckets

Every language in Cloudera Machine Learning has libraries available for uploading to and downloading from Amazon S3.

To work with external S3 buckets in Python, do the following:

Python

# Install Boto3 in the project
%pip install boto3

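import boto3

# Create a low-level S3 client; boto3 resolves credentials through its default chain
s3 = boto3.client('s3')

# Print out bucket names
for bucket in s3.list_buckets()['Buckets']:
    print(bucket['Name'])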

# Download the object OBJECT_NAME from the bucket BUCKET_NAME to the local path FILE_NAME
s3.download_file('BUCKET_NAME', 'OBJECT_NAME', 'FILE_NAME')
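
The introduction mentions uploading as well as downloading, so here is a minimal sketch of the upload direction. It relies on the boto3 client method `upload_file(Filename, Bucket, Key)`. The helper name `upload_to_s3` and the choice to default the object key to the file's base name are assumptions made for illustration and are not part of the original page.

```python
import os


def upload_to_s3(s3_client, file_name, bucket_name, object_name=None):
    """Upload a local file to an S3 bucket (hypothetical helper)."""
    # Assumption: when no object key is given, reuse the local file's base name
    if object_name is None:
        object_name = os.path.basename(file_name)
    # boto3 client call: upload_file(Filename, Bucket, Key)
    s3_client.upload_file(file_name, bucket_name, object_name)
    return object_name


# Usage with a real client (requires boto3 and valid AWS credentials):
#   import boto3
#   s3 = boto3.client('s3')
#   upload_to_s3(s3, 'FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME')
```

Passing the client in as an argument keeps the helper free of credential handling, so the same `s3` client created for listing and downloading can be reused for uploads.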