Every language in Cloudera Data Science Workbench has libraries available for
uploading to and downloading from Amazon S3.
- Add your Amazon Web Services access keys to your project's environment variables
as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
- Pick your favorite language from the code samples below. Each one downloads the R 'Old
Faithful' dataset from S3.
R
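# Install the AWS.tools package from GitHub.
library("devtools")
install_github("armstrtw/AWS.tools")
# Copy the project's access keys into the variable names set here for AWS.tools.
Sys.setenv("AWSACCESSKEY"=Sys.getenv("AWS_ACCESS_KEY_ID"))
Sys.setenv("AWSSECRETKEY"=Sys.getenv("AWS_SECRET_ACCESS_KEY"))
# Load the package and download the dataset.
library("AWS.tools")
s3.get("s3://sense-files/faithful.csv")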
Python
# Install Boto to the project
!pip install boto
# Create the Boto S3 connection object. With no arguments, it reads the
# access keys from AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
from boto.s3.connection import S3Connection
aws_connection = S3Connection()
# Download the dataset to file 'faithful.csv'.
bucket = aws_connection.get_bucket('sense-files')
key = bucket.get_key('faithful.csv')
key.get_contents_to_filename('/home/cdsw/faithful.csv')
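
Both samples rely on the two access-key variables from the first step being set in the project's environment. As a minimal sketch, a check like the one below could be run before connecting to S3. The helper name missing_aws_credentials is hypothetical and used only for illustration; it is not part of Boto or of Cloudera Data Science Workbench.

```python
import os

# Variable names taken from the setup step above.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_credentials(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of Boto.
    """
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_credentials()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("AWS credentials found in the environment.")
```

If the check reports a missing name, add it to the project's environment variables and start a new session before retrying the download.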