Using a Credential Provider to Secure S3 Credentials
You can run the sqoop command without entering the access key and secret key on the command line. This prevents these credentials from being exposed in the console output, log files, configuration files, and other artifacts. Running the command this way requires that you provision a credential store to securely store the access key and secret key. The credential store file is saved in HDFS.
To provision credentials in a credential store:
- Provision the credentials by running the following commands:
hadoop credential create fs.s3a.access.key -value access_key -provider jceks://hdfs/path_to_credential_store_file
hadoop credential create fs.s3a.secret.key -value secret_key -provider jceks://hdfs/path_to_credential_store_file
For example:
hadoop credential create fs.s3a.access.key -value foobar -provider jceks://hdfs/user/alice/home/keystores/aws.jceks
hadoop credential create fs.s3a.secret.key -value barfoo -provider jceks://hdfs/user/alice/home/keystores/aws.jceks
You can omit the -value option and its value. When the option is omitted, the command prompts you to enter the value. You can verify the stored aliases afterward, as shown in the example following this procedure.
- Copy the contents of the /etc/hadoop/conf directory to a working directory.
- Add the following to the core-site.xml file in the working directory:
<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://hdfs/path_to_credential_store_file</value>
</property>
- Set the HADOOP_CONF_DIR environment variable to the location of the working directory:
export HADOOP_CONF_DIR=path_to_working_directory
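After the credentials are provisioned, you can confirm that the aliases were stored by listing the contents of the credential store. This is an optional check, shown here with the example store path used above:
hadoop credential list -provider jceks://hdfs/user/alice/home/keystores/aws.jceks
The command lists the alias names (fs.s3a.access.key and fs.s3a.secret.key) without displaying the stored values.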
After completing these steps, you can run the sqoop command using the following syntax.
Import into a target directory in an Amazon S3 bucket while credentials are stored in a credential store file and its path is set in the core-site.xml file:
sqoop import --connect $CONN --username $USER --password $PWD --table $TABLENAME --target-dir s3a://example-bucket/target-directory
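Before running the import, you can optionally confirm that the S3 credentials resolve from the credential store by listing the bucket. This is a sketch of an optional check, assuming the example bucket name used above; because HADOOP_CONF_DIR points at the working directory, the command picks up the credential provider path from the modified core-site.xml file:
hadoop fs -ls s3a://example-bucket/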
You can also reference the credential store on the command line, without having to enter it in a copy of the core-site.xml file. You also do not have to set a value for HADOOP_CONF_DIR. Use the following syntax.
Import into a target directory in an Amazon S3 bucket while credentials are stored in a credential store file and its path is passed on the command line:
sqoop import -Dhadoop.security.credential.provider.path=jceks://hdfs/path_to_credential_store_file --connect $CONN --username $USER --password $PWD --table $TABLENAME --target-dir s3a://example-bucket/target-directory
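The same command-line approach works with other Hadoop tools. For example, as an optional check of S3 access before running the import (assuming the example bucket and the credential store path shown above), you can pass the provider path with the generic -D option:
hadoop fs -D hadoop.security.credential.provider.path=jceks://hdfs/path_to_credential_store_file -ls s3a://example-bucket/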