A more secure way to store your OAuth credentials is with the
Hadoop CredentialProvider. When you submit a job, reference the
CredentialProvider, which then supplies the OAuth information.
Unlike the core-site.xml
, the credentials are not
stored in plain text.
The following steps describe how to create a credential provider
and how to reference it when submitting jobs:
- Create a password for the Hadoop Credential Provider and
export it to the environment:
export HADOOP_CREDSTORE_PASSWORD=password
- Provision the credentials by running the following commands:
hadoop credential create dfs.adls.oauth2.client.id -provider jceks://hdfs/user/USER_NAME/adls2keyfile.jceks -value client ID
hadoop credential create dfs.adls.oauth2.credential -provider jceks://hdfs/user/USER_NAME/adls2keyfile.jceks -value client secret
hadoop credential create dfs.adls.oauth2.refresh.url -provider jceks://hdfs/user/USER_NAME/adls2keyfile.jceks -value refresh URL
You can omit the -value
option and its value and
the command will prompt the user to enter the value.
For more details on the hadoop credential
command,
see Credential Management (Apache
Software Foundation).
- Export the password to the environment:
export HADOOP_CREDSTORE_PASSWORD=password
- Reference the credential provider on the command line when
you submit a job:
hadoop <command>
-Ddfs.adls.oauth2.access.token.provider.type=ClientCredential \
-Dhadoop.security.credential.provider.path=jceks://hdfs/user/USER_NAME/adls-cred.jceks \
abfs[s]://<file_system>@<account_name>.dfs.core.windows.net/<path>/<file_name>