To set up Wire Encryption for Hadoop, complete the following steps.
Create HTTPS certificates and keystore/truststore files.
For each host in the cluster, create a directory for storing the keystore and truststore, for example $SERVER_KEY_LOCATION. Also create a directory to store the public certificate, for example $CLIENT_KEY_LOCATION.
mkdir -p $SERVER_KEY_LOCATION ; mkdir -p $CLIENT_KEY_LOCATION
For example:
ssh host1.hwx.com "mkdir -p /etc/security/serverKeys ; mkdir -p /etc/security/clientKeys"
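If passwordless SSH is set up from an administration host, the same directories can be created across the cluster in one pass. The host names below are examples only:
for h in host1.hwx.com host2.hwx.com ; do
  # Create the server and client key directories on each host.
  ssh $h "mkdir -p /etc/security/serverKeys /etc/security/clientKeys"
done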
For each host, create a keystore file.
cd $SERVER_KEY_LOCATION ; keytool -genkey -alias $hostname -keyalg RSA -keysize 1024 -dname "CN=$hostname,OU=hw,O=hw,L=paloalto,ST=ca,C=us" -keypass $SERVER_KEYPASS_PASSWORD -keystore $KEYSTORE_FILE -storepass $SERVER_STOREPASS_PASSWORD
For each host, export the certificate public key to a certificate file.
cd $SERVER_KEY_LOCATION ; keytool -export -alias $hostname -keystore $KEYSTORE_FILE -rfc -file $CERTIFICATE_NAME -storepass $SERVER_STOREPASS_PASSWORD
For each host, import the certificate into the truststore file.
cd $SERVER_KEY_LOCATION ; keytool -import -noprompt -alias $hostname -file $CERTIFICATE_NAME -keystore $TRUSTSTORE_FILE -storepass $SERVER_TRUSTSTORE_PASSWORD
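Run on each host, the three keytool commands above can be combined into a single script. This is a sketch only; the paths, file names, and passwords below are example values, not requirements:
# Example values -- replace with your own paths and passwords.
SERVER_KEY_LOCATION=/etc/security/serverKeys
KEYSTORE_FILE=keystore.jks
TRUSTSTORE_FILE=truststore.jks
CERTIFICATE_NAME=$(hostname -f).crt
SERVER_KEYPASS_PASSWORD=changeit
SERVER_STOREPASS_PASSWORD=changeit
SERVER_TRUSTSTORE_PASSWORD=changeit
hostname=$(hostname -f)
cd $SERVER_KEY_LOCATION
# Generate the host key pair inside the keystore.
keytool -genkey -alias $hostname -keyalg RSA -keysize 1024 -dname "CN=$hostname,OU=hw,O=hw,L=paloalto,ST=ca,C=us" -keypass $SERVER_KEYPASS_PASSWORD -keystore $KEYSTORE_FILE -storepass $SERVER_STOREPASS_PASSWORD
# Export the public certificate for this host.
keytool -export -alias $hostname -keystore $KEYSTORE_FILE -rfc -file $CERTIFICATE_NAME -storepass $SERVER_STOREPASS_PASSWORD
# Import the certificate into this host's own truststore.
keytool -import -noprompt -alias $hostname -file $CERTIFICATE_NAME -keystore $TRUSTSTORE_FILE -storepass $SERVER_TRUSTSTORE_PASSWORD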
Create a single truststore file containing the public certificates of all hosts. Log in to host1 and import the certificate for host1.
keytool -import -noprompt -alias $host -file $CERTIFICATE_NAME -keystore $ALL_JKS -storepass $CLIENT_TRUSTSTORE_PASSWORD
Copy $ALL_JKS from host1 to the other hosts, and repeat the above command. For example, for a 2-node cluster with host1 and host2:
Create $ALL_JKS on host1.
keytool -import -noprompt -alias $host -file $CERTIFICATE_NAME -keystore $ALL_JKS -storepass $CLIENT_TRUSTSTORE_PASSWORD
Copy over $ALL_JKS from host1 to host2. $ALL_JKS already has the certificate entry of host1.
Import the certificate entry of host2 into $ALL_JKS using the same command as before:
keytool -import -noprompt -alias $host -file $CERTIFICATE_NAME -keystore $ALL_JKS -storepass $CLIENT_TRUSTSTORE_PASSWORD
Copy over the updated $ALL_JKS from host2 to host1.
Note Repeat these steps for each node in the cluster. When you are finished, the $ALL_JKS file on host1 will have the certificates of all nodes.
Copy over the $ALL_JKS file from host1 to all the nodes.
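An equivalent way to build and distribute the common truststore is to pull every host's certificate to host1, import each one into $ALL_JKS, and then push the finished file out to all nodes. The sketch below assumes passwordless SSH/scp, that each host's certificate is named <hostname>.crt as in the sketch above, and uses placeholder host names, paths, and password:
HOSTS="host1.hwx.com host2.hwx.com"           # example host list
CLIENT_KEY_LOCATION=/etc/security/clientKeys  # example path
ALL_JKS=$CLIENT_KEY_LOCATION/all.jks
CLIENT_TRUSTSTORE_PASSWORD=changeit           # example password
for host in $HOSTS ; do
  # Pull each host's public certificate to host1 ...
  scp $host:/etc/security/serverKeys/$host.crt /tmp/$host.crt
  # ... and import it into the common truststore.
  keytool -import -noprompt -alias $host -file /tmp/$host.crt -keystore $ALL_JKS -storepass $CLIENT_TRUSTSTORE_PASSWORD
done
# Push the finished truststore to every node.
for host in $HOSTS ; do
  scp $ALL_JKS $host:$CLIENT_KEY_LOCATION/
done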
Validate the common truststore file on all hosts.
keytool -list -v -keystore $ALL_JKS -storepass $CLIENT_TRUSTSTORE_PASSWORD
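To check all nodes at once, the listing can be run over SSH and the trusted certificate entries counted; every node should report the same number, equal to the number of hosts in the cluster. Host names, path, and password are examples:
ALL_JKS=/etc/security/clientKeys/all.jks   # example path
CLIENT_TRUSTSTORE_PASSWORD=changeit        # example password
for host in host1.hwx.com host2.hwx.com ; do
  # Count trusted certificate entries on each node.
  echo -n "$host: "
  ssh $host "keytool -list -keystore $ALL_JKS -storepass $CLIENT_TRUSTSTORE_PASSWORD" | grep -c trustedCertEntry
done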
Set permissions and ownership on the keys:
chown -R $YARN_USER:hadoop $SERVER_KEY_LOCATION
chown -R $YARN_USER:hadoop $CLIENT_KEY_LOCATION
chmod 755 $SERVER_KEY_LOCATION
chmod 755 $CLIENT_KEY_LOCATION
chmod 440 $KEYSTORE_FILE
chmod 440 $TRUSTSTORE_FILE
chmod 440 $CERTIFICATE_NAME
chmod 444 $ALL_JKS
Note The complete paths of $SERVER_KEY_LOCATION and $CLIENT_KEY_LOCATION, starting from the root directory /etc, must be owned by the $YARN_USER user and the hadoop group.
Enable HTTPS by setting the following properties.
Set the following properties in core-site.xml. For example, if you are using Ambari, set the properties as follows:
hadoop.ssl.require.client.cert=false
hadoop.ssl.hostname.verifier=DEFAULT
hadoop.ssl.keystores.factory.class=org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory
hadoop.ssl.server.conf=ssl-server.xml
hadoop.ssl.client.conf=ssl-client.xml
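After saving the changes and redeploying the client configuration, one way to spot-check that the values took effect is hdfs getconf, which prints the effective value of a configuration key:
hdfs getconf -confKey hadoop.ssl.require.client.cert
hdfs getconf -confKey hadoop.ssl.hostname.verifier
hdfs getconf -confKey hadoop.ssl.server.conf
hdfs getconf -confKey hadoop.ssl.client.conf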
Set the following properties in ssl-server.xml. For example, if you are using Ambari, set the properties as follows:
ssl.server.truststore.location=/etc/security/serverKeys/truststore.jks
ssl.server.truststore.password=serverTrustStorePassword
ssl.server.truststore.type=jks
ssl.server.keystore.location=/etc/security/serverKeys/keystore.jks
ssl.server.keystore.password=serverStorePassPassword
ssl.server.keystore.type=jks
ssl.server.keystore.keypassword=serverKeyPassPassword
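If you are editing the configuration files by hand rather than through Ambari, ssl-server.xml uses the standard Hadoop XML property format. The sketch below writes the file with the example values above; the /etc/hadoop/conf path and the passwords are assumptions to adapt to your installation, and ssl-client.xml in the next step uses the same format:
# Writes ssl-server.xml with the example values; adjust path and passwords.
cat > /etc/hadoop/conf/ssl-server.xml <<'EOF'
<configuration>
  <property><name>ssl.server.truststore.location</name><value>/etc/security/serverKeys/truststore.jks</value></property>
  <property><name>ssl.server.truststore.password</name><value>serverTrustStorePassword</value></property>
  <property><name>ssl.server.truststore.type</name><value>jks</value></property>
  <property><name>ssl.server.keystore.location</name><value>/etc/security/serverKeys/keystore.jks</value></property>
  <property><name>ssl.server.keystore.password</name><value>serverStorePassPassword</value></property>
  <property><name>ssl.server.keystore.type</name><value>jks</value></property>
  <property><name>ssl.server.keystore.keypassword</name><value>serverKeyPassPassword</value></property>
</configuration>
EOF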
Set the following properties in ssl-client.xml. For example, if you are using Ambari, set the properties as follows:
ssl.client.truststore.location=/etc/security/clientKeys/all.jks
ssl.client.truststore.password=clientTrustStorePassword
ssl.client.truststore.type=jks
Set the following properties in hdfs-site.xml. For example, if you are using Ambari, set the properties as follows:
dfs.https.enable=true
dfs.datanode.https.address=0.0.0.0:<DN_HTTPS_PORT>
dfs.https.port=<NN_HTTPS_PORT>
dfs.namenode.https-address=<NN>:<NN_HTTPS_PORT>
Set the following properties in mapred-site.xml. For example, if you are using Ambari, set the properties as follows:
mapreduce.jobhistory.http.policy=HTTPS_ONLY
mapreduce.jobhistory.webapp.https.address=<JHS>:<JHS_HTTPS_PORT>
Set the following properties in yarn-site.xml. For example, if you are using Ambari, set the properties as follows:
yarn.http.policy=HTTPS_ONLY
yarn.log.server.url=https://<JHS>:<JHS_HTTPS_PORT>/jobhistory/logs
yarn.resourcemanager.webapp.https.address=<RM>:<RM_HTTPS_PORT>
yarn.nodemanager.webapp.https.address=0.0.0.0:<NM_HTTPS_PORT>
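Once the NameNode, ResourceManager, NodeManagers, and JobHistory Server are restarted with these settings, curl is a quick way to confirm that the web UIs answer on HTTPS. The -k flag skips certificate verification because the certificates generated above are self-signed; the hosts and ports below are placeholders for your own <NN>, <RM>, and <JHS> values:
# Each command should print an HTTP status code (for example 200) if HTTPS is working.
curl -k -s -o /dev/null -w "%{http_code}\n" https://nn.hwx.com:50470/
curl -k -s -o /dev/null -w "%{http_code}\n" https://rm.hwx.com:8090/
curl -k -s -o /dev/null -w "%{http_code}\n" https://jhs.hwx.com:19890/jobhistory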
Enable Encrypted Shuffle by setting the following properties in mapred-site.xml. For example, if you are using Ambari, set the properties as follows:
mapreduce.shuffle.ssl.enabled=true
mapreduce.shuffle.ssl.file.buffer.size=65536
(The default buffer size is 65536.)
Enable Encrypted RPC by setting the following property in core-site.xml. For example, if you are using Ambari, set the property as follows:
hadoop.rpc.protection=privacy
(The 'authentication' and 'integrity' settings are also supported.)
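A simple sanity check after restarting the affected services is to confirm the effective setting and then make any HDFS RPC call to verify that clients can still connect:
hdfs getconf -confKey hadoop.rpc.protection   # should print: privacy
hadoop fs -ls /                               # any RPC call confirms the client still connects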
Enable Encrypted DTP by setting the following properties in hdfs-site.xml. For example, if you are using Ambari, set the properties as follows:
dfs.encrypt.data.transfer=true
dfs.encrypt.data.transfer.algorithm=3des
('rc4' is also supported.)
Note The Secondary NameNode is not supported with the HTTPS port. It can only be accessed via http://<SNN>:50090. WebHDFS, hsftp, and short-circuit read are not supported with SSL enabled.
Integrate Oozie and HCatalog by adding the following property to the oozie-hcatalog job.properties file. For example, if you are using Ambari, set the property as follows:
hadoop.rpc.protection=privacy
Note This property is in addition to any properties you must set for secure clusters.