Apache Spark Component Guide

Configuring Spark2 for Wire Encryption

Complete the following steps to configure Spark2 for wire encryption:

  1. On each node, create keystore files, certificates, and truststore files.

    1. Create a keystore file:

      keytool -genkey \
          -alias <host> \
          -keyalg RSA \
          -keysize 1024 \
          -dname CN=<host>,OU=hw,O=hw,L=paloalto,ST=ca,C=us \
          -keypass <KeyPassword> \
          -keystore <keystore_file> \
          -storepass <storePassword>
    2. Create a certificate by exporting it from the keystore:

      keytool -export \
          -alias <host> \
          -keystore <keystore_file> \
          -rfc -file <cert_file> \
          -storepass <storePassword>
    3. Create a truststore file:

      keytool -import \
          -noprompt \
          -alias <host> \
          -file <cert_file> \
          -keystore <truststore_file> \
          -storepass <truststorePassword>
  2. Create one truststore file that contains the public keys from all certificates.

    1. Log on to one host and import the certificate file for that host into the combined truststore:

      keytool -import \
          -noprompt \
          -alias <host> \
          -file <cert_file> \
          -keystore <all_jks> \
          -storepass <allTruststorePassword>
    2. Copy the <all_jks> file to each of the other nodes in your cluster in turn, and repeat the keytool command on each node to import that node's certificate.
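
As a rough illustration of step 2, the following sketch imports several certificates into one truststore in a single pass and then lists the resulting aliases. It is an alternative arrangement, not the procedure above: it assumes that the certificate files from every node have already been gathered into one directory and named <host>.cer. That directory layout and the function names import_all_certs and list_aliases are hypothetical; only the keytool options come from the commands shown earlier, plus keytool -list for verification.

```shell
# Sketch only: assumes bash and that keytool is on the PATH.
# Assumption (not from this guide): certificates are collected in one
# directory and named <host>.cer, so the file name can serve as the alias.

import_all_certs() {
  local cert_dir="$1" all_jks="$2" store_pass="$3"
  local cert host
  for cert in "$cert_dir"/*.cer; do
    [ -e "$cert" ] || continue          # glob matched nothing: skip
    host="$(basename "$cert" .cer)"     # alias = file name without .cer
    keytool -import \
        -noprompt \
        -alias "$host" \
        -file "$cert" \
        -keystore "$all_jks" \
        -storepass "$store_pass"
  done
}

# Print the entries in a truststore so the imported aliases can be checked.
list_aliases() {
  keytool -list -keystore "$1" -storepass "$2"
}
```

For example, import_all_certs /tmp/certs <all_jks> <allTruststorePassword> followed by list_aliases <all_jks> <allTruststorePassword> should show one entry per node. The finished file still needs to be present on every node, because spark.ssl.trustStore points to it.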

  3. Enable Spark2 authentication.

    1. Set spark.authenticate to true in the yarn-site.xml file:

      <property>
        <name>spark.authenticate</name>
        <value>true</value>
      </property>
    2. Set the following properties in the spark-defaults.conf file:

      spark.authenticate true
      spark.authenticate.enableSaslEncryption true
  4. Enable Spark2 SSL.

    Set the following properties in the spark-defaults.conf file:

    spark.ssl.enabled true
    spark.ssl.enabledAlgorithms TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA
    spark.ssl.keyPassword <KeyPassword>
    spark.ssl.keyStore <keystore_file>
    spark.ssl.keyStorePassword <storePassword>
    spark.ssl.protocol TLS
    spark.ssl.trustStore <all_jks>
    spark.ssl.trustStorePassword <allTruststorePassword>
  5. Enable HTTPS for the Spark2 UI.

    Set spark.ui.https.enabled to true in the spark-defaults.conf file:

    spark.ui.https.enabled true

    Note: In Spark2, enabling wire encryption also enables HTTPS on the History Server UI, for browsing job history data.

  6. (Optional) To enable on-disk block encryption, which applies to both shuffle and RDD blocks on disk, complete the following steps:

    1. Add the following properties to the spark-defaults.conf file for Spark2:

      spark.io.encryption.enabled true
      spark.io.encryption.keySizeBits 128
      spark.io.encryption.keygen.algorithm HmacSHA1
    2. Enable RPC encryption.

    For more information, see the Shuffle Behavior section of the Apache Spark Properties documentation and the Apache Spark Security documentation.
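
After editing spark-defaults.conf, it can be useful to confirm that none of the properties from steps 3 through 5 were left out. The following is a minimal sketch of such a check, not part of Spark or of this guide's tooling. The function name check_spark_conf is hypothetical, and the check assumes the whitespace-separated "property value" layout used in the examples above.

```shell
# Sketch only: assumes bash. Reports each wire-encryption property from
# steps 3 to 5 that is absent from the given spark-defaults.conf file.
# Returns 0 when every property is present, 1 otherwise.
check_spark_conf() {
  local conf="$1" missing=0 key
  for key in \
      spark.authenticate \
      spark.authenticate.enableSaslEncryption \
      spark.ssl.enabled \
      spark.ssl.enabledAlgorithms \
      spark.ssl.keyPassword \
      spark.ssl.keyStore \
      spark.ssl.keyStorePassword \
      spark.ssl.protocol \
      spark.ssl.trustStore \
      spark.ssl.trustStorePassword \
      spark.ui.https.enabled
  do
    # Exact match on the first field, so spark.ssl.enabled does not
    # match spark.ssl.enabledAlgorithms; a value must follow the name.
    if ! awk -v k="$key" '$1 == k && NF >= 2 { found = 1 } END { exit !found }' "$conf"; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_spark_conf with the path to your spark-defaults.conf file prints one "missing:" line per absent property. The check only looks for the presence of each property; it does not validate the values or test that the keystore and truststore files can be opened.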