Transparent Encryption Recommendations for Spark

There are various recommendations to consider when configuring HDFS Transparent Encryption for Spark.

Recommendations

  • By default, application event logs are stored at /user/spark/applicationHistory, which can be made into an encryption zone.
  • Spark also optionally caches its JAR file at /user/spark/share/lib (by default), but encrypting this directory is not required.

KMS ACL Configuration for Spark

In the KMS ACL, grant DECRYPT_EEK permission for the Spark key to the spark user and any groups that can submit Spark jobs:

<property>
  <name>key.acl.spark-key.DECRYPT_EEK</name>
  <value>spark spark-users</value>
</property>