Encrypting Data at Rest

Transparent Encryption Recommendations for MapReduce and YARN

MapReduce v1 stores both history and logs on local disks by default. Even if you do configure history to be stored on HDFS, the files are not renamed. Hence, no special configuration is required.

For MapReduce v2 (YARN), make /user/history a single encryption zone. History files are moved between the intermediate and done directories, and HDFS encryption does not allow moving encrypted files across encryption zones. When you create the encryption zone, name the key mapred-key to take advantage of auto-generated KMS ACLs.

On a cluster with MRv2 (YARN) installed, create the /user/history directory and make that an encryption zone.
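For a cluster where /user/history does not exist yet, the step above can be sketched with the standard Hadoop CLI. This is a minimal sketch under stated assumptions, not a definitive procedure: it assumes the KMS is already configured as the key provider, that mapred-key has not been created yet, and that the commands run as a user allowed to create keys and to administer HDFS. The run helper and the DRY_RUN variable are hypothetical conveniences introduced for this example; by default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: create /user/history as an encryption zone on a fresh MRv2 cluster.
# DRY_RUN is a hypothetical switch for this example: 1 (default) prints the
# commands, 0 executes them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

create_history_zone() {
  # Create the key in the KMS; the name follows the mapred-key recommendation.
  run hadoop key create mapred-key &&
  # An encryption zone can only be created on an empty directory.
  run hdfs dfs -mkdir -p /user/history &&
  # Turn the empty directory into an encryption zone backed by mapred-key.
  run hdfs crypto -createZone -keyName mapred-key -path /user/history
}

create_history_zone
```

Running the script with DRY_RUN=0 executes the three commands in order and stops at the first failure.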

If /user/history already exists and is not empty:

  1. Create an empty /user/history-tmp directory.
  2. Make /user/history-tmp an encryption zone.
  3. DistCp all data from /user/history into /user/history-tmp.
  4. Remove /user/history and rename /user/history-tmp to /user/history.
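The four migration steps above might look like the following with the standard Hadoop CLI. Treat it as a sketch under assumptions rather than the exact procedure: it assumes mapred-key already exists in the KMS, that nothing writes to /user/history while the copy runs, and that the commands run as the HDFS superuser. The -update and -skipcrccheck options are used because DistCp copies into an existing target directory here and because checksums of encrypted and unencrypted copies of a file differ. The run helper and DRY_RUN variable are hypothetical; by default the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch: move an existing, non-empty /user/history into an encryption zone.
# DRY_RUN is a hypothetical switch for this example: 1 (default) prints the
# commands, 0 executes them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

migrate_history() {
  # Step 1: create an empty temporary directory.
  run hdfs dfs -mkdir /user/history-tmp &&
  # Step 2: make it an encryption zone (assumes mapred-key already exists).
  run hdfs crypto -createZone -keyName mapred-key -path /user/history-tmp &&
  # Step 3: copy the contents; data is encrypted as it is written to the zone.
  run hadoop distcp -update -skipcrccheck /user/history /user/history-tmp &&
  # Step 4: remove the unencrypted original and rename the zone into place.
  run hdfs dfs -rm -r /user/history &&
  run hdfs dfs -mv /user/history-tmp /user/history
}

migrate_history
```

The commands are chained with &&, so a failed copy never reaches the removal of the original directory. After the rename, it is worth checking that ownership and permissions on /user/history match what the JobHistory Server expects.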
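
In the KMS ACLs, grant the DECRYPT_EEK permission on the MapReduce key (mapred-key) to the mapred and yarn users and to the hadoop group. In the value, the comma-separated list of users comes first, followed by a space and the list of groups: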
<property>
  <name>key.acl.mapred-key.DECRYPT_EEK</name>
  <value>mapred,yarn hadoop</value>
</property>