Transparent Encryption Recommendations for MapReduce and YARN
MapReduce v1 stores both history and logs on local disks by default. Even if you do configure history to be stored on HDFS, the files are not renamed. Hence, no special configuration is required.
Recommendations for MapReduce v2 (YARN)
Make /user/history a single encryption zone, because history files
are moved between the intermediate and done
directories, and HDFS encryption does not allow moving encrypted files across
encryption zones. When you create the encryption zone, name the key
mapred-key to take advantage of auto-generated KMS ACLs.
- Go to the Ranger Admin UI.
- Log in using keyadmin role credentials.
- Create a new policy for the key resource name, such as mapred-key, if a policy does not already exist.
- Grant DECRYPT_EEK permission to all end users (mapred and yarn) for that policy.
- Save the changes.
Steps
On a cluster with MRv2 (YARN) installed, create the /user/history
directory and make that an encryption zone.
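The step above could be scripted roughly as follows. This is a minimal sketch, not a definitive procedure: the RUN dry-run switch and the create_history_zone function name are illustrative, and it assumes the key is created by a user with key-admin rights and the zone by an HDFS superuser.

```shell
#!/bin/sh
# Dry run by default: every command is printed instead of executed.
# Export RUN="" (empty) to actually run the commands on a cluster.
RUN=${RUN-echo}

# Hypothetical helper: create the key, the directory, and the zone.
create_history_zone() {
    # Create the encryption key; this fails if mapred-key already exists.
    $RUN hadoop key create mapred-key || return 1
    # The directory must exist and be empty before it can become a zone.
    $RUN hdfs dfs -mkdir -p /user/history || return 1
    # Turn the empty directory into an encryption zone backed by mapred-key.
    $RUN hdfs crypto -createZone -keyName mapred-key -path /user/history || return 1
    # List the zones to confirm that /user/history is among them.
    $RUN hdfs crypto -listZones
}

create_history_zone
```

Run as written, the script only prints the commands, which allows a review before anything on the cluster changes.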
If /user/history already exists and is not empty:
- Create an empty /user/history-tmp directory.
- Make /user/history-tmp an encryption zone.
- DistCp all data from /user/history into /user/history-tmp.
- Remove /user/history and rename /user/history-tmp to /user/history.
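The migration steps above might look like this on the command line. It is a sketch under stated assumptions: mapred-key already exists, the commands run as an HDFS superuser, and the RUN switch and function name are illustrative. The -update and -skipcrccheck flags are included because file checksums differ between unencrypted and encrypted locations, so the default DistCp checksum comparison would otherwise fail.

```shell
#!/bin/sh
# Dry run by default: every command is printed instead of executed.
# Export RUN="" (empty) to actually run the commands on a cluster.
RUN=${RUN-echo}

# Hypothetical helper: move existing history data into an encryption zone.
migrate_history_zone() {
    # 1. Create an empty temporary directory.
    $RUN hdfs dfs -mkdir /user/history-tmp || return 1
    # 2. Make the temporary directory an encryption zone (key assumed to exist).
    $RUN hdfs crypto -createZone -keyName mapred-key -path /user/history-tmp || return 1
    # 3. Copy the contents; -update copies into the existing target directory,
    #    -skipcrccheck skips the checksum comparison that cannot match here.
    $RUN hadoop distcp -update -skipcrccheck /user/history /user/history-tmp || return 1
    # 4. Remove the unencrypted original, then rename the zone into place.
    $RUN hdfs dfs -rm -r /user/history || return 1
    $RUN hdfs dfs -mv /user/history-tmp /user/history
}

migrate_history_zone
```

Each step stops the sequence on failure, so the original /user/history is not removed unless the copy has completed.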
