Transparent Encryption Recommendations for MapReduce and YARN
MapReduce v1 stores both history and logs on local disks by default. Even if you do configure history to be stored on HDFS, the files are not renamed. Hence, no special configuration is required.
Recommendations for MapReduce v2 (YARN)
Make /user/history
a single encryption zone, because history files
are moved between the intermediate
and done
directories, and HDFS encryption does not allow moving encrypted files across
encryption zones. When you create the encryption zone, name the key
mapred-key
to take advantage of auto-generated KMS ACLs.
Steps
On a cluster with MRv2 (YARN) installed, create the /user/history
directory and make that an encryption zone.
If /user/history
already exists and is not empty:
- Create an empty
/user/history-tmp
directory. - Make
/user/history-tmp
an encryption zone. - DistCp all data from
/user/history
into/user/history-tmp
. - Remove
/user/history
and rename/user/history-tmp
to/user/history
.