Hadoop Security Guide

Read and Write Files from/to an Encryption Zone

Clients and HDFS applications with sufficient HDFS and Ranger KMS permissions can read and write files from/to an encryption zone.

Overview of the client write process:

  1. The client writes to the encryption zone.

  2. The NameNode checks to make sure that the client has sufficient write access permissions. If so, the NameNode asks Ranger KMS to create a file-level key, encrypted with the encryption zone master key.

  3. The NameNode stores the file-level encrypted data encryption key (EDEK) generated by Ranger KMS as part of the file's metadata, and returns the EDEK to the client.

  4. The client asks Ranger KMS to decrypt the EDEK to obtain the data encryption key (DEK), and uses the DEK to write encrypted data. Before decrypting the EDEK and returning the DEK to the client, Ranger KMS checks that the user has permission to do so.
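
The write steps above can be sketched as a small, self-contained simulation. This is a minimal illustration only, not the HDFS or Ranger KMS implementation: the ToyKMS class, the write_file function, and the keystream_xor cipher are hypothetical names introduced here, the cipher is an insecure stand-in for a real algorithm such as AES-CTR, and the HDFS permission check in step 2 is left out.

```python
import hashlib
import os


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). NOT secure; stand-in for AES-CTR.

    XOR is symmetric, so the same call encrypts and decrypts.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class ToyKMS:
    """Hypothetical stand-in for Ranger KMS; zone keys never leave this object."""

    def __init__(self):
        self._zone_keys = {}    # zone key name -> key bytes
        self._decrypt_acl = {}  # zone key name -> users allowed to decrypt EDEKs

    def create_zone_key(self, name, allowed_users):
        self._zone_keys[name] = os.urandom(32)
        self._decrypt_acl[name] = set(allowed_users)

    def generate_edek(self, key_name):
        """Step 2: create a file-level DEK and return only its encrypted form."""
        dek = os.urandom(32)
        iv = os.urandom(16)
        edek = keystream_xor(self._zone_keys[key_name], iv, dek)
        return {"key_name": key_name, "iv": iv, "edek": edek}

    def decrypt_edek(self, user, edek_info):
        """Step 4: check the user's permission, then return the DEK."""
        key_name = edek_info["key_name"]
        if user not in self._decrypt_acl[key_name]:
            raise PermissionError(f"{user} may not decrypt EDEKs for {key_name}")
        return keystream_xor(self._zone_keys[key_name], edek_info["iv"], edek_info["edek"])


def write_file(kms, namenode_metadata, user, path, plaintext, key_name="key1"):
    edek_info = kms.generate_edek(key_name)   # step 2: NameNode asks the KMS
    namenode_metadata[path] = edek_info       # step 3: EDEK stored as file metadata
    dek = kms.decrypt_edek(user, edek_info)   # step 4: client obtains the DEK
    nonce = os.urandom(16)
    return nonce, keystream_xor(dek, nonce, plaintext)  # only ciphertext is stored


kms = ToyKMS()
kms.create_zone_key("key1", allowed_users={"hive"})
metadata = {}
nonce, stored = write_file(kms, metadata, "hive", "/zone_encr/web.log", b"GET /index.html 200")
print(len(stored), "encrypted bytes written; metadata holds an EDEK, never the DEK")
```

In this sketch the metadata dictionary plays the role of the NameNode: it only ever sees the EDEK, so reading the file requires a second round trip to the KMS.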

Overview of the client read process:

  1. The client issues a read request for a file in an encryption zone.

  2. The NameNode checks to make sure that the client has sufficient read access permissions. If so, the NameNode returns the file's EDEK and the encryption zone key version that was used to encrypt the EDEK.

  3. The client asks Ranger KMS to decrypt the EDEK. Ranger KMS checks that the end user has permission to decrypt the EDEK.

  4. Ranger KMS decrypts and returns the (unencrypted) data encryption key (DEK).

  5. The client uses the DEK to decrypt and read the file.

The preceding steps take place through internal interactions between the DFSClient, the NameNode, and Ranger KMS.
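
The read steps can be sketched in the same spirit. This is a toy model under stated assumptions, not the DFSClient code path: the dictionaries standing in for NameNode, DataNode, and KMS state, the function names, and the xor_stream cipher (an insecure stand-in for AES-CTR) are all hypothetical, and the key1@0 key version label is illustrative.

```python
import hashlib
import os


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher (SHA-256 keystream XOR); a stand-in for AES-CTR, not secure."""
    pad = b""
    counter = 0
    while len(pad) < len(data):
        pad += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ p for d, p in zip(data, pad))


# Hypothetical KMS state: zone key versions and users allowed to decrypt EDEKs.
ZONE_KEY_VERSIONS = {"key1@0": os.urandom(32)}
DECRYPT_ACL = {"key1": {"hive"}}

# Hypothetical NameNode state: readers, EDEK, and zone key version per file.
_dek = os.urandom(32)
_edek_iv = os.urandom(16)
NAMENODE_FILES = {
    "/zone_encr/web.log": {
        "readers": {"hive"},
        "key_version": "key1@0",
        "edek_iv": _edek_iv,
        "edek": xor_stream(ZONE_KEY_VERSIONS["key1@0"], _edek_iv, _dek),
    }
}

# Hypothetical DataNode contents: only ciphertext is stored.
_nonce = os.urandom(16)
DATANODE_BLOCKS = {
    "/zone_encr/web.log": (_nonce, xor_stream(_dek, _nonce, b"GET /index.html 200")),
}


def namenode_open(user, path):
    """Step 2: check read permission, then return the EDEK and key version."""
    entry = NAMENODE_FILES[path]
    if user not in entry["readers"]:
        raise PermissionError(f"Permission denied: user={user}, path={path}")
    return entry


def kms_decrypt_edek(user, entry):
    """Steps 3-4: check the end user's key permission, then return the DEK."""
    key_name = entry["key_version"].split("@")[0]
    if user not in DECRYPT_ACL[key_name]:
        raise PermissionError(f"{user} is not allowed to decrypt EDEKs of {key_name}")
    zone_key = ZONE_KEY_VERSIONS[entry["key_version"]]
    return xor_stream(zone_key, entry["edek_iv"], entry["edek"])


def client_read(user, path):
    entry = namenode_open(user, path)          # steps 1-2
    dek = kms_decrypt_edek(user, entry)        # steps 3-4
    nonce, ciphertext = DATANODE_BLOCKS[path]
    return xor_stream(dek, nonce, ciphertext)  # step 5


print(client_read("hive", "/zone_encr/web.log"))
```

The sketch makes the two independent checks visible: a user must pass the file system permission check at the NameNode and the key permission check at the KMS before any plaintext can be recovered.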

In the following example, the /zone_encr directory is an encryption zone in HDFS.

To verify this, use the crypto -listZones command (as an HDFS administrator). This command lists the root path and the zone key for the encryption zone. For example:

# hdfs crypto -listZones
/zone_encr  key1

Additionally, the /zone_encr directory has been set up for read/write access by the hive user:

# hdfs dfs -ls /
 …
drwxr-x---   - hive   hive            0 2015-01-11 23:12 /zone_encr

The hive user can, therefore, write data to the directory.

The following example uses the copyFromLocal command to copy a local file into HDFS.

[hive@blue ~]# hdfs dfs -copyFromLocal web.log /zone_encr
[hive@blue ~]# hdfs dfs -ls /zone_encr
Found 1 items
-rw-r--r--   1 hive hive       1310 2015-01-11 23:28 /zone_encr/web.log

The hive user can read data from the directory, and can verify that the file loaded into HDFS is readable in its unencrypted form.

[hive@blue ~]# hdfs dfs -copyToLocal /zone_encr/web.log read.log
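[hive@blue ~]# diff web.log read.log

The diff command produces no output, which indicates that the original file and the copy read back from HDFS are identical.

Note: For more information about accessing encrypted files from Hive and other components, see Configuring HDP Services for HDFS Encryption.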

Users without access to KMS keys can see file names (via the -ls command), but they cannot write data to or read data from the encryption zone. For example, the hdfs user lacks sufficient permissions, and cannot access the data in /zone_encr:

[hdfs@blue ~]# hdfs dfs -copyFromLocal install.log /zone_encr
copyFromLocal: Permission denied: user=hdfs, access=EXECUTE, inode="/zone_encr":hive:hive:drwxr-x---

[hdfs@blue ~]# hdfs dfs -copyToLocal /zone_encr/web.log read.log
copyToLocal: Permission denied: user=hdfs, access=EXECUTE, inode="/zone_encr":hive:hive:drwxr-x---
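
The error messages above report that the EXECUTE permission is missing on the /zone_encr inode, whose mode string is drwxr-x---. The following sketch shows how such a mode string maps to that decision. It is an illustration only: the can_traverse function is a hypothetical helper, it assumes the user is subject to ordinary owner/group/other permission checks, and it ignores superuser handling, ACLs, and the sticky bit.

```python
def can_traverse(mode: str, owner: str, group: str, user: str, user_groups: set) -> bool:
    """Return True if `user` has the EXECUTE bit on a directory with this mode string."""
    perms = mode[1:]  # drop the leading type character, for example 'd'
    if user == owner:
        triplet = perms[0:3]   # owner bits
    elif group in user_groups:
        triplet = perms[3:6]   # group bits
    else:
        triplet = perms[6:9]   # bits for everyone else
    return triplet[2] == "x"


# Values taken from the listing and error messages in this section; the group
# memberships passed in are assumptions made for the example.
print(can_traverse("drwxr-x---", "hive", "hive", "hive", {"hive"}))  # True
print(can_traverse("drwxr-x---", "hive", "hive", "hdfs", {"hdfs"}))  # False
```

Under these assumptions, a user who is neither the owner nor a member of the hive group falls into the last triplet (---), which grants no EXECUTE permission, so the directory cannot be entered.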