Troubleshooting HDFS Encryption
This topic describes issues you might encounter when encrypting HDFS files and directories, along with their workarounds.
- KMS server jute buffer exception
- Retrieval of encryption keys fails
- DistCp between unencrypted and encrypted locations fails
- (CDH 5.6 and lower) Cannot move encrypted files to trash
- NameNode - KMS communication fails after long periods of inactivity
- HDFS Trash Behaviour with Transparent Encryption Enabled
KMS server jute buffer exception
You see the following error when the jute buffer of the KMS (acting, for example, as a ZooKeeper client) is too small to hold all the tokens:
2017-01-31 21:23:56,416 WARN org.apache.zookeeper.ClientCnxn: Session 0x259f5fb3c1000fb for server, unexpected error, closing socket connection and attempting reconnect Packet len4196356 is out of range!
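To gauge how far the buffer falls short, the packet length can be pulled out of the log message and compared with the configured limit. The following is a minimal sketch, not part of the product: the function name is hypothetical, and the default limit of 4096 * 1024 bytes is an assumption about the ZooKeeper client in use, so substitute your own jute.maxbuffer value if it differs.

```python
import re

# Assumption: the ZooKeeper client limit is 4096 * 1024 bytes.
# Replace this with the jute.maxbuffer value configured in your deployment.
ASSUMED_JUTE_MAXBUFFER = 4096 * 1024


def packet_overflow(log_line, limit=ASSUMED_JUTE_MAXBUFFER):
    """Return the number of bytes by which the reported packet exceeds limit.

    Returns None when the line does not contain a "Packet len" message.
    """
    match = re.search(r"Packet len\s*(\d+) is out of range", log_line)
    if match is None:
        return None
    return int(match.group(1)) - limit


if __name__ == "__main__":
    line = "unexpected error, closing socket connection Packet len4196356 is out of range!"
    print(packet_overflow(line))
```

A positive result means the packet was larger than the assumed limit by that many bytes.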
Retrieval of encryption keys fails
DistCp between unencrypted and encrypted locations fails
(CDH 5.6 and lower) Cannot move encrypted files to trash
NameNode - KMS communication fails after long periods of inactivity
Encrypted files and encryption zones cannot be created if a long period of time (by default, 20 hours) has passed since the KMS and the NameNode last communicated. There are two possible workarounds:
- You can increase the KMS authentication token validity period to a very high number. Since the default value is 10 hours, this bug will only be encountered after 20 hours of no communication between the NameNode and the KMS. Add the following property to the kms-site.xml Safety Valve:
  <property>
    <name>hadoop.kms.authentication.token.validity</name>
    <value>SOME VERY HIGH NUMBER</value>
  </property>
- You can switch the KMS signature secret provider to the string secret provider by adding the following property to the kms-site.xml Safety Valve:
  <property>
    <name>hadoop.kms.authentication.signature.secret</name>
    <value>SOME VERY SECRET STRING</value>
  </property>
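For the first workaround, the property fragment can be generated rather than typed by hand. The sketch below is illustrative only: the helper names are hypothetical, it assumes the validity value is expressed in seconds (so that the 10-hour default corresponds to 36000), and the one-year period is an arbitrary example rather than a recommended setting.

```python
import xml.etree.ElementTree as ET


def hours_to_seconds(hours):
    """Convert hours to whole seconds (assumes the property is in seconds)."""
    return int(hours * 3600)


def kms_property(name, value):
    """Build a <property> element as a string for the kms-site.xml Safety Valve."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # Illustrative value only: one year expressed in seconds.
    validity = hours_to_seconds(24 * 365)
    print(kms_property("hadoop.kms.authentication.token.validity", validity))
```

Paste the printed fragment into the Safety Valve field after checking the unit against the documentation for your release.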
HDFS Trash Behaviour with Transparent Encryption Enabled
The Hadoop trash feature helps prevent accidental deletion of files and directories. When you delete a file in HDFS, the file is not immediately removed from HDFS. Deleted files are first moved to the /user/<username>/.Trash/Current directory, with their original filesystem path preserved. After a user-configurable period of time (fs.trash.interval), a process known as trash checkpointing renames the Current directory to the current timestamp, that is, /user/<username>/.Trash/<timestamp>. The checkpointing process also checks the rest of the .Trash directory for any existing timestamp directories and removes them from HDFS permanently. You can restore files and directories in the trash simply by moving them to a location outside the .Trash directory.
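The path handling described above can be modeled with a few string operations. This is a sketch of the naming scheme only, not of HDFS itself: the function names are hypothetical, and the timestamp string in the example is a made-up placeholder rather than the format HDFS necessarily uses.

```python
import posixpath


def trash_location(username, path):
    """Where a deleted path lands: under Current, original path preserved."""
    return posixpath.join("/user", username, ".Trash", "Current", path.lstrip("/"))


def after_checkpoint(trash_path, timestamp):
    """Path after checkpointing renames Current to a timestamp directory."""
    return trash_path.replace("/.Trash/Current/", "/.Trash/" + timestamp + "/", 1)


if __name__ == "__main__":
    deleted = trash_location("alice", "/data/logs/app.log")
    print(deleted)
    # "170131212356" is a placeholder timestamp chosen for illustration.
    print(after_checkpoint(deleted, "170131212356"))
```

Restoring a file amounts to moving it from either of these locations back to a path outside .Trash.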
Trash Behaviour with HDFS Transparent Encryption Enabled
Starting with CDH 5.7, you can delete files or directories that are part of an HDFS encryption zone. As is evident from the procedure described above, moving and renaming files or directories is an important part of trash handling in HDFS. However, currently HDFS transparent encryption only supports renames within an encryption zone. To accommodate this, HDFS creates a local .Trash directory every time a new encryption zone is created. For example, when you create an encryption zone, enc_zone, HDFS will also create the /enc_zone/.Trash/ subdirectory. Files deleted from enc_zone are moved to /enc_zone/.Trash/<username>/Current/. After the checkpoint, the Current directory is renamed to the current timestamp, /enc_zone/.Trash/<username>/<timestamp>.
If you delete the entire encryption zone, it will be moved to the .Trash directory under the user's home directory, /user/<username>/.Trash/Current/enc_zone. Trash checkpointing will occur only after the entire zone has been moved to /user/<username>/.Trash. However, if the user's home directory is already part of an encryption zone, then attempting to delete an encryption zone will fail because you cannot move or rename directories across encryption zones.
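The rules for encryption zones can be summarized as a small decision function. The sketch below models only the path selection described in this section. The function names are hypothetical, and placing the full original path beneath Current inside the zone-local trash is an assumption carried over from the non-encrypted case.

```python
import posixpath


def in_zone(path, zone):
    """True if path is the zone root or lies beneath it."""
    path = posixpath.normpath(path)
    zone = posixpath.normpath(zone)
    return path == zone or path.startswith(zone.rstrip("/") + "/")


def trash_target(path, username, zones):
    """Return the trash location for a deleted path, given encryption zones.

    Raises ValueError when a whole zone is deleted but the user's home
    directory is itself inside an encryption zone.
    """
    path = posixpath.normpath(path)
    home = posixpath.join("/user", username)
    home_trash = posixpath.join(home, ".Trash", "Current", path.lstrip("/"))
    for zone in (posixpath.normpath(z) for z in zones):
        if path == zone:
            # Deleting the zone itself: it must move to the home trash.
            if any(in_zone(home, z) for z in zones):
                raise ValueError("cannot move a zone across encryption zones")
            return home_trash
        if in_zone(path, zone):
            # Deleting inside a zone: stay within the zone-local trash.
            return posixpath.join(zone, ".Trash", username, "Current", path.lstrip("/"))
    return home_trash


if __name__ == "__main__":
    print(trash_target("/enc_zone/a/b.txt", "alice", ["/enc_zone"]))
    print(trash_target("/enc_zone", "alice", ["/enc_zone"]))
```

Keeping deletions inside the zone-local trash is what lets the rename stay within a single encryption zone.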