Known Issues in HDFS

Learn about the known issues in HDFS, the impact or changes to the functionality, and the workaround.

CDPD-28459: After performing an upgrade rollback from CDP 7.1.7 to CDH6, you may see the following error when restarting the DataNodes: ERROR datanode.DataNode: Exception in secureMain java.io.IOException: The path component: '/var/run/hdfs-sockets' in '/var/run/hdfs-sockets/dn' has permissions 0755 uid 39998 and gid 1006. It is not protected because it is owned by a user who is not root and not the effective user: '0'.

Run the command given in the error message, "chown root /var/run/hdfs-sockets", so that the directory is owned by root. After this, the DataNode restarts successfully.
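The ownership condition named in the error can be checked before you restart the DataNode. The following is a minimal Python sketch. It assumes only what the error message states: a path component counts as protected when it is owned by root (uid 0) or by the effective user. The function names are hypothetical, and the actual DataNode check may test more than ownership.

```python
import os

# Path component named in the error message.
SOCKET_DIR = "/var/run/hdfs-sockets"


def owner_is_trusted(owner_uid: int, effective_uid: int) -> bool:
    """Assumed rule from the error text: owner must be root or the effective user."""
    return owner_uid == 0 or owner_uid == effective_uid


def check_socket_dir(path: str = SOCKET_DIR) -> bool:
    """Return True when `path` passes the ownership rule for the current process."""
    owner_uid = os.stat(path).st_uid
    return owner_is_trusted(owner_uid, os.geteuid())


if __name__ == "__main__":
    # Only report; changing ownership is left to the administrator.
    if os.path.isdir(SOCKET_DIR) and not check_socket_dir():
        print(f"{SOCKET_DIR} fails the ownership check; run: chown root {SOCKET_DIR}")
```

With the values from the error above (owner uid 39998, effective user 0), `owner_is_trusted` returns False, which matches the reported failure.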

CDPD-28390: Rolling restart of the HDFS JournalNodes may time out on Ubuntu 20.

If the restart operation times out, you can manually stop and restart the NameNode and JournalNode services one by one.

OPSAPS-60832: When decommissioning of a DataNode runs for a long time and the decommission monitor's Kerberos ticket expires, the ticket is not auto-renewed. The decommission does not complete in Cloudera Manager because the decommission monitor fails to fetch the state of the DataNode after the Kerberos ticket expires.

You can fetch the decommission state of the DataNode from the command line by running hdfs dfsadmin -report.
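To pull only the decommission state out of the report, the output can be filtered. The following is a minimal Python sketch. It assumes the report prints one "Hostname:" line followed by one "Decommission Status :" line per DataNode; verify this against the output of your own cluster. The function names are hypothetical.

```python
import subprocess


def parse_decommission_status(report: str) -> dict:
    """Map each DataNode hostname to its decommission status string."""
    statuses = {}
    hostname = None
    for line in report.splitlines():
        # Split on the first colon only, so values such as "ip:port" stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Hostname":
            hostname = value
        elif key == "Decommission Status" and hostname is not None:
            statuses[hostname] = value
            hostname = None
    return statuses


def fetch_decommission_status() -> dict:
    """Run `hdfs dfsadmin -report` and parse its standard output."""
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        capture_output=True, text=True, check=True,
    )
    return parse_decommission_status(result.stdout)
```

Calling `fetch_decommission_status()` requires the hdfs client on the PATH and, on a Kerberos-enabled cluster, a valid ticket for a user allowed to run dfsadmin commands.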

Unsupported Features

The following HDFS features are currently not supported in Cloudera Data Platform:

ACLs for the NFS gateway (HADOOP-11004)
Aliyun Cloud Connector (HADOOP-12756)
Allow HDFS block replicas to be provided by an external storage system (HDFS-9806)
Consistent standby serving reads (HDFS-12943)
Cost-Based RPC FairCallQueue (HDFS-14403)
HDFS Router Based Federation (HDFS-10467)
More than two NameNodes (HDFS-6440)
NameNode Federation (HDFS-1052)
NameNode Port-based Selective Encryption (HDFS-13541)
Non-Volatile Storage Class Memory (SCM) in HDFS Cache Directives (HDFS-13762)
OpenStack Swift (HADOOP-8545)
SFTP FileSystem (HADOOP-5732)
Storage policy satisfier (HDFS-10285)

Technical Service Bulletins

TSB 2022-604: GetContentSummary call performance issues with Apache Ranger HDFS plugin: With Apache Ranger enabled on the NameNode, getContentSummary calls in the Apache Hadoop Distributed File System (HDFS) lock for multiple seconds and can cause NameNode failover.
Knowledge article: For the latest update on this issue, see the corresponding Knowledge article: TSB 2022-604: GetContentSummary call performance issues with Apache Ranger HDFS plugin

TSB 2023-666: Out of order HDFS snapshot deletion may delete renamed/moved files, which may result in data loss: Cloudera has discovered a bug in the Apache Hadoop Distributed File System (HDFS) snapshot implementation. Deleting an HDFS snapshot may incorrectly remove files in the .Trash directories or remove renamed files from the current file system state. This behavior is unexpected because deleting an HDFS snapshot should delete only the files stored in the specified snapshot, not data in the current state.
In the particular HDFS installation in which the bug was discovered, deleting one of the snapshots caused certain files to be moved to trash and some of the files in a .Trash directory to be deleted. Although it is clear that the conditions of the bug are (1) out-of-order snapshot deletion and (2) files moved to trash or other directories, we were unable to replicate the bug in other HDFS installations after executing similar test operations with a variety of different sequences. We also did not observe any actual data loss in our tests. However, there is a remote possibility that this bug may lead to data loss.
Knowledge article: For the latest update on this issue, see the corresponding Knowledge article: TSB 2023-666: Out of order HDFS snapshot deletion may delete renamed/moved files, which may result in data loss