Known Issues in HDFS

Learn about the known issues in HDFS, the impact or changes to the functionality, and the workaround.

OPSAPS-55788: WebHDFS is always enabled. The Enable WebHDFS checkbox does not take effect.
Workaround: None.
Unsupported Features
The following HDFS features are currently not supported in Cloudera Data Platform:

Technical Service Bulletins

TSB 2023-666: Out of order HDFS snapshot deletion may delete renamed/moved files, which may result in data loss
Cloudera has discovered a bug in the Apache Hadoop Distributed File System (HDFS) snapshot implementation. Deleting an HDFS snapshot may incorrectly remove files in the .Trash directories or remove renamed files from the current file system state. This behavior is unexpected: deleting an HDFS snapshot should delete only the files stored in the specified snapshot, not data in the current state.
In the HDFS installation in which the bug was discovered, deleting one of the snapshots caused certain files to be moved to trash and some of the files in a .Trash directory to be deleted. Although it is clear that the conditions of the bug are (1) out-of-order snapshot deletion and (2) files moved to trash or other directories, we were unable to replicate the bug in other HDFS installations after executing similar test operations in a variety of sequences. We also did not observe any actual data loss in our tests. However, there is a remote possibility that this bug may lead to data loss.
Components affected
HDFS
Products affected
  • Cloudera Data Platform (CDP)
  • Cloudera Distribution including Apache Hadoop (CDH)
  • Hortonworks Data Platform (HDP)
Releases affected
  • CDP Private Cloud Base 7.1.7 SP2 CHF6 and earlier; 7.1.8 CHF7 and earlier
  • All versions of CDP Public Cloud
  • All versions of CDH
  • All versions of HDP
Users affected
Cloudera customers using the HDFS snapshot feature.
Impact
When files are removed incorrectly by deleting a snapshot, the standby NameNode checkpoint (or the NameNode checkpoint for non-High Availability clusters) fails because of the missing INode, and the NameNode shuts down with a “Missing INode” error message in the logs, as shown in the example below. This can result in data loss because current file data stored in HDFS can be removed incorrectly when deleting an HDFS snapshot.
2023-04-14 10:04:11,175 [FSImageSaver for /grid/1/dfs/namenode/current of type IMAGE_AND_EDITS] ERROR namenode.FSImage (FSImageFormatPBINode.java:serializeINodeDirectorySection(765)) - FSImageFormatPBINode#serializeINodeDirectorySection: Dangling child pointer found. Missing INode in inodeMap: id=154614; path=/user/foo/.Trash/Current/file; parent=/user/foo/.Trash/Current
If there are no failures in checkpointing, then there is no potential for data loss in the cluster.
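Because the checkpoint failure is the observable symptom, it can help to scan NameNode logs for the error shown above. The following is a minimal Python sketch, not a Cloudera-provided tool: the function name find_missing_inodes is hypothetical, and the regular expression assumes the message layout matches the example log line in this bulletin.

```python
import re
from typing import Dict, Iterable, List, Union

# Assumption: the message layout matches the example log line in this bulletin.
MISSING_INODE = re.compile(
    r"Missing INode in inodeMap: id=(?P<id>\d+); "
    r"path=(?P<path>.*?); parent=(?P<parent>.*)$"
)


def find_missing_inodes(lines: Iterable[str]) -> List[Dict[str, Union[int, str]]]:
    """Return one record per log line that reports a missing INode."""
    hits = []
    for line in lines:
        match = MISSING_INODE.search(line.rstrip("\n"))
        if match:
            hits.append(
                {
                    "id": int(match["id"]),
                    "path": match["path"],
                    "parent": match["parent"],
                }
            )
    return hits


# The example log line from this bulletin.
example_log = [
    "2023-04-14 10:04:11,175 [FSImageSaver for /grid/1/dfs/namenode/current "
    "of type IMAGE_AND_EDITS] ERROR namenode.FSImage "
    "(FSImageFormatPBINode.java:serializeINodeDirectorySection(765)) - "
    "FSImageFormatPBINode#serializeINodeDirectorySection: Dangling child "
    "pointer found. Missing INode in inodeMap: id=154614; "
    "path=/user/foo/.Trash/Current/file; parent=/user/foo/.Trash/Current"
]

for hit in find_missing_inodes(example_log):
    print(hit["id"], hit["path"])
```

To scan a real log, pass an open file object instead of the example list. The log file location depends on the cluster configuration and is not specified here.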
Severity
High
Action required
  • Risk Avoidance:
    • When deleting multiple snapshots, delete them in order, from the earliest to the latest. This reduces the risk of data loss, because experience has shown that deleting the earliest snapshot in the file system does not cause data loss.
    • To determine the snapshot creation order, use the hdfs lsSnapshot <snapshotDir> command, and then sort the output by the snapshot ID.
    • If snapshot A is created earlier than snapshot B, the snapshot ID of A is smaller than the snapshot ID of B. The following is the output format of lsSnapshot:
      <permission> <replication> <owner> <group> <length> <modification_time> <snapshot_id> <deletion_status> <path>
  • Upgrade (Highly Recommended)
  • Hotfixes (if any)

    CDH or HDP customers should upgrade to one of the fixed CDP releases mentioned above or contact support to request a hotfix for HDFS-16972, HDFS-16975, and HDFS-17045.

  • Knowledge article

    For the latest update on this issue, see the corresponding Knowledge article: TSB 2023-666: Out of order HDFS snapshot deletion may delete renamed/moved files, which may result in data loss
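The risk-avoidance guidance above (delete snapshots from earliest to latest, ordered by snapshot ID) can be sketched as a small Python helper that sorts lsSnapshot output. This is an illustration under stated assumptions, not an official procedure: the helper names and the sample lines are hypothetical, fields are assumed to be whitespace-separated in the documented order, and snapshot paths are assumed to contain no whitespace, so the last three fields are read as snapshot ID, deletion status, and path.

```python
from typing import List, NamedTuple


class SnapshotEntry(NamedTuple):
    snapshot_id: int
    deletion_status: str
    path: str


def parse_ls_snapshot(output: str) -> List[SnapshotEntry]:
    """Parse lsSnapshot output and return entries sorted by snapshot ID.

    Assumption: the last three whitespace-separated fields are
    <snapshot_id> <deletion_status> <path>, and the path has no spaces.
    """
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 9:
            continue  # not a snapshot line (blank line, header, and so on)
        snapshot_id, status, path = fields[-3], fields[-2], fields[-1]
        if not snapshot_id.isdigit():
            continue  # unexpected layout; skip rather than guess
        entries.append(SnapshotEntry(int(snapshot_id), status, path))
    # Sort numerically: a smaller ID means the snapshot was created earlier.
    return sorted(entries, key=lambda entry: entry.snapshot_id)


def deletion_order(output: str) -> List[str]:
    """Return snapshot paths in the order they should be deleted."""
    return [entry.path for entry in parse_ls_snapshot(output)]


# Hypothetical sample lines that follow the documented field order.
sample_output = """\
drwxr-xr-x 0 hdfs supergroup 0 2023-04-12 09:00 12 ACTIVE /data/.snapshot/s12
drwxr-xr-x 0 hdfs supergroup 0 2023-04-10 09:00 3 ACTIVE /data/.snapshot/s3
"""

for path in deletion_order(sample_output):
    print(path)
```

The IDs are compared as integers rather than strings, so that snapshot 3 sorts before snapshot 12. Review the resulting order before deleting anything.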