HDFS Administration

Perform a Backup of the HDFS Metadata

Use the following procedure to back up HDFS metadata without affecting the availability of the NameNode:

  1. Make sure the standby NameNode checkpoints the namespace to an fsimage_* file once per hour.

  2. Deploy monitoring on both NameNodes to confirm that checkpoints are triggering regularly. This reduces the number of transactions that would be missing if you had to restore from a backup containing only fsimage files without subsequent edit logs. It is good practice to monitor this anyway, because large uncheckpointed edit logs can cause long delays after a NameNode restart while it replays those transactions.

  3. Back up the most recent “fsimage_*” and “fsimage_*.md5” from the standby NameNode periodically. Try to keep the latest version of the file on another machine in the cluster.

  4. Back up the VERSION file from the standby NameNode.
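
The backup steps above can be sketched as a small script run periodically on the standby NameNode. This is a minimal illustration, not a supported tool. It assumes that the metadata lives in a single "current" directory, that image files are named fsimage_ followed by a numeric transaction ID, and that the first token of the .md5 file is the hex digest. The function names, the two-hour staleness threshold, and the paths in the usage note are hypothetical.

```python
import hashlib
import re
import shutil
import time
from pathlib import Path

# Assumption: completed checkpoints are named fsimage_<numeric transaction ID>.
FSIMAGE_RE = re.compile(r"^fsimage_(\d+)$")


def latest_fsimage(current_dir: Path) -> Path:
    """Return the fsimage file with the highest transaction ID."""
    candidates = []
    for path in current_dir.iterdir():
        match = FSIMAGE_RE.match(path.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        raise FileNotFoundError(f"no fsimage_* file found in {current_dir}")
    return max(candidates, key=lambda item: item[0])[1]


def checkpoint_age_seconds(image: Path) -> float:
    """Seconds since the image was last modified (a rough checkpoint age)."""
    return time.time() - image.stat().st_mtime


def md5_matches(image: Path) -> bool:
    """Compare the image's MD5 with the digest stored in its .md5 file.

    Assumption: the first whitespace-separated token of the .md5 file
    is the hex digest.
    """
    md5_file = image.with_name(image.name + ".md5")
    expected = md5_file.read_text().split()[0].lower()
    digest = hashlib.md5()
    with image.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected


def backup_metadata(current_dir, backup_dir, max_age_seconds=2 * 3600):
    """Copy the newest fsimage, its .md5 file, and VERSION to backup_dir.

    Raises RuntimeError if the newest image is older than max_age_seconds
    (a hypothetical threshold) or fails its MD5 check.
    Returns the list of file names that were copied.
    """
    current_dir, backup_dir = Path(current_dir), Path(backup_dir)
    image = latest_fsimage(current_dir)
    if checkpoint_age_seconds(image) > max_age_seconds:
        raise RuntimeError(f"{image.name} is stale; checkpoints may not be running")
    if not md5_matches(image):
        raise RuntimeError(f"MD5 mismatch for {image.name}")

    backup_dir.mkdir(parents=True, exist_ok=True)
    sources = (
        image,
        image.with_name(image.name + ".md5"),
        current_dir / "VERSION",
    )
    copied = []
    for src in sources:
        shutil.copy2(src, backup_dir / src.name)  # copy2 keeps timestamps
        copied.append(src.name)
    return copied
```

A call such as backup_metadata("/path/to/namenode/current", "/path/to/backup") could then be scheduled from cron; both paths are placeholders. The staleness check doubles as a simple form of the checkpoint monitoring described in step 2, since a RuntimeError indicates that no recent checkpoint was written.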