Backing up NameNode metadata

You must back up the VERSION file and then back up the NameNode metadata.

  1. Make a single backup of the VERSION file.
    The VERSION file does not change, so it does not need to be backed up regularly, but keep a copy because it contains the clusterID along with other details.
  2. Use the following command to back up the NameNode metadata.
    The command automatically determines the active NameNode, retrieves the current fsimage, and places it in the backup_dir that you specify.
    hdfs dfsadmin -fetchImage backup_dir 
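
The two steps above can be sketched as a pair of shell functions. This is only an illustration: the function names and the dated subdirectory layout are assumptions, and the sketch assumes the VERSION file sits under current/ inside one of the directories configured in dfs.namenode.name.dir.

```shell
#!/usr/bin/env bash
# Illustrative sketch; function names and directory layout are hypothetical.

# Step 1: copy the VERSION file from a NameNode metadata directory.
# name_dir is assumed to be one of the dfs.namenode.name.dir directories.
backup_version_file() {
  local name_dir="$1" backup_dir="$2"
  local src="${name_dir}/current/VERSION"
  if [ ! -f "$src" ]; then
    echo "VERSION file not found at $src" >&2
    return 1
  fi
  # -p preserves timestamps and permissions of the original file.
  mkdir -p "$backup_dir" && cp -p "$src" "${backup_dir}/VERSION"
}

# Step 2: fetch the current fsimage into a dated subdirectory so that
# repeated backups do not overwrite each other.
backup_fsimage() {
  local backup_root="$1"
  local target
  target="${backup_root}/fsimage-$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$target" && hdfs dfsadmin -fetchImage "$target"
}
```

For example, with the hypothetical paths /data/dfs/nn and /backups/namenode, you would run backup_version_file /data/dfs/nn /backups/namenode once, and schedule backup_fsimage /backups/namenode to run regularly.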

On startup, the NameNode process reads the fsimage file and loads it into memory. If the JournalNodes are up and running and edit files are present, any edits newer than the fsimage are also applied. If the JournalNodes are unavailable, those newer edits cannot be applied, so changes made after the fsimage was written may be lost.
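
To see which point in time a backup represents, you can look at the transaction ID in the fsimage file name. The following is a minimal sketch, assuming the fetched file keeps the usual fsimage_<transaction ID> naming with a zero-padded ID; the function name is hypothetical. Edits with a higher transaction ID are the ones that would have to come from the JournalNodes.

```shell
# Print the transaction ID encoded in the newest fsimage file name in a directory.
# Assumes names of the form fsimage_<zero-padded txid>; checksum files are skipped.
latest_fsimage_txid() {
  local dir="$1" newest
  # Zero-padded IDs have equal width, so a plain lexical sort orders them.
  newest=$(ls "$dir" | grep -E '^fsimage_[0-9]+$' | sort | tail -n 1)
  [ -n "$newest" ] || return 1
  echo "${newest#fsimage_}"
}
```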