Backing Up and Restoring NameNode Metadata

This topic describes the steps for backing up and restoring NameNode metadata.

Backing Up NameNode Metadata

This section describes how to back up NameNode metadata.
  1. Make a one-time backup of the VERSION file. The file does not change, so it does not need to be backed up regularly, but it is essential for a restore because it contains the clusterID, along with other details.
  2. Use the following command to back up the NameNode metadata. It automatically determines the active NameNode, retrieves the current fsimage, and saves it in the specified backup_dir.
    $ hdfs dfsadmin -fetchImage backup_dir 

On startup, the NameNode process reads the fsimage file and loads it into memory. If the JournalNodes are up and running, and there are edit files present, any edits newer than the fsimage are also applied. If the JournalNodes are unavailable, those newer edits cannot be applied, so any changes made after the fsimage was written may be lost.
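
The two backup steps above can be combined into a small script. The following is a minimal sketch, not a definitive procedure: the function name backup_nn_metadata, the timestamped directory layout, and the example paths are assumptions made for illustration. Only the hdfs dfsadmin -fetchImage command comes from the steps above.

```shell
#!/usr/bin/env bash
# Sketch: back up VERSION once and fetch the current fsimage.
# The function name and the directory layout are hypothetical.

backup_nn_metadata() {
    local nn_current_dir="$1"   # directory holding VERSION, e.g. /dfs/nn/current
    local backup_root="$2"      # assumed backup location, e.g. /backup/namenode
    local backup_dir
    backup_dir="${backup_root}/$(date +%Y%m%d-%H%M%S)"

    mkdir -p "${backup_dir}" || return 1

    # Step 1: VERSION does not change, so copy it only if no copy exists yet.
    if [ ! -f "${backup_root}/VERSION" ]; then
        cp "${nn_current_dir}/VERSION" "${backup_root}/VERSION" || return 1
    fi

    # Step 2: fetch the current fsimage from the active NameNode.
    # Command output is sent to stderr so that stdout carries only the path.
    hdfs dfsadmin -fetchImage "${backup_dir}" >&2 || return 1

    # Print the directory that now holds the fetched fsimage.
    echo "${backup_dir}"
}

# Example (run as a user allowed to run dfsadmin commands):
#   backup_nn_metadata /dfs/nn/current /backup/namenode
```

Keeping each fetched fsimage in its own timestamped directory is a design choice of this sketch; it makes it easy to identify the latest image when a restore is needed.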

Restoring NameNode Metadata

This section describes how to restore NameNode metadata. If both the NameNode and the secondary NameNode go offline unexpectedly, you can restore the NameNode as follows:
  1. Add a new host to your Hadoop cluster.
  2. Add the NameNode role to the host. Make sure it has the same hostname as the original NameNode.
  3. Create the directory path for the NameNode name.dir (for example, /dfs/nn/current), and make sure that its ownership and permissions are set correctly.
  4. Copy the VERSION file and the latest fsimage file to the /dfs/nn/current directory.
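  5. Run the following command in the /dfs/nn/current directory to create the md5 file for the fsimage. Replace fsimage with the actual name of the fsimage file that you copied.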
    $ md5sum fsimage > fsimage.md5
  6. Start the NameNode process.
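
Steps 3 through 5 can be sketched as a shell function. This is an illustrative sketch under stated assumptions: the function name restore_nn_metadata and its arguments are hypothetical, and the ownership command is left as a comment because the correct service user and group depend on your installation. Adding the host and role (steps 1 and 2) and starting the NameNode (step 6) are not covered.

```shell
#!/usr/bin/env bash
# Sketch: place VERSION and an fsimage in the name directory and create
# the md5 file. The function name and its arguments are hypothetical.

restore_nn_metadata() {
    local backup_version="$1"   # path to the backed-up VERSION file
    local backup_fsimage="$2"   # path to the latest backed-up fsimage file
    local name_dir="$3"         # target directory, e.g. /dfs/nn/current
    local image_name
    image_name="$(basename "${backup_fsimage}")"

    # Step 3: create the directory path.
    mkdir -p "${name_dir}" || return 1
    # Ownership and permissions must match your installation, for example:
    #   chown -R hdfs:hdfs /dfs/nn    (assumed service user and group)

    # Step 4: copy VERSION and the fsimage into place.
    cp "${backup_version}" "${name_dir}/VERSION" || return 1
    cp "${backup_fsimage}" "${name_dir}/${image_name}" || return 1

    # Step 5: create the md5 file. Running md5sum inside the directory keeps
    # the recorded file name free of any leading path.
    ( cd "${name_dir}" && md5sum "${image_name}" > "${image_name}.md5" ) || return 1
}

# Example (the fsimage path is a placeholder for your own backup file):
#   restore_nn_metadata /backup/namenode/VERSION /backup/namenode/<fsimage file> /dfs/nn/current
```

After the function completes, running md5sum -c on the generated .md5 file from inside the directory is a quick way to confirm that the checksum matches the copied fsimage.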