Back up HDFS metadata using Cloudera Manager

HDFS metadata backups can be used to restore a NameNode when both NameNode roles have failed. In addition, Cloudera recommends backing up HDFS metadata before a major upgrade.

This backup method requires you to shut down the cluster.
  1. Note the active NameNode.
  2. Stop the cluster.
    It is particularly important that the NameNode role process is not running so that you can make a consistent backup.
  3. Go to the HDFS service.
  4. Click the Configuration tab.
  5. In the Search field, search for "NameNode Data Directories" and note the value.
  6. On the active NameNode host, back up the directory listed in the NameNode Data Directories property. If a file with the extension lock exists in the NameNode data directory, the NameNode most likely is still running. Repeat the steps, beginning with shutting down the NameNode role.
    If more than one is listed, make a backup of one directory, because each directory is a complete copy. For example, if the NameNode data directory is /data/dfs/nn, do the following as root:
    # cd /data/dfs/nn
    # tar -cvf /root/nn_backup_data.tar .

    You should see output like this:

    /dfs/nn/current
    
    ./
    ./VERSION
    ./edits_0000000000000000001-0000000000000008777
    ./edits_0000000000000008778-0000000000000009337
    ./edits_0000000000000009338-0000000000000009897
    ./edits_0000000000000009898-0000000000000010463
    ./edits_0000000000000010464-0000000000000011023
    <snip>
    ./edits_0000000000000063396-0000000000000063958
    ./edits_0000000000000063959-0000000000000064522
    ./edits_0000000000000064523-0000000000000065091
    ./edits_0000000000000065092-0000000000000065648
    ./edits_inprogress_0000000000000065649
    ./fsimage_0000000000000065091
    ./fsimage_0000000000000065091.md5
    ./fsimage_0000000000000065648
    ./fsimage_0000000000000065648.md5
    ./seen_txid