18. Validating Your Data

Verify that your data is intact by comparing the HDFS data directory tree with your previous directory tree.

  1. Open a Hadoop command line as the hadoop user. The following command starts a new command prompt as that user in %HDFS_HOME%\bin:

    runas /user:hadoop "cmd /K cd %HDFS_HOME%\bin"

  2. Run a recursive listing (lsr) report on your upgraded system. Enter the following at the Hadoop command line:

    hdfs dfs -ls -R / > fs-new-lsr-1.log

  3. Compare this directory listing with the listing from your previous HDP installation. All old directories, files, and timestamps should match. The new listing will also contain some new or changed entries:

    • /apps/hbase is used by HBase

    • /mapred/system/jobtracker should have a new timestamp

  4. Run an fsck report on your upgraded system. From the Hadoop command line:

    hdfs fsck / -blocks -locations -files > fsck-new-report-1.log

  5. To check the validity of your current HDFS data, compare this fsck report to the report generated before the upgrade.
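
The listing comparison in step 3 can be tedious to do by eye on a large file system. The following is a minimal Python sketch of one way to automate it, not part of the HDP tooling. It assumes the pre-upgrade listing was saved to a file (the name fs-old-lsr-1.log used below is hypothetical) and that both files use the usual eight-column recursive listing layout (permissions, replication, owner, group, size, date, time, path). The function names parse_listing and compare_listings are illustrative only.

```python
import sys


def parse_listing(lines):
    """Map each path in a recursive listing to its other attributes.

    Assumes the usual eight-column layout:
    permissions, replication, owner, group, size, date, time, path.
    """
    entries = {}
    for line in lines:
        # Split into at most 8 fields so that paths containing spaces survive.
        parts = line.strip().split(None, 7)
        # Skip "Found N items" headers and anything that is not an entry line.
        if len(parts) < 8 or parts[0][0] not in "-dl":
            continue
        perms, _replication, owner, group, size, date, time, path = parts
        entries[path] = (perms, owner, group, size, date, time)
    return entries


def compare_listings(old, new):
    """Return (missing, added, changed) path lists, each sorted."""
    missing = sorted(p for p in old if p not in new)
    added = sorted(p for p in new if p not in old)
    changed = sorted(p for p in old if p in new and old[p] != new[p])
    return missing, added, changed


def load(path):
    """Read a listing file, tolerating odd bytes in the redirected output."""
    with open(path, errors="replace") as handle:
        return parse_listing(handle)


if __name__ == "__main__" and len(sys.argv) == 3:
    missing, added, changed = compare_listings(load(sys.argv[1]), load(sys.argv[2]))
    for label, paths in (("MISSING", missing), ("ADDED", added), ("CHANGED", changed)):
        for p in paths:
            print(label, p)
```

Assuming the sketch is saved as compare_lsr.py (a hypothetical name), it could be run as python compare_lsr.py fs-old-lsr-1.log fs-new-lsr-1.log. Entries such as /apps/hbase under ADDED and /mapred/system/jobtracker under CHANGED are expected, as noted in step 3. Anything reported as MISSING deserves a closer look.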