1. Getting Ready to Upgrade

Upgrading the HDP Stack involves removing HDP 1.x MapReduce and replacing it with HDP 2.x YARN and MapReduce2. Before you begin, review the upgrade process and complete the backup steps below.

  1. Back up the following HDP 1.x directories:

    • /etc/hadoop/conf

    • /etc/hbase/conf

    • /etc/hcatalog/conf

    • /etc/hive/conf

    • /etc/pig/conf

    • /etc/sqoop/conf

    • /etc/flume/conf

    • /etc/mahout/conf

    • /etc/oozie/conf

    • /etc/hue/conf

    • /etc/zookeeper/conf

    • Optional - Back up your userlogs directories, ${mapred.local.dir}/userlogs.
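One way to script this step is to archive whichever of the listed directories exist on the host into a single tarball. The following is a minimal sketch, not part of HDP: the function name backup_conf_dirs and the archive path in the usage comment are assumptions made for illustration.

```shell
# Hypothetical helper: archive the configuration directories that exist.
# Usage: backup_conf_dirs <archive.tar.gz> <dir> [<dir> ...]
backup_conf_dirs() {
    conf_archive=$1
    shift
    # Rebuild the argument list so that it holds only existing directories.
    for conf_dir in "$@"; do
        shift
        if [ -d "$conf_dir" ]; then
            set -- "$@" "$conf_dir"
        else
            echo "skipping missing directory: $conf_dir" >&2
        fi
    done
    [ "$#" -gt 0 ] || { echo "nothing to back up" >&2; return 1; }
    # -h follows symbolic links, in case a conf directory is a symlink.
    tar -czhf "$conf_archive" "$@"
}

# Example (the archive path is an assumption):
#   backup_conf_dirs /tmp/hdp1-conf-backup.tar.gz /etc/hadoop/conf /etc/hbase/conf /etc/hive/conf
```

Directories for components that are not installed are skipped with a message on stderr, so the same call can be reused on every host.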

  2. Run the fsck command as the HDFS Service user and fix any errors. (The resulting file contains a complete block map of the file system.)

    su $HDFS_USER
    hadoop fsck / -files -blocks -locations > /tmp/dfs-old-fsck-1.log 

    where $HDFS_USER is the HDFS Service user. For example, hdfs.
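To decide whether there are errors to fix, the saved log can be checked for the summary status that fsck prints. The sketch below assumes the log contains the usual "The filesystem under path '/' is HEALTHY" (or "is CORRUPT") line; the function name fsck_log_is_healthy is hypothetical.

```shell
# Hypothetical helper: succeed only if the fsck log reports a healthy file system.
# Assumes the log contains the usual "... is HEALTHY" or "... is CORRUPT" status line.
fsck_log_is_healthy() {
    fsck_log=$1
    if grep -q "filesystem under path .* is HEALTHY" "$fsck_log"; then
        return 0
    fi
    # Print the lines that mention corrupt or missing blocks to help with the fix.
    echo "fsck did not report HEALTHY; problem lines from $fsck_log:" >&2
    grep -E "CORRUPT|MISSING" "$fsck_log" >&2
    return 1
}

# Example:
#   fsck_log_is_healthy /tmp/dfs-old-fsck-1.log && echo "file system is healthy"
```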

  3. Use the following instructions to compare status before and after the upgrade:

    Note

    The following commands must be executed by the user running the HDFS service (by default, the user is hdfs).

    1. Capture the complete namespace of the file system. (The following command does a recursive listing of the root file system.)

      su $HDFS_USER
      hadoop dfs -lsr / > dfs-old-lsr-1.log 

      where $HDFS_USER is the HDFS Service user. For example, hdfs.

    2. Run the report command to create a list of DataNodes in the cluster.

      su $HDFS_USER
      hadoop dfsadmin -report > dfs-old-report-1.log

      where $HDFS_USER is the HDFS Service user. For example, hdfs.

    3. Optional - You can copy all data, or only the data that cannot be recovered by other means, from HDFS to a local file system or to a backup instance of HDFS.

    4. Optional - You can also repeat substeps 1 through 3 above and compare the results with the previous run to ensure that the state of the file system has not changed.
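Comparing a repeated capture with the earlier one can be done with diff. This is a minimal sketch; the function name compare_capture and the second file name in the usage comment (dfs-new-lsr-1.log) are assumptions, not names used by HDP.

```shell
# Hypothetical helper: compare an earlier capture with a later one.
# Usage: compare_capture <old-file> <new-file>
compare_capture() {
    if diff "$1" "$2" > /dev/null; then
        echo "unchanged: $1 vs $2"
    else
        echo "DIFFERS: $1 vs $2"
        return 1
    fi
}

# Example (the second file name is an assumption):
#   compare_capture dfs-old-lsr-1.log dfs-new-lsr-1.log
```

A byte-for-byte match is most meaningful for the namespace listing. The DataNode report is likely to contain values that change between runs, so differences there need to be read by eye rather than treated as failures.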

  4. As the HDFS user, save the namespace by executing the following command:

    su $HDFS_USER
    hadoop dfsadmin -safemode enter
    hadoop dfsadmin -saveNamespace
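Because -saveNamespace requires the NameNode to be in safe mode, it can be useful to confirm the mode before saving. The sketch below parses the output of "hadoop dfsadmin -safemode get", which reports "Safe mode is ON" or "Safe mode is OFF"; the function name safemode_is_on is hypothetical.

```shell
# Hypothetical helper: read the output of "hadoop dfsadmin -safemode get"
# on stdin and succeed only if it reports that safe mode is on.
safemode_is_on() {
    grep -q "Safe mode is ON"
}

# Example (run as $HDFS_USER on a live cluster):
#   hadoop dfsadmin -safemode get | safemode_is_on && hadoop dfsadmin -saveNamespace
```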

  5. Back up your NameNode metadata.

    1. Copy the following checkpoint files into a backup directory:

      • ${dfs.name.dir}/edits

      • ${dfs.name.dir}/image/fsimage

      • ${dfs.name.dir}/current/fsimage

    2. Store the layoutVersion of the NameNode, which is recorded in the following file:

      ${dfs.name.dir}/current/VERSION
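The VERSION file is a plain properties file, so the layoutVersion can be copied and read with standard tools. A minimal sketch follows, assuming the file contains a line of the form layoutVersion=<number>; the function name save_layout_version and the directories in the usage comment are hypothetical.

```shell
# Hypothetical helper: copy the VERSION file to a backup directory and
# print the layoutVersion value it records.
# Usage: save_layout_version <dfs.name.dir> <backup-dir>
save_layout_version() {
    name_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" &&
    cp -p "$name_dir/current/VERSION" "$backup_dir/VERSION" &&
    # Print only the value after "layoutVersion=".
    sed -n 's/^layoutVersion=//p' "$backup_dir/VERSION"
}

# Example (both paths are assumptions):
#   save_layout_version /hadoop/hdfs/namenode /tmp/nn-backup
```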

  6. Finalize the state of the filesystem.

    su $HDFS_USER
    hadoop namenode -finalize

  7. Optional - Back up the Hive Metastore database.

    Note

    These instructions are provided for your convenience. Please check your database documentation for the latest backup instructions.

    Table 17.1. Hive Metastore Database Backup and Restore

    Database Type: MySQL

      Backup: mysqldump $dbname > $outputfilename.sql
        For example: mysqldump hive > /tmp/mydir/backup_hive.sql

      Restore: mysql $dbname < $inputfilename.sql
        For example: mysql hive < /tmp/mydir/backup_hive.sql

    Database Type: Postgres

      Backup: sudo -u $username pg_dump $databasename > $outputfilename.sql
        For example: sudo -u postgres pg_dump hive > /tmp/mydir/backup_hive.sql

      Restore: sudo -u $username psql $databasename < $inputfilename.sql
        For example: sudo -u postgres psql hive < /tmp/mydir/backup_hive.sql

    Database Type: Oracle

      Backup: Connect to the Oracle database using sqlplus and export the database:
        exp username/password@database full=yes file=output_file.dmp

      Restore: Import the database:
        imp username/password@database file=input_file.dmp

  8. Optional - Back up the Oozie Metastore database.

    Note

    These instructions are provided for your convenience. Please check your database documentation for the latest backup instructions.

    Table 17.2. Oozie Metastore Database Backup and Restore

    Database Type: MySQL

      Backup: mysqldump $dbname > $outputfilename.sql
        For example: mysqldump oozie > /tmp/mydir/backup_oozie.sql

      Restore: mysql $dbname < $inputfilename.sql
        For example: mysql oozie < /tmp/mydir/backup_oozie.sql

    Database Type: Postgres

      Backup: sudo -u $username pg_dump $databasename > $outputfilename.sql
        For example: sudo -u postgres pg_dump oozie > /tmp/mydir/backup_oozie.sql

      Restore: sudo -u $username psql $databasename < $inputfilename.sql
        For example: sudo -u postgres psql oozie < /tmp/mydir/backup_oozie.sql
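After writing a MySQL or Postgres dump for either metastore, a quick sanity check is to look for the closing comment that the dump tools normally write at the end of a plain-text dump. This sketch assumes default comment settings (for example, mysqldump run without --skip-comments or --compact); the function name dump_looks_complete is hypothetical, and passing this check does not replace a test restore.

```shell
# Hypothetical helper: succeed if a plain-text SQL dump ends with the
# completion comment written by mysqldump or pg_dump under default settings.
dump_looks_complete() {
    tail -n 10 "$1" | grep -Eq "^-- (Dump completed|PostgreSQL database dump complete)"
}

# Example:
#   dump_looks_complete /tmp/mydir/backup_hive.sql || echo "dump may be truncated" >&2
```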

  9. Stop all services (including MapReduce) and client applications deployed on HDFS using the instructions provided here.


  10. Verify that the edit logs in ${dfs.name.dir}/name/current/edits* are empty. An empty edit log contains only 4 bytes of data, which hold the edit log version. If the edit logs are not empty, start the existing version of the NameNode and then shut it down after a new fsimage has been written to disk, so that the edit logs become empty.
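The 4-byte check can be scripted by comparing the size of each edits* file. The following is a minimal sketch; the function name edits_are_empty is hypothetical, and the directory in the usage comment must be replaced with the actual value of the path given above.

```shell
# Hypothetical helper: succeed only if the directory contains at least one
# edits* file and every such file is exactly 4 bytes long.
# Usage: edits_are_empty <directory containing the edits* files>
edits_are_empty() {
    edits_dir=$1
    edits_found=0
    for edits_file in "$edits_dir"/edits*; do
        [ -f "$edits_file" ] || continue
        edits_found=1
        edits_size=$(wc -c < "$edits_file" | tr -d ' ')
        if [ "$edits_size" -ne 4 ]; then
            echo "non-empty edit log: $edits_file ($edits_size bytes)" >&2
            return 1
        fi
    done
    [ "$edits_found" -eq 1 ] || { echo "no edits files found in $edits_dir" >&2; return 1; }
}

# Example (the directory is an assumption):
#   edits_are_empty /hadoop/hdfs/namenode/name/current && echo "edit logs are empty"
```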