The HDP Stack upgrade involves removing HDP 1.x MapReduce and replacing it with HDP 2.x YARN and MapReduce2. Before you begin, review the upgrade process and complete the backup steps.
Back up the following HDP 1.x directories:
/etc/hadoop/conf
/etc/hbase/conf
/etc/hcatalog/conf
/etc/hive/conf
/etc/pig/conf
/etc/sqoop/conf
/etc/flume/conf
/etc/mahout/conf
/etc/oozie/conf
/etc/hue/conf
/etc/zookeeper/conf
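The directory backup above could be scripted roughly as follows. This is a minimal sketch, not part of HDP: backup_conf_dirs is a hypothetical helper name, and the choice of one gzip-compressed tar archive per directory is an assumption. Directories that do not exist on a given host (for example, components that are not installed) are skipped.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with HDP): archive each configuration
# directory that exists into DEST, one tar.gz file per directory.
# Usage: backup_conf_dirs DEST DIR [DIR...]
backup_conf_dirs() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            # Turn /etc/hadoop/conf into etc_hadoop_conf for the archive name.
            name=$(printf '%s' "$dir" | sed -e 's|^/||' -e 's|/|_|g')
            tar -czf "$dest/$name.tar.gz" -C "$dir" . || return 1
            echo "backed up $dir"
        else
            echo "skipped $dir (not present)"
        fi
    done
}
```

For example, backup_conf_dirs /tmp/hdp1-conf-backup /etc/hadoop/conf /etc/hbase/conf /etc/hive/conf would archive those three directories, where /tmp/hdp1-conf-backup is an illustrative destination.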
Optional - Back up your userlogs directories, ${mapred.local.dir}/userlogs.
Run the fsck command as the HDFS Service user and fix any errors. (The resulting file contains a complete block map of the file system.)
su $HDFS_USER
hadoop fsck / -files -blocks -locations > /tmp/dfs-old-fsck-1.log
where $HDFS_USER is the HDFS Service user. For example, hdfs.
Use the following instructions to compare status before and after the upgrade:
Note: The following commands must be executed by the user running the HDFS service (by default, the user is hdfs).
Capture the complete namespace of the file system. (The following command does a recursive listing of the root file system.)
su $HDFS_USER
hadoop dfs -lsr / > dfs-old-lsr-1.log
where $HDFS_USER is the HDFS Service user. For example, hdfs.
Run the report command to create a list of DataNodes in the cluster.
su $HDFS_USER
hadoop dfsadmin -report > dfs-old-report-1.log
where $HDFS_USER is the HDFS Service user. For example, hdfs.
Optional - You can copy all data, or only the unrecoverable data, stored in HDFS to a local file system or to a backup instance of HDFS.
Optional - You can also repeat steps 3 (a) through 3 (c) and compare the results with the previous run to ensure that the state of the file system remains unchanged.
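The comparison of a repeated run against the previous one can be sketched with diff. compare_snapshots is a hypothetical helper, and the second file name in the usage example below (dfs-old-lsr-2.log) is an assumed name for the repeated listing. The dfsadmin report may contain fields that legitimately vary between runs, so this sketch is probably most useful on the recursive listing.

```shell
#!/bin/sh
# Hypothetical helper: compare two captured log files.
# Returns 0 if they match, 1 if they differ (a unified diff is kept next to
# the newer file), 2 if diff itself failed (for example, a missing file).
compare_snapshots() {
    old=$1
    new=$2
    diff -u "$old" "$new" > "$new.diff"
    rc=$?
    case $rc in
        0)
            rm -f "$new.diff"
            echo "unchanged: $old and $new match"
            ;;
        1)
            echo "changed: see $new.diff"
            ;;
        *)
            rm -f "$new.diff"
            echo "error: could not compare $old and $new" >&2
            rc=2
            ;;
    esac
    return $rc
}
```

For example: compare_snapshots dfs-old-lsr-1.log dfs-old-lsr-2.log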
As the HDFS user, save the namespace by executing the following commands:
su $HDFS_USER
hadoop dfsadmin -safemode enter
hadoop dfsadmin -saveNamespace
Back up your NameNode metadata.
Copy the following checkpoint files into a backup directory:
dfs.name.dir/edits
dfs.name.dir/image/fsimage
dfs.name.dir/current/fsimage
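Copying the checkpoint files might look like the sketch below. backup_namenode_files is a hypothetical helper; its first argument stands for the directory configured as dfs.name.dir, and the relative paths are taken as listed above. The relative layout is preserved in the backup directory so that the two fsimage files do not overwrite each other.

```shell
#!/bin/sh
# Hypothetical helper: copy the NameNode checkpoint files listed above from
# NAME_DIR (the value of dfs.name.dir) into BACKUP_DIR, keeping their
# relative paths. Stops with an error if any listed file is missing.
backup_namenode_files() {
    name_dir=$1
    backup_dir=$2
    for rel in edits image/fsimage current/fsimage; do
        src="$name_dir/$rel"
        if [ ! -f "$src" ]; then
            echo "missing: $src" >&2
            return 1
        fi
        mkdir -p "$backup_dir/$(dirname "$rel")" || return 1
        # -p preserves timestamps and permissions on the copies.
        cp -p "$src" "$backup_dir/$rel" || return 1
    done
    echo "copied checkpoint files to $backup_dir"
}
```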
Store the layoutVersion of the NameNode, which is recorded in ${dfs.name.dir}/current/VERSION.
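One way to record the layoutVersion is to extract it from the VERSION file, which is a plain key=value properties file. The sketch below assumes a line of the form layoutVersion=<number>; read_layout_version is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the layoutVersion value from a NameNode
# VERSION file. Fails if the file is unreadable or has no such key.
read_layout_version() {
    version_file=$1
    value=$(sed -n 's/^layoutVersion=//p' "$version_file") || return 1
    if [ -z "$value" ]; then
        echo "layoutVersion not found in $version_file" >&2
        return 1
    fi
    printf '%s\n' "$value"
}
```

The output can be redirected to a file kept with the other backups, for example by passing the path of the VERSION file under your dfs.name.dir and appending > layoutVersion.txt (an illustrative file name).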
Finalize the state of the filesystem.
su $HDFS_USER
hadoop namenode -finalize
Optional - Back up the Hive Metastore database.
Note: These instructions are provided for your convenience. Please check your database documentation for the latest backup instructions.
Table 17.1. Hive Metastore Database Backup and Restore
Database Type: MySQL
Backup: mysqldump $dbname > $outputfilename.sql
For example: mysqldump hive > /tmp/mydir/backup_hive.sql
Restore: mysql $dbname < $inputfilename.sql
For example: mysql hive < /tmp/mydir/backup_hive.sql
Database Type: Postgres
Backup: sudo -u $username pg_dump $databasename > $outputfilename.sql
For example: sudo -u postgres pg_dump hive > /tmp/mydir/backup_hive.sql
Restore: sudo -u $username psql $databasename < $inputfilename.sql
For example: sudo -u postgres psql hive < /tmp/mydir/backup_hive.sql
Database Type: Oracle
Backup: Connect to the Oracle database using sqlplus, then export the database: exp username/password@database full=yes file=output_file.dmp
Restore: Import the database: imp username/password@database file=input_file.dmp
Optional - Back up the Oozie Metastore database.
Note: These instructions are provided for your convenience. Please check your database documentation for the latest backup instructions.
Table 17.2. Oozie Metastore Database Backup and Restore
Database Type: MySQL
Backup: mysqldump $dbname > $outputfilename.sql
For example: mysqldump oozie > /tmp/mydir/backup_oozie.sql
Restore: mysql $dbname < $inputfilename.sql
For example: mysql oozie < /tmp/mydir/backup_oozie.sql
Database Type: Postgres
Backup: sudo -u $username pg_dump $databasename > $outputfilename.sql
For example: sudo -u postgres pg_dump oozie > /tmp/mydir/backup_oozie.sql
Restore: sudo -u $username psql $databasename < $inputfilename.sql
For example: sudo -u postgres psql oozie < /tmp/mydir/backup_oozie.sql
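Because the MySQL and Postgres backup commands for the Hive and Oozie metastores differ only in the database name and output file, they can be generated from one small function. The sketch below is illustrative only: metastore_backup_cmd is a hypothetical helper that prints the command for review rather than running it, and the postgres default for the database user is an assumption taken from the examples above.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the backup command for a metastore.
# Usage: metastore_backup_cmd DB_TYPE DB_NAME OUT_FILE [PG_USER]
# DB_TYPE is mysql or postgres; PG_USER defaults to postgres (assumption).
metastore_backup_cmd() {
    db_type=$1
    db_name=$2
    out_file=$3
    pg_user=${4:-postgres}
    case $db_type in
        mysql)
            printf 'mysqldump %s > %s\n' "$db_name" "$out_file"
            ;;
        postgres)
            printf 'sudo -u %s pg_dump %s > %s\n' "$pg_user" "$db_name" "$out_file"
            ;;
        *)
            echo "unsupported database type: $db_type" >&2
            return 1
            ;;
    esac
}
```

For example, metastore_backup_cmd mysql oozie /tmp/mydir/backup_oozie.sql prints the MySQL backup command shown in the Oozie table, which you can inspect before running it yourself.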
Stop all services (including MapReduce) and client applications deployed on HDFS using the instructions provided here.
Verify that the edit logs in ${dfs.name.dir}/name/current/edits* are empty. These log files should contain only 4 bytes of data, which hold the edit log version. If the edit logs are not empty, start the existing version of the NameNode and then shut it down after a new fsimage has been written to disk, so that the edit logs become empty.
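The 4-byte check can be automated with a short function. This is a sketch under the assumption stated above that an empty edit log is exactly 4 bytes long; check_edit_logs_empty is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: verify that every given edit log file is exactly
# 4 bytes long. Returns 0 if all are, 1 if any is larger or smaller,
# 2 if none of the arguments is an existing file.
check_edit_logs_empty() {
    checked=0
    result=0
    for f in "$@"; do
        # An unmatched glob arrives as a literal pattern; skip it.
        [ -f "$f" ] || continue
        checked=$((checked + 1))
        size=$(wc -c < "$f" | tr -d ' ')
        if [ "$size" -ne 4 ]; then
            echo "not empty: $f ($size bytes)"
            result=1
        fi
    done
    if [ "$checked" -eq 0 ]; then
        echo "no edit log files found" >&2
        return 2
    fi
    return $result
}
```

Pass the function the files matched by the edits* pattern under your configured name directory. A return status of 1 means that at least one edit log still holds transactions.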