1. Preparing for the Upgrade - Hortonworks Data Platform

If you are upgrading Ambari as well as the stack, you must know the location of the Nagios servers for that process. Use the Services -> Nagios -> Summary panel to locate the hosts on which they are running.

If the Oozie service is installed in your cluster, list all current jobs.

oozie jobs -oozie http://localhost:11000/oozie -len 100 -filter status=RUNNING

Stop all jobs in a RUNNING or SUSPENDED state on your Oozie server host. For example:

oozie job -oozie http://{your.oozie.server.host}:11000/oozie -kill {oozie.job.id}
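
The list and kill commands above can be combined into one pass. The following is a minimal sketch, not part of the documented procedure. The function names `extract_job_ids` and `kill_active_jobs` are hypothetical, and the sketch assumes that job IDs appear as the first column of the `oozie jobs` output and end in `-W`, `-C`, or `-B`. It also assumes that repeating `status=` in the filter selects jobs in either state; check the listing on your cluster before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: kill every Oozie job that is RUNNING or SUSPENDED.
# Assumption: job IDs are the first whitespace-separated field of each
# job line and look like 0000001-130606115200000-oozie-oozi-W.

# Print the job IDs found in `oozie jobs` output read from stdin.
extract_job_ids() {
  awk '$1 ~ /^[0-9]+-[0-9]+-.+-[WCB]$/ { print $1 }'
}

# List active jobs on the given Oozie URL and kill them one by one.
# Only the first 100 matching jobs are listed, as in the command above.
kill_active_jobs() {
  local url="$1"
  oozie jobs -oozie "$url" -len 100 -filter 'status=RUNNING;status=SUSPENDED' |
    extract_job_ids |
    while read -r job_id; do
      echo "Killing $job_id"
      oozie job -oozie "$url" -kill "$job_id"
    done
}
```

To use it, call `kill_active_jobs http://{your.oozie.server.host}:11000/oozie` on a host where the `oozie` client is installed, then rerun the listing to confirm that nothing is left.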

Use the Services view on the Ambari Web UI to stop all services except HDFS and ZooKeeper. Also stop any client programs that access HDFS.

Finalize any prior upgrade, if you have not done so already.

su {HDFSUSER}
hadoop dfsadmin -finalizeUpgrade

Check the NameNode directory to ensure that no snapshot of a prior HDFS upgrade remains. On the NameNode host, look in the directory $dfs.namenode.name.dir (or $dfs.name.dir). It must contain only a ‘current’ directory and no ‘previous’ directory.
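
The directory check can be scripted. The following is a minimal sketch; the function name `check_name_dirs` is hypothetical. It assumes that the property value is a comma-separated list of plain local paths, with no `file://` prefix and no spaces around the commas.

```shell
#!/usr/bin/env bash
# Sketch only: verify that no NameNode directory still holds a
# 'previous' snapshot from an earlier, unfinalized HDFS upgrade.

# Argument: the value of dfs.namenode.name.dir (or dfs.name.dir),
# which may be a comma-separated list of local directories.
# Returns 0 if every directory has 'current' and none has 'previous'.
check_name_dirs() {
  local dirs="$1" dir status=0
  local IFS=','
  for dir in $dirs; do
    if [ ! -d "$dir/current" ]; then
      echo "MISSING current: $dir"
      status=1
    fi
    if [ -e "$dir/previous" ]; then
      echo "FOUND previous: $dir (finalize the prior upgrade first)"
      status=1
    fi
  done
  return "$status"
}
```

Run it on the NameNode host, passing the property value as the only argument. A non-zero exit status means that at least one directory needs attention.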

Create the following logs and other files. You use them to check the integrity of the file system after the upgrade.

Run fsck with the following flags and send the results to a log. The resulting file contains a complete block map of the file system. You use this log later to confirm the upgrade.
```
hadoop fsck / -files -blocks -locations > dfs-old-fsck-1.log
```
Optional: Capture the complete namespace of the filesystem. (The following command does a recursive listing of the root file system.)
```
hadoop dfs -ls -R / > dfs-old-lsr-1.log 
```
Create a list of all the DataNodes in the cluster.
```
hadoop dfsadmin -report > dfs-old-report-1.log
```
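
The report log holds more than the DataNode list. As a minimal sketch, the helper below extracts only the DataNode addresses from the saved report. The function name `list_datanodes` is hypothetical, and the sketch assumes that each DataNode section of the report starts with a line of the form `Name: host:port`.

```shell
#!/usr/bin/env bash
# Sketch only: pull the DataNode addresses out of a saved
# `hadoop dfsadmin -report` log.
# Assumption: each DataNode section starts with a line "Name: host:port".

# $1: path to the saved report, for example dfs-old-report-1.log
list_datanodes() {
  local report="$1"
  awk '$1 == "Name:" { print $2 }' "$report" | sort
}
```

For example, `list_datanodes dfs-old-report-1.log > dfs-old-datanodes-1.log` saves a sorted list that can be compared with the same listing after the upgrade. The output file name is only an illustration.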
Optional: Copy all data stored in HDFS, or only the data that cannot be recovered, to a local file system or to a backup instance of HDFS.

Save the namespace. You must be the HDFS service user to do this and you must put the cluster in Safe Mode.

	Important
	This is a critical step. If you do not do this step before you do the upgrade, the NameNode will not start afterwards.

hadoop dfsadmin -safemode enter
hadoop dfsadmin -saveNamespace
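
Because this step is critical, it can help to confirm that Safe Mode is on before saving the namespace. The following is a minimal sketch; the function names `safemode_is_on` and `save_namespace` are hypothetical. It assumes that `hadoop dfsadmin -safemode get` prints a line containing `Safe mode is ON` when Safe Mode is active.

```shell
#!/usr/bin/env bash
# Sketch only: enter Safe Mode, confirm it is on, then save the namespace.
# Run as the HDFS service user.

# Reads `dfsadmin -safemode get` output on stdin.
# Succeeds only if the output reports that Safe Mode is ON.
safemode_is_on() {
  grep -q 'Safe mode is ON'
}

# Enter Safe Mode and save the namespace only if Safe Mode is confirmed.
save_namespace() {
  hadoop dfsadmin -safemode enter || return 1
  if hadoop dfsadmin -safemode get | safemode_is_on; then
    hadoop dfsadmin -saveNamespace
  else
    echo "NameNode is not in Safe Mode; namespace not saved" >&2
    return 1
  fi
}
```

Call `save_namespace` as the HDFS service user. A non-zero exit status means the namespace was not saved and the upgrade should not proceed.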

Note

In an HA NameNode configuration, the command hdfs dfsadmin -saveNamespace performs a checkpoint on the first NameNode listed in the configuration property dfs.ha.namenodes.[nameservice ID]. To checkpoint a different NameNode, use the dfsadmin -fs option to specify which NameNode to connect to. For example, to force a checkpoint on NameNode 2:

hdfs dfsadmin -fs hdfs://namenode2-hostname:namenode2-port -saveNamespace

Copy the following checkpoint files into a backup directory. To find the directory, open the Services view in the Ambari Web UI, select the HDFS service, open the Configs tab, and look up the NameNode Directories property in the NameNode section. The directory is on your NameNode host.

$dfs.name.dir/current

	Note
	In an HA NameNode configuration, the location of the checkpoint depends on where the saveNamespace command is sent, as defined in the preceding step.
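
The copy can be done with a small helper. The following is a minimal sketch; the function name `backup_checkpoint` and the backup location are hypothetical. It assumes a single NameNode directory path and a backup directory that does not already contain a `current` subdirectory.

```shell
#!/usr/bin/env bash
# Sketch only: copy the NameNode checkpoint directory to a backup location.

# $1: one NameNode directory (a single path from dfs.name.dir)
# $2: backup directory (hypothetical; choose one outside dfs.name.dir)
backup_checkpoint() {
  local name_dir="$1" backup_dir="$2"
  if [ ! -d "$name_dir/current" ]; then
    echo "no current/ directory in $name_dir" >&2
    return 1
  fi
  # -a preserves ownership, permissions, and timestamps where possible.
  mkdir -p "$backup_dir" &&
    cp -a "$name_dir/current" "$backup_dir/"
}
```

Run it on the NameNode host after the saveNamespace command has completed, so that the copied files reflect the saved checkpoint.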

Store the layoutVersion for the NameNode. Make a copy of the file at $dfs.name.dir/current/VERSION, where $dfs.name.dir is the value of the NameNode Directories configuration property. You use this file later to verify that the layout version has been upgraded.
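
To record the value itself alongside the copied file, the layout version can be read from VERSION directly. The following is a minimal sketch; the function name `layout_version` is hypothetical. It assumes that VERSION is a properties-style file containing a line of the form `layoutVersion=<number>`.

```shell
#!/usr/bin/env bash
# Sketch only: read layoutVersion from a NameNode VERSION file.
# Assumption: the file contains a line "layoutVersion=<number>".

# $1: path to the VERSION file (or to the saved copy of it)
layout_version() {
  sed -n 's/^layoutVersion=//p' "$1"
}
```

Running the same function against the saved copy and against the VERSION file after the upgrade gives two values to compare.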

Stop HDFS. Make sure all services in the cluster are completely stopped.

If you are upgrading Hive and Oozie, back up the Hive database and the Oozie database on the Hive database host and Oozie database host machines, respectively.

Optional: Back up the Hive Metastore database.

	Note
	These instructions are provided for your convenience. Please check your database documentation for the latest backup instructions.

Table 3.1. Hive Metastore Database Backup and Restore

Database Type	Backup	Restore
MySQL	`mysqldump $dbname > $outputfilename.sql` For example: `mysqldump hive > /tmp/mydir/backup_hive.sql`	`mysql $dbname < $inputfilename.sql` For example: `mysql hive < /tmp/mydir/backup_hive.sql`
Postgres	`sudo -u $username pg_dump $databasename > $outputfilename.sql` For example: `sudo -u postgres pg_dump hive > /tmp/mydir/backup_hive.sql`	`sudo -u $username psql $databasename < $inputfilename.sql` For example: `sudo -u postgres psql hive < /tmp/mydir/backup_hive.sql`
Oracle	Connect to the Oracle database using `sqlplus` and export the database: `exp username/password@database full=yes file=output_file.dmp`	Import the database: `imp username/password@database file=input_file.dmp`

Optional: Back up the Oozie Metastore database.

	Note
	These instructions are provided for your convenience. Please check your database documentation for the latest backup instructions.

Table 3.2. Oozie Metastore Database Backup and Restore

Database Type	Backup	Restore
MySQL	`mysqldump $dbname > $outputfilename.sql` For example: `mysqldump oozie > /tmp/mydir/backup_oozie.sql`	`mysql $dbname < $inputfilename.sql` For example: `mysql oozie < /tmp/mydir/backup_oozie.sql`
Postgres	`sudo -u $username pg_dump $databasename > $outputfilename.sql` For example: `sudo -u postgres pg_dump oozie > /tmp/mydir/backup_oozie.sql`	`sudo -u $username psql $databasename < $inputfilename.sql` For example: `sudo -u postgres psql oozie < /tmp/mydir/backup_oozie.sql`

On the Ambari Server host, stop Ambari Server and confirm that it is stopped:

ambari-server stop

ambari-server status

Stop all Ambari Agents. On every host in your cluster known to Ambari:

ambari-agent stop
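
On larger clusters, stopping the agents host by host is tedious. The following is a minimal sketch of a loop over a host list; the function names `read_hosts` and `stop_all_agents` and the host list file are hypothetical. It assumes one hostname per line in the file, and SSH access without a password prompt as a user who is allowed to stop the agent.

```shell
#!/usr/bin/env bash
# Sketch only: stop the Ambari Agent on every host named in a file.
# Assumptions: the file lists one hostname per line, and SSH access
# works without a password prompt for a user allowed to stop the agent.

# Print hostnames from the file, skipping blank lines and # comments.
read_hosts() {
  sed -e 's/#.*//' -e 's/[[:space:]]*$//' -e '/^[[:space:]]*$/d' "$1"
}

# $1: path to the host list file
stop_all_agents() {
  local host
  for host in $(read_hosts "$1"); do
    echo "Stopping agent on $host"
    ssh "$host" 'ambari-agent stop' || echo "FAILED: $host" >&2
  done
}
```

Any host reported as FAILED must be checked by hand, because every agent has to be stopped before the upgrade continues.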