Start Ambari Server and Ambari Agents.
On the Server host:
ambari-server start
On all of the Agent hosts:
ambari-agent start
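Before continuing, you can confirm that the server and each agent actually came up; a quick optional check using the standard Ambari status commands:
# On the Server host: confirm the Ambari Server is running
ambari-server status
# On each Agent host: confirm the Ambari Agent is running
ambari-agent status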
Using the Ambari Web Services view, start the ZooKeeper service.
If you are upgrading from an HA NameNode configuration, start all JournalNodes. On each JournalNode host, run the following command:
su -l <HDFS_USER> -c "/usr/lib/hadoop/sbin/hadoop-daemon.sh start journalnode"
Important All JournalNodes must be running when performing the upgrade, rollback, or finalization operations. If any JournalNodes are down when running any such operation, the operation will fail.
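As a quick way to confirm that a JournalNode process is actually running on each JournalNode host, you can check for it with jps (a sketch, assuming jps is available on the host; the JournalNode runs as the HDFS service user):
su -l <HDFS_USER> -c "jps | grep JournalNode"
The command should print one JournalNode line on each JournalNode host; no output means the JournalNode is not running.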
Because the file system version has now changed, you must start the NameNode manually. On the active NameNode host:
su -l <HDFS_USER> -c "export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop/sbin/hadoop-daemon.sh start namenode -upgrade"
To check whether the upgrade is in progress, verify that a "previous" directory has been created in the NameNode and JournalNode storage directories. The "previous" directory contains a snapshot of the data before the upgrade.
Note In a NameNode HA configuration, this NameNode does not enter the standby state as usual. Rather, this NameNode immediately enters the active state, upgrades its local storage directories, and upgrades the shared edit log. At this point, the standby NameNode in the HA pair is still down and out of sync with the upgraded active NameNode. To synchronize the active and standby NameNodes and re-establish HA, re-bootstrap the standby NameNode by running the NameNode with the '-bootstrapStandby' flag. Do NOT start the standby NameNode with the '-upgrade' flag.
su -l <HDFS_USER> -c "hdfs namenode -bootstrapStandby -force"
The bootstrapStandby command downloads the most recent fsimage from the active NameNode into the $dfs.name.dir directory of the standby NameNode. You can enter that directory to make sure the fsimage has been successfully downloaded. After verifying, start the ZKFailoverController via Ambari, then start the standby NameNode via Ambari. You can check the status of both NameNodes using the Web UI.
Start all DataNodes.
su -l <HDFS_USER> -c "/usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode"
The NameNode will send an upgrade command to DataNodes after receiving block reports.
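To confirm that the DataNodes have registered with the upgraded NameNode, a quick check (run as the HDFS user) is the standard dfsadmin report:
su -l <HDFS_USER> -c "hdfs dfsadmin -report"
The report should list each DataNode as a live node; dead or missing nodes indicate DataNodes that did not start or cannot reach the NameNode.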
Prepare the NameNode to work with Ambari:
Open the Ambari Web GUI. If it has been open throughout the process, clear your browser cache, then refresh.
On the Services view, choose HDFS to open the HDFS service.
Restart the HDFS service. Restarting HDFS restarts all NameNodes, DataNodes, and JournalNodes.
Run the Service Check, using Actions > Run Service Check. Make sure it passes.
After the DataNodes are started, HDFS exits safemode. To monitor the status, run the following command:
sudo su -l <HDFS_USER> -c "hdfs dfsadmin -safemode get"
Depending on the size of your system, a response may not display for up to 10 minutes. When HDFS exits safemode, the following message displays:
Safe mode is OFF
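If you prefer not to re-run the command by hand, a small polling loop can watch for safe mode to turn off; this is a sketch only, assuming the hdfs client is on the PATH of <HDFS_USER>:
# Poll the NameNode every 30 seconds until safe mode is reported OFF
until sudo su -l <HDFS_USER> -c "hdfs dfsadmin -safemode get" | grep -q "Safe mode is OFF"; do
  sleep 30
done
echo "HDFS has left safe mode"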
Make sure that the HDFS upgrade was successful. Go through steps 3 and 4 in Preparing for the Upgrade to create new versions of the logs and reports, substituting "new" for "old" in the file names as necessary. Compare the old and new versions of the following (a diff sketch follows this list):
dfs-old-fsck-1.log versus dfs-new-fsck-1.log. The files should be identical unless the hadoop fsck reporting format has changed in the new version.
dfs-old-lsr-1.log versus dfs-new-lsr-1.log. The files should be identical unless the format of hadoop fs -lsr reporting or the data structures have changed in the new version.
dfs-old-report-1.log versus dfs-new-report-1.log. Make sure all DataNodes previously belonging to the cluster are up and running.
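For the comparison itself, plain diff is usually enough; a sketch, assuming the logs were written to a hypothetical /tmp/hdfs-upgrade-logs directory (substitute the directory you actually used in Preparing for the Upgrade):
cd /tmp/hdfs-upgrade-logs
diff dfs-old-fsck-1.log dfs-new-fsck-1.log
diff dfs-old-lsr-1.log dfs-new-lsr-1.log
diff dfs-old-report-1.log dfs-new-report-1.log
# No output from diff means the files are identical; review any differences against the notes above.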
Make the following config changes required for Application Timeline Server. Use the Ambari web UI to navigate to the service dashboard and add/modify the following configurations:
YARN (Custom yarn-site.xml)
yarn.timeline-service.leveldb-timeline-store.path=/var/log/hadoop-yarn/timeline
yarn.timeline-service.leveldb-timeline-store.ttl-interval-ms=300000
If you are upgrading to HDP 2.1.3, use the following setting: yarn.timeline-service.store-class=org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore
If you are upgrading to HDP 2.1.2, use the following setting: yarn.timeline-service.store-class=org.apache.hadoop.yarn.server.applicationhistoryservice.timeline.LeveldbTimelineStore
yarn.timeline-service.ttl-enable=true
yarn.timeline-service.ttl-ms=2678400000
yarn.timeline-service.generic-application-history.store-class=org.apache.hadoop.yarn.server.applicationhistoryservice.NullApplicationHistoryStore
yarn.timeline-service.webapp.address=<PUT_THE_FQDN_OF_ATS_HOST_NAME_HERE>:8188
yarn.timeline-service.webapp.https.address=<PUT_THE_FQDN_OF_ATS_HOST_NAME_HERE>:8190
yarn.timeline-service.address=<PUT_THE_FQDN_OF_ATS_HOST_NAME_HERE>:10200
HIVE (hive-site.xml)
hive.execution.engine=mr
hive.exec.failure.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook
hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook
hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook
hive.tez.container.size=<map-container-size> (if mapreduce.map.memory.mb is greater than 2 GB, set it equal to mapreduce.map.memory.mb; otherwise, set it equal to mapreduce.reduce.memory.mb)
hive.tez.java.opts="-server -Xmx" + Math.round(0.8 * map-container-size) + "m -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseParallelGC"
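As a worked example of the two Tez-related values, assume a hypothetical map container size of 4096 MB; Math.round(0.8 * 4096) is 3277, so the resulting settings would look like this:
hive.tez.container.size=4096
hive.tez.java.opts=-server -Xmx3277m -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseParallelGC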
Using Ambari Web, navigate to Services > Hive > Configs > Advanced and verify that the following properties are set to their default values:
Hive (Advanced)
hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
hive.security.metastore.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator
Note The Security Wizard enables Hive authorization. The default values for these properties changed in Hive 0.12. If you are upgrading Hive from 0.12 to 0.13 in a secure cluster, you should not need to change the values. If you are upgrading from a Hive version older than 0.12 to Hive 0.12 or greater in a secure cluster, you must correct the values.
If YARN is installed in your HDP 2.0 stack, and the Application Timeline Server (ATS) components are not, then you must create and install the ATS service and host components via the API by running the following commands on the server that will host the YARN Application Timeline Server in your cluster. Be sure to replace <your_ambari_server_host>, <your_cluster_name>, and <your_ATS_component_hostname> with host names appropriate for your environment.
Note Ambari does not currently support ATS in a kerberized cluster. If you are upgrading YARN in a kerberized cluster, skip this step.
Create the ATS Service Component.
curl --user admin:admin -H "X-Requested-By: ambari" -i -X POST http://<your_ambari_server_host>:8080/api/v1/clusters/<your_cluster_name>/services/YARN/components/APP_TIMELINE_SERVER
Create the ATS Host Component.
curl --user admin:admin -H "X-Requested-By: ambari" -i -X POST http://<your_ambari_server_host>:8080/api/v1/clusters/<your_cluster_name>/hosts/<your_ATS_component_hostname>/host_components/APP_TIMELINE_SERVER
Install the ATS Host Component.
curl --user admin:admin -H "X-Requested-By: ambari" -i -X PUT -d '{ "HostRoles": { "state": "INSTALLED"}}' http://<your_ambari_server_host>:8080/api/v1/clusters/<your_cluster_name>/hosts/<your_ATS_component_hostname>/host_components/APP_TIMELINE_SERVER
Note curl commands use the default username/password = admin/admin. To run the curl commands using non-default credentials, modify the --user option to use your Ambari administrator credentials. For example: --user <ambari_admin_username>:<ambari_admin_password>.
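To confirm that the host component reached the INSTALLED state before starting YARN, you can read it back with a GET against the same resource; a sketch, again assuming the default credentials:
curl --user admin:admin -H "X-Requested-By: ambari" -i -X GET http://<your_ambari_server_host>:8080/api/v1/clusters/<your_cluster_name>/hosts/<your_ATS_component_hostname>/host_components/APP_TIMELINE_SERVER
Look for a "state" of "INSTALLED" under HostRoles in the returned JSON.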
Using Ambari Web > Services > Service Actions, start YARN.
Using Ambari Web > Services > Service Actions, start MapReduce2.
Using Ambari Web > Services > Service Actions, start HBase and ensure the service check passes.
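Once YARN is up, a quick way to confirm that the Application Timeline Server is responding is to query its web address (port 8188 as configured above); this is a sketch, and the path shown is the timeline REST root in this Hadoop line — adjust if your build differs:
curl http://<PUT_THE_FQDN_OF_ATS_HOST_NAME_HERE>:8188/ws/v1/timeline
An HTTP 200 response indicates the timeline web service is up.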
Upgrade Oozie.
Note You must replace your Oozie configuration after upgrading.
Perform the following preparation steps on each oozie server host:
Copy files from the backup folder <oozie-conf-bak> to the /etc/oozie/conf directory:
cp <oozie-conf-bak>/oozie-site.xml /etc/oozie/conf
cp <oozie-conf-bak>/oozie-env.sh /etc/oozie/conf/oozie-env.sh
chmod -R 777 /etc/alternatives/oozie-tomcat-conf/conf
rm -rf /usr/lib/oozie/conf
ln -s /etc/oozie/conf /usr/lib/oozie/conf
Create the /usr/lib/oozie/libext-upgrade21 directory:
mkdir /usr/lib/oozie/libext-upgrade21
Copy the JDBC jar of your Oozie database to both /usr/lib/oozie/libext-upgrade21 and /usr/lib/oozie/libtools. For example, if you are using MySQL, copy your mysql-connector-java.jar.
Copy these files to the /usr/lib/oozie/libext-upgrade21 directory:
cp /usr/lib/hadoop/lib/hadoop-lzo*.jar /usr/lib/oozie/libext-upgrade21
cp /usr/share/HDP-oozie/ext-2.2.zip /usr/lib/oozie/libext-upgrade21
Grant read/write access to the Oozie user.
chmod -R 777 /usr/lib/oozie/libext-upgrade21
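As a quick sanity check before running the upgrade, you can list the directory and confirm that the JDBC driver, the hadoop-lzo jar, and ext-2.2.zip are all present:
ls -l /usr/lib/oozie/libext-upgrade21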
Upgrade steps:
On the Services view, make sure YARN and MapReduce2 are running.
Make sure that the Oozie service is stopped.
Upgrade Oozie. You must be the Oozie service user. On the Oozie server host:
sudo su -l <OOZIE_USER> -c "/usr/lib/oozie/bin/ooziedb.sh upgrade -run"
Make sure that the output contains the string "Oozie DB has been upgraded to Oozie version <OOZIE Build Version>".
Prepare the Oozie WAR file. Run as root:
Note The Oozie server must not be running for this step. If you get the message "ERROR: Stop Oozie first", the script still thinks the server is running. Check, and if needed, remove the process id (pid) file indicated in the output.
sudo su -l <OOZIE_USER> -c "/usr/lib/oozie/bin/oozie-setup.sh prepare-war -d /usr/lib/oozie/libext-upgrade21"
Make sure that the output contains the string "New Oozie WAR file added".
Using Ambari Web UI Services > Oozie > Configs, expand Advanced, then edit the following properties:
In oozie.service.coord.push.check.requeue.interval, replace the existing property value with the following one:
30000
In oozie.service.SchemaService.wf.ext.schemas, append (using copy/paste) the following string to the existing property value:
shell-action-0.2.xsd,oozie-sla-0.1.xsd,oozie-sla-0.2.xsd,hive-action-0.3.xsd
Note If you have customized schemas, append this string to your custom schema name string.
Do not overwrite custom schemas.
If you have no customized schemas, you can replace the existing string with the following one:
shell-action-0.1.xsd,email-action-0.1.xsd,hive-action-0.2.xsd,sqoop-action-0.2.xsd,ssh-action-0.1.xsd,distcp-action-0.1.xsd,shell-action-0.2.xsd,oozie-sla-0.1.xsd,oozie-sla-0.2.xsd,hive-action-0.3.xsd
In oozie.service.URIHandlerService.uri.handlers, append the following string to the existing property value:
org.apache.oozie.dependency.FSURIHandler,org.apache.oozie.dependency.HCatURIHandler
In oozie.services, append the following string to the existing property value:
org.apache.oozie.service.XLogStreamingService,org.apache.oozie.service.JobsConcurrencyService
Note If you have customized properties, append this string to your custom property value string.
Do not overwrite custom properties.
If you have no customized properties, you can replace the existing string with the following one:
org.apache.oozie.service.SchedulerService, org.apache.oozie.service.InstrumentationService, org.apache.oozie.service.CallableQueueService, org.apache.oozie.service.UUIDService, org.apache.oozie.service.ELService, org.apache.oozie.service.AuthorizationService, org.apache.oozie.service.UserGroupInformationService, org.apache.oozie.service.HadoopAccessorService, org.apache.oozie.service.URIHandlerService, org.apache.oozie.service.MemoryLocksService, org.apache.oozie.service.DagXLogInfoService, org.apache.oozie.service.SchemaService, org.apache.oozie.service.LiteWorkflowAppService, org.apache.oozie.service.JPAService, org.apache.oozie.service.StoreService, org.apache.oozie.service.CoordinatorStoreService, org.apache.oozie.service.SLAStoreService, org.apache.oozie.service.DBLiteWorkflowStoreService, org.apache.oozie.service.CallbackService, org.apache.oozie.service.ActionService, org.apache.oozie.service.ActionCheckerService, org.apache.oozie.service.RecoveryService, org.apache.oozie.service.PurgeService, org.apache.oozie.service.CoordinatorEngineService, org.apache.oozie.service.BundleEngineService, org.apache.oozie.service.DagEngineService, org.apache.oozie.service.CoordMaterializeTriggerService, org.apache.oozie.service.StatusTransitService, org.apache.oozie.service.PauseTransitService, org.apache.oozie.service.GroupsService, org.apache.oozie.service.ProxyUserService, org.apache.oozie.service.XLogStreamingService, org.apache.oozie.service.JobsConcurrencyService
In oozie.services.ext, append the following string to the existing property value:
org.apache.oozie.service.PartitionDependencyManagerService,org.apache.oozie.service.HCatAccessorService
After modifying all properties on the Oozie Configs page, scroll down, then choose Save to update oozie-site.xml with the modified configurations.
Replace the content of /user/oozie/share in HDFS. On the Oozie server host, extract the Oozie sharelib into a tmp folder:
mkdir -p /tmp/oozie_tmp
cp /usr/lib/oozie/oozie-sharelib.tar.gz /tmp/oozie_tmp
cd /tmp/oozie_tmp
tar xzvf oozie-sharelib.tar.gz
Back up the /user/oozie/share folder in HDFS and then delete it. If you have any custom files in this folder, back them up separately and then add them back after the share folder is updated.
mkdir /tmp/oozie_tmp/oozie_share_backup
chmod 777 /tmp/oozie_tmp/oozie_share_backup
su -l <HDFS_USER> -c "hdfs dfs -copyToLocal /user/oozie/share /tmp/oozie_tmp/oozie_share_backup"
su -l <HDFS_USER> -c "hdfs dfs -rm -r /user/oozie/share"
Add the latest share libs that you extracted in step 1. After you have added the files, modify ownership and acl:
su -l <HDFS_USER> -c "hdfs dfs -copyFromLocal /tmp/oozie_tmp/share /user/oozie/."
su -l <HDFS_USER> -c "hdfs dfs -chown -R <OOZIE_USER>:<HADOOP_GROUP> /user/oozie"
su -l <HDFS_USER> -c "hdfs dfs -chmod -R 755 /user/oozie"
Use the Services view to start the Oozie service. Make sure that the Service Check passes for Oozie.
Update WebHCat.
Modify the webhcat-site config type. Using the Ambari web UI, navigate to Services > WebHCat and modify the following configuration:
Table 2.3. WebHCat Properties to Modify
Action: Modify
Property Name: templeton.storage.class
Property Value: org.apache.hive.hcatalog.templeton.tool.ZooKeeperStorage
Update the Pig and Hive tar bundles, by updating the following files:
/apps/webhcat/pig.tar.gz
/apps/webhcat/hive.tar.gz
Note You will find these files on a host where webhcat is installed.
For example, to update a *.tar.gz file:
Move the file to a local directory.
su -l <HCAT_USER> -c "hadoop --config /etc/hadoop/conf fs -copyToLocal /apps/webhcat/*.tar.gz $<local_backup_dir>"
Remove the old file.
su -l <HCAT_USER> -c "hadoop --config /etc/hadoop/conf fs -rm /apps/webhcat/*.tar.gz"
Copy the new file.
su -l <HCAT_USER> -c "hdfs --config /etc/hadoop/conf dfs -copyFromLocal /usr/share/HDP-webhcat/*.tar.gz /apps/webhcat/"
On each WebHCat host, update the /apps/webhcat/hadoop-streaming.jar file.
Move the file to a local directory.
su -l <HCAT_USER> -c "hadoop --config /etc/hadoop/conf fs -copyToLocal /apps/webhcat/hadoop-streaming*.jar $<local_backup_dir>"
Remove the old file.
su -l <HCAT_USER> -c "hadoop --config /etc/hadoop/conf fs -rm /apps/webhcat/hadoop-streaming*.jar"
Copy the new hadoop-streaming.jar file.
su -l <HCAT_USER> -c "hdfs --config /etc/hadoop/conf dfs -copyFromLocal /usr/lib/hadoop-mapreduce/hadoop-streaming*.jar /apps/webhcat"
Upgrade Flume.
Make a backup copy of the current Flume configuration files, on each Flume host.
cd /etc/flume/conf
cp flume.conf <flume-conf-backup>/flume.conf
Note More than one Flume configuration file may exist. Make a backup copy of each one.
Execute the following commands on each Flume host:
For RHEL/CentOS/Oracle Linux:
yum upgrade flume
For SLES:
zypper update flume
zypper remove flume
zypper se -s flume
You should see Flume in the output. Then, install Flume.
zypper install flume
Important When removing and installing packages, rename any files in the /conf directory that have a .rpmsave extension back to their original names in order to retain your customized configurations. Alternatively, use the configuration files in the /conf directory that you backed up before upgrading.
Verify that Flume was upgraded correctly by starting a basic example. By default, Flume does not start running immediately after an installation or upgrade. To validate the Flume upgrade:
Replace your default conf/flume.conf with the following flume.conf file:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = seq

# Describe the sink
a1.sinks.k1.type = file_roll
a1.sinks.k1.channel = c1
a1.sinks.k1.sink.directory = /tmp/flume

# Use a channel which buffers events in memory
a1.channels.c1.type = memory

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Restart Flume, using Ambari Web.
To verify that data is flowing, examine /tmp/flume and confirm that files exist there; the files should contain simple, sequential numbers (a quick check is sketched below). Then stop Flume.
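A minimal way to do that check, using the /tmp/flume directory from the example configuration above:
ls -l /tmp/flume
head /tmp/flume/*
The head output should show incrementing sequence numbers produced by the seq source.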
Copy the backup Flume configuration files you created in Step 17.a back into their original location.
cd /etc/flume/conf
cp <flume-conf-backup>/flume.conf .
If you use Tez as the Hive execution engine, and if the variable hive.server2.enable.doAs is set to true, you must create a scratch directory on the NameNode host for the username that will run the HiveServer2 service. For example, use the following commands:
sudo su -c "hdfs dfs -mkdir /tmp/hive-<username>"
sudo su -c "hdfs dfs -chmod 777 /tmp/hive-<username>"
where <username> is the name of the user that runs the HiveServer2 service.
If you use Hue to manage your Stack, you must upgrade the Hue component manually. For specific upgrade steps, see the instructions to Upgrade Hue.
If you use Mahout, you must upgrade the Mahout component manually. For specific upgrade steps, see the instructions to Upgrade Mahout.
Using Ambari Web > Services, re-start the remaining services.
The upgrade is now fully functional but not yet finalized. Using the finalize command removes the previous version of the NameNode and DataNode storage directories.
Important After the upgrade is finalized, the system cannot be rolled back. Usually this step is not taken until a thorough testing of the upgrade has been performed.
The upgrade must be finalized before another upgrade can be performed.
Note Directories used by Hadoop 1 services set in /etc/hadoop/conf/taskcontroller.cfg are not automatically deleted after upgrade. Administrators can choose to delete these directories after the upgrade.
To finalize the upgrade, execute the following command once, on the primary NameNode host in your HDP cluster:
sudo su -l <HDFS_USER> -c "hdfs dfsadmin -finalizeUpgrade"
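If you want to confirm that finalization took effect, one simple check is to list the NameNode metadata directory (substitute the directory referenced earlier as $dfs.name.dir for the placeholder below); the "previous" directory created during the upgrade is removed once finalization completes, and DataNodes clean up their "previous" directories in the background:
ls <dfs.name.dir>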