Upgrading HDP Manually

Start Hadoop Core

Warning

Before you start HDFS on a highly available HDFS cluster, you must start the ZooKeeper service. If you do not also start the ZKFailoverController (ZKFC), failures can occur.

To start HDFS, run commands as the $HDFS_USER.

Note

The su commands in this section use keywords to represent the Service user. For example, "hdfs" is used to represent the HDFS Service user. If you are using another name for your Service users, you will need to substitute your Service user name in each of the su commands.

  1. Replace your configuration after upgrading on all the HDFS nodes. Replace the HDFS template configuration in /etc/hdfs/conf.

  2. If you are upgrading from a highly available HDFS cluster configuration, start all JournalNodes. On each JournalNode host, run the following command:

    su - hdfs -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start journalnode"

    Important

    All JournalNodes must be running when performing the upgrade, rollback, or finalization operations. If any JournalNodes are down when running any such operation, the operation fails.

  3. If you are running HDFS with a highly available NameNode, make sure the ZooKeeper service is running, then start the ZKFailoverController (ZKFC).

    Note

    Perform this step only if you are on a highly available HDFS cluster.

    Run the following command on all NameNode hosts:

    su - hdfs -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start zkfc"

  4. Start the NameNode.

    Because the file system version has now changed, you must start the NameNode manually.

    On the active NameNode host, run the following command:

    su - hdfs -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start namenode -upgrade"

    On a large system, this can take a long time to complete.

    Note

    Run this command with the -upgrade option only once. After you have completed this step, you can bring up the NameNode using this command without including the -upgrade option.

    Note

    If you receive the following error after an upgrade to HDP-2.3.6:

    Failed to find Premain-Class manifest attribute in
          /usr/hdp/<HDP-version>/hadoop/lib/ranger-hdfs-plugin-shim-0.5.0.***.jar
    Error occurred during initialization of VM
    agent library failed to init: instrument

    remove set-hdfs-plugin-env.sh from the /usr/hdp/<HDP-version>/hadoop/conf/ directory.

    To check whether the upgrade is in progress, verify that a "previous" directory has been created in the NameNode and JournalNode directories. The "previous" directory contains a snapshot of the data before the upgrade.

    In a highly available HDFS cluster configuration, this NameNode will not enter the standby state as usual. Rather, this NameNode will immediately enter the active state, perform an upgrade of its local storage directories, and also perform an upgrade of the shared edit log. At this point, the standby NameNode in the HA pair is still down. It will be out of sync with the upgraded active NameNode.

    To synchronize the active and standby NameNodes and re-establish HA, re-bootstrap the standby NameNode by running the NameNode with the '-bootstrapStandby' flag. Do NOT start the standby NameNode with the '-upgrade' flag.

    su - hdfs -c "hdfs namenode -bootstrapStandby -force"

    The bootstrapStandby command will download the most recent fsimage from the active NameNode into the $dfs.name.dir directory of the standby NameNode. You can enter that directory to make sure the fsimage has been successfully downloaded. After verifying, start the ZKFailoverController, then start the standby NameNode. You can check the status of both NameNodes using the Web UI.
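
The check for the "previous" directory described in this step can be scripted. The following is a minimal sketch, not part of the HDP tooling: the function name check_previous_dirs is hypothetical, and it assumes you pass it a comma-separated list of storage directories (for example, the value of the NameNode storage directory property in your configuration), optionally prefixed with file://.

```shell
# Hypothetical helper: report which storage directories contain a "previous"
# snapshot directory. Takes one comma-separated list of directories.
# Returns 0 only if every listed directory has a "previous" subdirectory.
check_previous_dirs() {
  local dirs="$1"
  local dir
  local missing=0
  local IFS=','                 # split the list on commas
  for dir in $dirs; do
    dir="${dir#file://}"        # strip an optional file:// scheme
    if [ -d "$dir/previous" ]; then
      echo "found: $dir/previous"
    else
      echo "missing: $dir/previous"
      missing=1
    fi
  done
  return "$missing"
}

# Example use on a NameNode host (assumes hdfs is on the PATH and that
# dfs.namenode.name.dir is the property your configuration uses):
# check_previous_dirs "$(hdfs getconf -confKey dfs.namenode.name.dir)"
```

A non-zero return value means at least one listed directory has no "previous" snapshot, which is worth investigating before you continue.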

  5. Verify that the NameNode is up and running:

    ps -ef|grep -i NameNode

  6. If you do not have a highly available HDFS cluster configuration (non-HA NameNode), start the Secondary NameNode.

    Note

    Do not perform this step if you have a highly available HDFS cluster configuration.

    On the Secondary NameNode host machine, run the following command:

    su - hdfs -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start secondarynamenode"

  7. Verify that the Secondary NameNode is up and running.

    Note

    Do not perform this step if you have a highly available HDFS cluster environment.

    ps -ef|grep SecondaryNameNode

  8. Start DataNodes.

    On each of the DataNodes, enter the following command. Note: If you are working on a non-secure DataNode, use $HDFS_USER. For a secure DataNode, use root.

    su - hdfs -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start datanode"

  9. Verify that the DataNode process is up and running:

    ps -ef|grep DataNode
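
The ps checks in steps 5, 7, and 9 can also match the grep command itself, and a pattern such as NameNode matches the SecondaryNameNode process as well. The following is a minimal sketch of a stricter filter. The function name daemon_listed is hypothetical, and the sketch assumes the daemon's Java class name appears as a separate word in the process listing, so adjust the pattern to what ps shows on your hosts.

```shell
# Hypothetical helper: succeed if the process listing read from standard
# input contains the given daemon class name as a whole word.
daemon_listed() {
  # -w matches whole words, so "NameNode" does not match "SecondaryNameNode";
  # the second grep drops any grep command lines from the listing.
  grep -w "$1" | grep -v grep >/dev/null
}

# Example use on a DataNode host:
# ps -ef | daemon_listed DataNode && echo "DataNode is running"
```

Because the function reads the listing from standard input, you can also run it against saved ps output when troubleshooting.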

  10. Verify that NameNode can go out of safe mode.

    su - hdfs -c "hdfs dfsadmin -safemode wait"

    You should see the following result: Safe mode is OFF

    In general, it takes 5-10 minutes to leave safe mode. On clusters with thousands of nodes and millions of data blocks, leaving safe mode can take up to 45 minutes.
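
The -safemode wait command blocks until the NameNode leaves safe mode. If you prefer a bounded wait, the following minimal sketch polls the safe mode status until a timeout expires. The function name wait_for_safemode_off and its default values are hypothetical. The sketch assumes the hdfs command is on the PATH, that it runs as the HDFS Service user, and that the status output contains the text "Safe mode is OFF" as shown in step 10.

```shell
# Hypothetical helper: poll the NameNode until safe mode is off or a timeout
# expires. Arguments: timeout in seconds (default 2700, that is 45 minutes)
# and polling interval in seconds (default 30).
wait_for_safemode_off() {
  local timeout="${1:-2700}"
  local interval="${2:-30}"
  local waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # "hdfs dfsadmin -safemode get" prints the current safe mode status.
    if hdfs dfsadmin -safemode get 2>/dev/null | grep -q "Safe mode is OFF"; then
      echo "Safe mode is OFF after ${waited}s"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "Safe mode still ON after ${timeout}s" >&2
  return 1
}

# Example: wait up to 45 minutes, checking every 30 seconds.
# wait_for_safemode_off 2700 30
```

A non-zero return value means the NameNode was still in safe mode when the timeout expired, so check the NameNode log and the DataNode status before proceeding.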