This is the documentation for CDH 5.1.x. Documentation for other versions is available at Cloudera Documentation.

Upgrading HBase

  Note:

To see which version of HBase is shipping in CDH 5, check the Version and Packaging Information. For important information on new and changed components, see the CDH 5 Release Notes.

  Important:

Before you start, make sure you have read and understood the previous section, New Features and Changes for HBase in CDH 5, and check the Known Issues and Work Arounds in CDH 5 and Incompatible Changes for HBase.

Coprocessors and Custom JARs

When upgrading HBase from one major version to another (such as moving from CDH 4 to CDH 5), you must recompile coprocessors and custom JARs after the upgrade.

Upgrading HBase from CDH 4 to CDH 5

CDH 5.0 HBase is based on Apache HBase 0.96.1.1. Remember that once a cluster has been upgraded to CDH 5, it cannot be reverted to CDH 4. To ensure a smooth upgrade, this section guides you through the steps involved in upgrading HBase from the older CDH 4.x releases to CDH 5.

These instructions also apply to upgrading HBase from CDH 4.x directly to CDH 5.1.0, which is a supported path.

Prerequisites

HDFS and ZooKeeper should be available while upgrading HBase.

Overview of Upgrade Procedure

Before you can upgrade HBase from CDH 4 to CDH 5, your HFiles must be upgraded from the HFile v1 format to HFile v2, because CDH 5 no longer supports HFile v1. The upgrade procedure differs depending on whether you use Cloudera Manager or the command line, but the results are the same. The first step is to check the HFiles for instances of HFile v1 and mark them to be upgraded to HFile v2, and to check for and report corrupted files or files with unknown versions, which must be removed manually. The next step is to rewrite the HFiles during the next major compaction. After the HFiles are upgraded, you can continue the upgrade.

Upgrade HBase Using the Command Line

CDH 5 comes with an upgrade script for HBase. Run bin/hbase upgrade with no options to see usage information. The script runs in two modes: -check and -execute.

Step 1: Check for HFile v1 files and compact if necessary

  1. Run the upgrade command in -check mode, and examine the output.
    $ bin/hbase upgrade -check
    Your output should be similar to the following:
    Tables Processed:
    hdfs://localhost:41020/myHBase/.META.
    hdfs://localhost:41020/myHBase/usertable
    hdfs://localhost:41020/myHBase/TestTable
    hdfs://localhost:41020/myHBase/t
    
    Count of HFileV1: 2
    HFileV1:
    hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812/family/249450144068442524
    hdfs://localhost:41020/myHBase/usertable/ecdd3eaee2d2fcf8184ac025555bb2af/family/249450144068442512
    
    Count of corrupted files: 1
    Corrupted Files:
    hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812/family/1
    Count of Regions with HFileV1: 2
    Regions to Major Compact:
    hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812
    hdfs://localhost:41020/myHBase/usertable/ecdd3eaee2d2fcf8184ac025555bb2af
    In the example above, the script has detected two HFile v1 files, one corrupted file, and two regions that need major compaction.

    By default, the script scans the root directory, as defined by hbase.rootdir. To scan a specific directory, use the --dir option. For example, the following command scans the /myHBase/testTable directory.

    bin/hbase upgrade --check --dir /myHBase/testTable
  2. Trigger a major compaction on each of the reported regions. This major compaction rewrites the files from HFile v1 to HFile v2 format. To run the major compaction, start HBase Shell and issue the major_compact command.
    $ bin/hbase shell
    hbase> major_compact 'usertable'
    You can also do this in a single step by using the echo shell built-in command.
    $ echo "major_compact 'usertable'" | bin/hbase shell
  3. Once all the HFileV1 files have been rewritten, running the upgrade script with the -check option again will return a "No HFile v1 found" message. It is then safe to proceed with the upgrade.
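Because the -check report is plain text, the regions flagged for major compaction can be pulled out with standard tools. A minimal sketch, reusing the sample report from step 1 (the heredoc and the check.out filename stand in for output you would save from a real `bin/hbase upgrade -check` run):

```shell
# Extract the "Regions to Major Compact" section from a saved -check report
# and derive the table names to feed to the HBase shell's major_compact.
# The heredoc reuses the sample report above; on a real cluster, redirect
# `bin/hbase upgrade -check` into check.out instead.
cat > check.out <<'EOF'
Count of Regions with HFileV1: 2
Regions to Major Compact:
hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812
hdfs://localhost:41020/myHBase/usertable/ecdd3eaee2d2fcf8184ac025555bb2af
EOF

# Keep only the region paths after the "Regions to Major Compact:" header,
# strip the region id, and de-duplicate to get table names.
tables=$(awk '/^Regions to Major Compact:/{flag=1; next} flag && /^hdfs:/' check.out \
  | awk -F/ '{print $(NF-1)}' | sort -u)
echo "$tables"

# Each table would then be compacted non-interactively, as in step 2:
# for t in $tables; do echo "major_compact '$t'" | bin/hbase shell; done
```

The actual compaction commands are left commented out because they require a running cluster.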

Step 2: Gracefully shut down the CDH 4 HBase cluster

Shut down your CDH 4 HBase cluster before you run the upgrade script in -execute mode.

To shut down HBase gracefully:

  1. Stop the REST and Thrift servers and clients, then stop the cluster.
    1. Stop the Thrift server and clients:
      sudo service hbase-thrift stop
    2. Stop the REST server:
      sudo service hbase-rest stop
    3. Stop the cluster by shutting down the master and the region servers:
      1. Use the following command on the master node:
        sudo service hbase-master stop
      2. Use the following command on each node hosting a region server:
        sudo service hbase-regionserver stop
  2. Stop the ZooKeeper Server:
    $ sudo service zookeeper-server stop
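The shutdown order above can be sketched as a script. This is a hypothetical outline only: the hostnames, the ssh-based fan-out, and the DRY_RUN switch are all assumptions, not part of the documented procedure:

```shell
# Dry-run outline of the shutdown order. The hostnames (master01, rs01, rs02)
# and the ssh fan-out are hypothetical; with DRY_RUN unset, each command would
# run on the target node over ssh instead of being printed.
DRY_RUN=1
run() {  # usage: run <node> <command>
  if [ -n "$DRY_RUN" ]; then
    echo "$1: $2"
  else
    ssh "$1" "$2"
  fi
}

MASTER=master01
REGIONSERVERS="rs01 rs02"

run "$MASTER" "sudo service hbase-thrift stop"
run "$MASTER" "sudo service hbase-rest stop"
run "$MASTER" "sudo service hbase-master stop"
for rs in $REGIONSERVERS; do
  run "$rs" "sudo service hbase-regionserver stop"
done
run "$MASTER" "sudo service zookeeper-server stop"
```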

Step 3: Upgrade your CDH 4 cluster to CDH 5

  Important: Before you proceed to Step 4, you must upgrade your CDH 4 cluster to CDH 5. See Upgrading to CDH 5 for instructions.

Step 4: Run the HBase upgrade script in -execute mode

This step executes the actual upgrade process. It includes a verification step that checks whether the Master, RegionServer, and backup Master znodes have expired; if they have not, the upgrade is aborted. This ensures that no upgrade occurs while an HBase process is still running. If your upgrade is aborted even after you have shut down the HBase cluster, wait for the znodes to expire and retry. The default znode expiry time is 300 seconds.

As mentioned earlier, ZooKeeper and HDFS should be available. If ZooKeeper is managed by HBase, then use the following command to start ZooKeeper.

./hbase/bin/hbase-daemon.sh start zookeeper
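One way to confirm from the ZooKeeper side that the server znodes have expired is `hbase zkcli ls /hbase/rs`, which lists any RegionServer znodes still registered. A sketch of interpreting such a listing (the sample strings stand in for real zkcli output; nothing here contacts a cluster):

```shell
# zkcli's `ls` prints child znodes as a bracketed list; an empty list means no
# live RegionServer znodes remain. The sample strings below stand in for
# `bin/hbase zkcli ls /hbase/rs` output.
still_registered='[node1.example.com,60020,1402954151390]'
expired='[]'

znodes_remaining() {  # true (exit 0) if the listing still has children
  [ "$1" != "[]" ]
}

if znodes_remaining "$still_registered"; then
  echo "live znodes found: wait for expiry (default 300 seconds) before -execute"
fi
znodes_remaining "$expired" || echo "no live znodes: safe to run -execute"
```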

The upgrade involves three steps:

  • Upgrade Namespace: This step upgrades the directory layout of HBase files.
  • Upgrade Znodes: This step upgrades /hbase/replication (znodes corresponding to peers, log queues, and so on) and table znodes (which keep table enable/disable information). It deletes other znodes.
  • Log Splitting: If the shutdown was not clean, there may be Write-Ahead Logs (WALs) left to split. This step splits such WAL files. It runs in non-distributed mode, which can lengthen the upgrade if there are many logs to split. To expedite the upgrade, ensure a clean shutdown beforehand.
Run the upgrade command in -execute mode.
$ bin/hbase upgrade -execute

Your output should be similar to the following:

Starting Namespace upgrade
Created version file at hdfs://localhost:41020/myHBase with version=7
Migrating table testTable to hdfs://localhost:41020/myHBase/.data/default/testTable
…..
Created version file at hdfs://localhost:41020/myHBase with version=8
Successfully completed NameSpace upgrade.
Starting Znode upgrade
….
Successfully completed Znode upgrade
Starting Log splitting
…
Successfully completed Log splitting

The -execute command returns either a success message, as in the example above, or, after a clean shutdown in which no log splitting was required, a "No log directories to split, returning" message. Either message indicates that your upgrade was successful.

  Important: Configuration files
  • If you install a newer version of a package that is already on the system, configuration files that you have modified will remain intact.
  • If you uninstall a package, the package manager renames any configuration files you have modified from <file> to <file>.rpmsave. If you then re-install the package (probably to install a new version) the package manager creates a new <file> with applicable defaults. You are responsible for applying any changes captured in the original configuration file to the new configuration file. In the case of Ubuntu and Debian upgrades, you will be prompted if you have made changes to a file for which there is a new version; for details, see Automatic handling of configuration files by dpkg.

Step 5 (Optional): Move Tables to Namespaces

CDH 5 introduces namespaces for HBase tables. As a result of the upgrade, all tables are automatically assigned to namespaces. The root, meta, and acl tables are added to the hbase system namespace. All other tables are assigned to the default namespace.

To move a table to a different namespace, take a snapshot of the table and clone it to the new namespace. After the upgrade, do the snapshot and clone operations before turning the modified application back on.

  Warning: Do not move datafiles manually, as this can cause data corruption that requires manual intervention to fix.
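The snapshot-and-clone move can be scripted as an HBase shell command file. A sketch, assuming a hypothetical target namespace ns1 and the usertable table from the earlier examples; the file is only written here, not executed:

```shell
# Write the HBase shell commands that move 'usertable' into a hypothetical
# namespace 'ns1' by snapshotting and cloning. On a real cluster you would
# then run:  bin/hbase shell move_table.rb
cat > move_table.rb <<'EOF'
create_namespace 'ns1'
disable 'usertable'
snapshot 'usertable', 'usertable_snap'
clone_snapshot 'usertable_snap', 'ns1:usertable'
EOF
```

After verifying the clone, the original table and the snapshot can be removed from the HBase shell with drop and delete_snapshot.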

Step 6: Recompile coprocessors and custom JARs

Recompile any coprocessors and custom JARs, so that they will work with the new version of HBase.

FAQ

In order to prevent upgrade failures because of unexpired znodes, is there a way to check/force this before an upgrade?

The upgrade script executes the upgrade when it is run with the -execute option. As its first step, it checks for any live HBase processes (RegionServer, Master, and backup Master) by looking at their znodes. If any such znode is still up, it aborts the upgrade and prompts you to stop those processes and wait until their znodes have expired. This can be considered a built-in check.

The -check option has a different use case: checking for HFile v1 files. Run it on a live CDH 4 cluster to detect HFile v1 files and major-compact any regions that contain them.

What are the steps for Cloudera Manager to do the upgrade?

See Upgrading CDH in a Cloudera Manager Deployment in the Cloudera Manager documentation.

Upgrading HBase from an Earlier CDH 5 Release

  Important: Rolling upgrade is not supported between a CDH 5 Beta release and this CDH 5 GA release. Cloudera recommends using Cloudera Manager if you need to do rolling upgrades.

To upgrade HBase from an earlier CDH 5 release, proceed as follows.

The instructions that follow assume that you are upgrading HBase as part of an upgrade to the latest CDH 5 release, and have already performed the steps under Upgrading from a CDH 5 Beta Release to the Latest Version.

Step 1: Perform a Graceful Cluster Shutdown

  Note:

Upgrading via rolling restart is not supported.

To shut HBase down gracefully:

  1. Stop the Thrift server and clients, then stop the cluster.
    1. Stop the Thrift server and clients:
      sudo service hbase-thrift stop
    2. Stop the cluster by shutting down the master and the region servers:
      1. Use the following command on the master node:
        sudo service hbase-master stop
      2. Use the following command on each node hosting a region server:
        sudo service hbase-regionserver stop
  2. Stop the ZooKeeper Server:
    $ sudo service zookeeper-server stop

Step 2: Install the new version of HBase

  Note:

You may want to take this opportunity to upgrade ZooKeeper, but you do not have to upgrade ZooKeeper before upgrading HBase; the new version of HBase will run with the older version of ZooKeeper. For instructions on upgrading ZooKeeper, see Upgrading ZooKeeper from an Earlier CDH 5 Release.

It is a good idea to back up the /hbase znode before proceeding. By default, ZooKeeper stores its data in /var/lib/zookeeper.
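A simple way to take that backup is to archive the ZooKeeper data directory while ZooKeeper is stopped. A sketch, assuming the default /var/lib/zookeeper location (the function only runs tar; the example call is commented out):

```shell
# Back up the ZooKeeper data directory (which contains the /hbase znode data)
# before upgrading. /var/lib/zookeeper is the default dataDir; adjust if yours
# differs.
backup_zk() {  # usage: backup_zk <datadir> <archive>
  tar czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# On a real host, with ZooKeeper stopped:
# backup_zk /var/lib/zookeeper /root/zookeeper-backup.tar.gz
```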

To install the new version of HBase, follow directions in the next section, Installing HBase.

  Important: Configuration files
  • If you install a newer version of a package that is already on the system, configuration files that you have modified will remain intact.
  • If you uninstall a package, the package manager renames any configuration files you have modified from <file> to <file>.rpmsave. If you then re-install the package (probably to install a new version) the package manager creates a new <file> with applicable defaults. You are responsible for applying any changes captured in the original configuration file to the new configuration file. In the case of Ubuntu and Debian upgrades, you will be prompted if you have made changes to a file for which there is a new version; for details, see Automatic handling of configuration files by dpkg.
Page generated September 3, 2015.