Apache HBase Incompatible Changes and Limitations

Compatibility Notes for CDH 5

This section contains information that is relevant to all releases within the CDH 5 family. See the sections below for information that pertains to specific releases within CDH 5. If you are upgrading through more than one version (for instance, from CDH 5.0 to CDH 5.2), read the sections for each version, because most of the information listed applies to the given version and to newer releases.

General Notes

  • FilterBase no longer implements Writable. This means that you do not need to implement the readFields() and write() methods when writing your own custom filters. Instead, put this logic into the toByteArray and parseFrom methods. See this page for an example.
  • The default number of retained cell versions is reduced from 3 to 1. To increase the number of versions, you can specify the VERSIONS option at table creation or by altering existing tables. Starting with CDH 5.2, you can specify a global default number of versions, which will be applied to all newly created tables where the number of versions is not otherwise specified, by setting hbase.column.max.version to the desired number of versions in hbase-site.xml.
  • In CDH 5 releases prior to 5.1.3, a Put submitted with a KeyValue of type KeyValue.Type.Delete does not delete the cell. Starting with CDH 5.1.3, such a Put does delete the cell. This fix is provided in HBASE-11788.
  • Cloudera does not provide support for user-provided custom coprocessors of any kind.
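
The FilterBase note above moves serialization out of readFields() and write() and into toByteArray and parseFrom. The following is a minimal sketch of that round trip, not the HBase API itself. PrefixLimitState is a hypothetical standalone class (it does not extend FilterBase, so it compiles without HBase on the classpath), and the byte layout is an arbitrary choice made for illustration. In a real custom filter, the same logic would live in the filter's own toByteArray and parseFrom methods.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the state carried by a custom filter.
final class PrefixLimitState {
    final String prefix;
    final int limit;

    PrefixLimitState(String prefix, int limit) {
        this.prefix = prefix;
        this.limit = limit;
    }

    // Takes the place of Writable.write(): encode all state into one byte array.
    // Layout (an assumption of this sketch): 4-byte limit, 4-byte prefix length, prefix bytes.
    byte[] toByteArray() {
        byte[] p = prefix.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(8 + p.length)
                .putInt(limit)
                .putInt(p.length)
                .put(p)
                .array();
    }

    // Takes the place of Writable.readFields(): rebuild the object from those bytes.
    static PrefixLimitState parseFrom(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int limit = buf.getInt();
        int len = buf.getInt();
        if (len < 0 || len > buf.remaining()) {
            throw new IllegalArgumentException("corrupt filter state");
        }
        byte[] p = new byte[len];
        buf.get(p);
        return new PrefixLimitState(new String(p, StandardCharsets.UTF_8), limit);
    }
}
```

Calling PrefixLimitState.parseFrom(state.toByteArray()) yields an object with the same prefix and limit as the original, which is the property a filter needs when it is shipped from the client to the RegionServer.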

Operator API Changes

  • HBASE-6553 - Removed Avro Gateway. This feature was less robust and not used as much as the Thrift gateways. It has been removed upstream.
  • HBase provides a metrics framework based on JMX beans. Between HBase 0.94 and 0.96, the metrics framework underwent many changes. Some beans were added and removed, some metrics were moved from one bean to another, and some metrics were renamed or removed. A CSV spreadsheet that provides a mapping of these changes is available for download.

User API Changes

  • The HBase User API (Get, Put, Result, Scanner, etc.; see the Apache HBase API documentation) has evolved, and attempts have been made to keep HBase clients source-code compatible so that they recompile without any source code modifications. This cannot be guaranteed, however, because some relatively obscure APIs were removed during the conversion to ProtoBufs. Rudimentary efforts have also been made to preserve recompile compatibility for advanced APIs such as Filters and Coprocessors. These advanced APIs are still evolving, and the compatibility guarantees for them are weaker.
  • As of 0.96, the User API has been marked as such, and every effort will be made to keep it compatible in future versions. A version of the Javadoc that contains only the User API is also available.
  • Other changes to CDH 5 HBase that require the upgrade include:
    • HBASE-8015: The HBase Namespaces feature has changed HBase HDFS file layout.
    • HBASE-4451: Renamed ZooKeeper nodes.
    • HBASE-8352: HBase snapshots are now saved to the /<hbase>/.hbase-snapshot directory instead of the /.snapshot directory. This should be handled before upgrading HDFS.
    • HBASE-7660: Removed support for HFile v1. All internal HBase files in the HFile v1 format must be converted to the HFile v2 format.
    • HBASE-6170/HBASE-8909: The hbase.regionserver.lease.period configuration parameter has been deprecated. Use hbase.client.scanner.timeout.period instead.
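
To illustrate the property rename in the last item, the following sketch scans the text of an hbase-site.xml file for the deprecated hbase.regionserver.lease.period property so that it can be replaced with hbase.client.scanner.timeout.period. The class and method names are hypothetical, and the check is only an example of how a configuration might be audited before an upgrade; it is not a Cloudera or HBase tool.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper: reports whether a configuration still uses the old property name.
final class LeasePeriodCheck {
    static final String DEPRECATED = "hbase.regionserver.lease.period";
    static final String REPLACEMENT = "hbase.client.scanner.timeout.period";

    // Returns true if the given hbase-site.xml content contains a <name> element
    // whose text is the deprecated property name.
    static boolean usesDeprecatedLeasePeriod(String siteXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(siteXml.getBytes(StandardCharsets.UTF_8)));
            NodeList names = doc.getElementsByTagName("name");
            for (int i = 0; i < names.getLength(); i++) {
                if (DEPRECATED.equals(names.item(i).getTextContent().trim())) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            // Wrap parser errors so callers do not have to handle checked exceptions.
            throw new IllegalArgumentException("could not parse hbase-site.xml content", e);
        }
    }
}
```

If the method returns true, rename the property in hbase-site.xml to the value of LeasePeriodCheck.REPLACEMENT and keep the configured timeout value.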

Compatibility Notes for CDH 5.15

New methods have been added to the Admin interface for querying maintenance status within the cluster around splits and merges. Downstream users who implement their own version of this interface will need to update their source code to account for the new methods. Downstream users who merely make use of Admin instances returned from the HBase API are not impacted.

Compatibility Notes for CDH 5.13

As a part of providing independent timeout tuning for reads and writes in HBASE-15866, the Table interface has added getter and setter methods. Downstream users who implement their own version of this interface will need to update their source code to include these methods. Downstream users who merely make use of Table instances returned from the HBase API are not impacted.

Compatibility Notes for CDH 5.12

There is a new getTime metric, *_region_*_metric_getTime, that shows the time spent in gets, in milliseconds.

Compatibility Notes for CDH 5.9

  • The default RPC scheduler has been changed from 'deadline' to 'fifo'. To reenable 'deadline', set hbase.ipc.server.callqueue.type to deadline in the hbase-site.xml file.
  • Due to licensing issues, Apache HBase no longer includes its prior XSS defense or an encoding for filters. Additionally, several dependencies have been removed. Downstream users who relied on transitive inclusion of any of the following must now declare the appropriate dependency directly: jsr305 (from the FindBugs project), Apache Commons Fileupload, nekohtml, beanshell core, Apache xml graphics, OWASP antisamy, OWASP esapi, Xalan, Apache Xerces, and Xom.

Compatibility Notes for CDH 5.8

  • HBase now ensures the jsr305 implementation from the findbugs project is not included in its binary artifacts or the compile / runtime dependencies of its user facing modules. Downstream users that rely on this jar will need to update their dependencies.
  • HBase no longer includes Xerces implementation jars that were previously included via transitive dependencies. Downstream users relying on HBase for these artifacts will need to update their dependencies.
  • This release reverts fixes designed to prevent malicious content from rendering in HBase's UIs. These fixes shipped in HBase 1.1.4+ and 1.2.0+ and were removed because of licensing issues discovered in the dependencies they introduced. Both the implementation and those dependencies have been removed from HBase. Removing these dependencies goes against the strict definition of the version compatibility guidelines; however, including code under licenses that are not Apache-approved cannot be tolerated. Reimplementation of these fixes by Apache-appropriate means is tracked in HBASE-16328.

Compatibility Notes for CDH 5.7

  • Cloudera recommends not using the new advanced configuration option hbase.regionserver.hostname, added in HBase 1.2 (CDH 5.7.0), which allows you to specify a separate external-facing hostname for a RegionServer.

Compatibility Notes for CDH 5.4

  • The ports used by Apache HBase 1.0 changed from the 600XX range to the 160XX range. HBase in CDH reverted the change, and continues to use the 600XX port range, to maintain compatibility.
  • If you used visibility labels prior to CDH 5.4 and assigned superuser privileges to HBase users by adding the system label to their set of labels, these users will no longer be superusers in CDH 5.4. To be sure that cached credentials are cleared, use the HBase Shell command clear_auths <username>, for each affected user. To grant users superuser privileges, add them to the HBase Superusers group in Cloudera Manager, or add them to the hbase.superuser property in hbase-site.xml, and restart the HMaster.
  • HTrace is experimental in CDH 5.4.0. Artifacts and package names cannot be relied upon.
  • Jersey was updated from 1.8 to 1.9. This has the following implications.
    • The Jersey version is now consistent with Apache HBase and other CDH components.
    • If your project relies upon jersey-server, you may need to make modifications.
  • Curator in Hadoop was updated from 2.6.0 to 2.7.1. This has the following implications for HBase.
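    • PathUtils.validatePath(String) changed return types. Because the JVM links method calls by their full signature, including the return type, code compiled against the older version fails at runtime (typically with a NoSuchMethodError).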
    • The SharedCountReader and SharedValueReader interfaces each added a method, which will cause compilation errors for code made to use the old version.
  • commons-codec was upgraded from 1.7 to 1.9. This has the following implications for HBase.
    • The class org.apache.commons.codec.net.QuotedPrintableCodec has a constructor that throws additional exceptions. See the API reference for details.
  • commons-logging was updated from version 1.1.1 to 1.2. This has the following implications for HBase.
    • org.apache.commons.logging.LogSource.setLogImplementation(String) no longer throws ExceptionInInitializerError, which may change behavior of code that expects it.
  • CDH reverted API changes in HBase 1.0 which broke compatibility with HBase in CDH 5.0, 5.1, 5.2, and 5.3. If you have written applications using Apache HBase 1.0 APIs, you may need to modify these applications to run in CDH 5.4.

Differences between CDH 5.4 HBase 1.0 and Apache HBase 1.0:

  • CDH 5.4.0 keeps commons-math at version 2.1 to maintain compatibility with earlier CDH releases, whereas Apache HBase 1.0 uses commons-math 2.2.
  • CDH 5.4.0 keeps Netty at version 3 to maintain compatibility with earlier CDH releases, whereas Apache HBase 1.0 uses Netty 4.

Compatibility Notes for CDH 5.3

  • The Put class no longer implements Writable. Code that relied on this, such as a MapReduce context definition, can instead change the definition to org.apache.hadoop.mapreduce.TaskInputOutputContext<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.client.Result,org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.client.Put> if you emit only Puts, or to org.apache.hadoop.mapreduce.TaskInputOutputContext<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.client.Result,org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.client.Mutation> if you emit a mix of Puts and Deletes.

Compatibility Notes for CDH 5.2

  • In HBase in CDH 5.1, the default value for hbase.security.access.early_out was set to false. In CDH 5.2, the default value has been changed to true. When set to true, if a user is not granted access to a column family qualifier, the AccessController immediately throws an AccessDeniedException. This change to the default behavior will affect users who enabled HFile version 3 and the AccessController coprocessor in CDH 5.1, and then upgrade to CDH 5.2. In this case, if you prefer hbase.security.access.early_out to be disabled, explicitly set it to false in hbase-site.xml.
  • Starting with CDH 5.2, you can specify a global default number of versions, which will be applied to all newly created tables where the number of versions is not otherwise specified, by setting hbase.column.max.version to the desired number of versions in hbase-site.xml.
  • HBase in CDH 5.2 differs from Apache HBase 0.98.6 in that CDH does not include HBASE-11546, which provides ZooKeeper-less region assignment. CDH omits this feature because it is an incompatible change that prevents an upgraded cluster from being rolled back to a previous version.

Developer Interface Changes

  • HBase 0.98.5 removed ClientSmallScanner from the public API. HBase in CDH 5.2 restores the constructor to maintain backward compatibility, but in future releases of HBase, this class will no longer be public. You should change your code to use the Scan.setSmall(true) method instead.

Compatibility Notes for CDH 5.1

General Notes

  • HBASE-8218 changes AggregationClient by replacing the byte[] tablename parameters with HTable table. This means that coprocessors compiled against CDH 5.0.x won't run or compile in CDH 5.1 and later.
  • In CDH 5.1 and later, delete* methods of the Delete class of the HBase Client API use the timestamp from the constructor, the same behavior as the Put class. (In previous versions, the delete* methods ignored the constructor's timestamp, and used the value of HConstants.LATEST_TIMESTAMP. This behavior was different from the behavior of the add() methods of the Put class.) See HBASE-10964.
  • In CDH 5.1 and newer, HBase introduces a new snapshot format (HBASE-7987). A snapshot created in HBase 0.98 cannot be read by HBase 0.96. HBase 0.98 can read snapshots produced in previous versions of HBase, and no conversion is necessary.
  • In CDH 5.1, the default value for hbase.security.access.early_out was changed from true to false. A setting of true means that if a user is not granted access to a column family qualifier, the AccessController immediately throws an AccessDeniedException. This behavior change was reverted for CDH 5.2.

Developer Interface Changes

  • HTablePool is no longer supported in CDH 5.1 and later. The HConnection object is the replacement. You create the connection once and pass it around, as with the old table pool.
    HConnection connection = HConnectionManager.createConnection(config);
    HTableInterface table = connection.getTable(tableName);
    table.put(put);
    table.close();
    connection.close();
    You can set the hbase.hconnection.threads.max property in hbase-site.xml to control the pool size or you can pass an ExecutorService to HConnectionManager.createConnection().
    ExecutorService pool = ...;
    HConnection connection = HConnectionManager.createConnection(conf, pool);
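
The pool in the second snippet above is left unspecified. As one possible way to build it, the sketch below creates a bounded ExecutorService using only the standard java.util.concurrent classes. The helper name, the thread-name prefix, and the 60-second idle timeout are assumptions made for illustration, not HBase defaults.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for building the pool handed to the connection.
final class ConnectionPools {
    // Builds a pool with at most maxThreads workers; idle workers exit after 60 seconds.
    static ThreadPoolExecutor newBoundedPool(int maxThreads) {
        final AtomicInteger counter = new AtomicInteger();
        ThreadFactory daemonThreads = new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r, "hconnection-pool-" + counter.incrementAndGet());
                t.setDaemon(true); // idle workers should not keep the JVM alive
                return t;
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads,      // fixed upper bound on worker threads
                60L, TimeUnit.SECONDS,       // idle timeout
                new LinkedBlockingQueue<Runnable>(),
                daemonThreads);
        pool.allowCoreThreadTimeOut(true);   // let idle core threads terminate as well
        return pool;
    }
}
```

The result of ConnectionPools.newBoundedPool(16), for example, can be passed as the second argument to HConnectionManager.createConnection(). Call shutdown() on the pool when it is no longer needed.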

Compatibility Notes for CDH 5 Beta Releases

The HBase client from CDH 5 Beta 1 is not wire compatible with CDH 5 Beta 2 because of changes introduced in HBASE-9612. That patch unifies the way HBase clients make requests and simplifies the internals, but it breaks wire compatibility. As a consequence, CDH 5 Beta 1 users cannot perform a rolling upgrade to CDH 5 Beta 2 (or later), and developers may need to recompile applications built upon the CDH 5 Beta 1 API.

As of CDH 5 Beta 1 (HBase 0.95), the value of hbase.regionserver.checksum.verify defaults to true; in earlier releases the default is false.

Compatibility between CDH Beta and Apache HBase Releases

  • Apache HBase 0.95.2 is not wire compatible with CDH 5 Beta 1 HBase 0.95.2.
  • Apache HBase 0.96.x should be wire compatible with CDH 5 Beta 2 HBase 0.96.1.1.