Apache HBase Incompatible Changes
Compatibility Notes for CDH 5
This section contains information that is relevant for all releases within the CDH 5 family. See the sections below for information which pertains to specific releases within CDH 5. If you are upgrading through more than one version (for instance, from CDH 5.0 to CDH 5.2), read the sections for each version, as most of the information listed applies to the given version and newer releases.
General Notes
- Rolling upgrades from CDH 4 to CDH 5 are not possible because existing CDH 4 HBase clients cannot make requests to CDH 5 servers and CDH 5 HBase clients cannot make requests to CDH 4 servers. Replication between CDH 4 and CDH 5 is not currently supported. Exposed JMX metrics in CDH 4 have been refactored and some have been removed.
- The upgrade from CDH 4 HBase to CDH 5 HBase is irreversible and requires HBase to be shut down completely.
- As of CDH 4.2, the default split policy changed from ConstantSizeRegionSplitPolicy to IncreasingToUpperBoundRegionSplitPolicy (ITUBRSP). This affects upgrades from CDH 4.1 or earlier to CDH 5.
- FilterBase no longer implements Writable. This means that you do not need to implement readFields() and write() methods when writing your own custom filters. Instead, put this logic into the toByteArray and parseFrom methods. See this page for an example.
- The default number of retained cell versions is reduced from 3 to 1. To increase the number of versions, you can specify the VERSIONS option at table creation or by altering existing tables. Starting with CDH 5.2, you can specify a global default number of versions, which will be applied to all newly created tables where the number of versions is not otherwise specified, by setting hbase.column.max.version to the desired number of versions in hbase-site.xml.
- The set of exposed APIs has been solidified. If you are using APIs outside of the user API, we cannot guarantee compatibility with future minor versions.
- CDH 5 introduces a new layout for HBase build artifacts and requires POM changes if you use Maven, or JAR changes otherwise.
Previously, in CDH 4 you only needed to add a dependency for the HBase JAR:
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <optional>true</optional>
  </dependency>
Now, when building against CDH 5 you will need to add a dependency for the hbase-client JAR. The hbase module continues to exist as a convenient top-level wrapper for existing clients, and it pulls in all the sub-modules automatically. But it is only a simple wrapper, so its repository directory will carry no actual jars.
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>${hbase.version}</version>
  </dependency>
If your code uses the HBase minicluster, you can pull in the hbase-testing-util dependency:
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-testing-util</artifactId>
    <version>${cdh.hbase.version}</version>
  </dependency>
If you need to obtain all HBase JARs required to build a project, copy them from the CDH installation directory (typically /usr/lib/hbase for an RPM install, or /opt/cloudera/parcels/CDH/lib/hbase if you install using Parcels), or from the CDH 5 HBase tarballs. However, for building client applications, Cloudera recommends using build tools such as Maven, rather than manually referencing JARs.
- CDH 5 introduces support for addressing cells with an empty column qualifier (a string of 0 bytes in length), but not all edge services handle that scenario correctly. In some cases, attempting to address a cell at [rowkey, fam] results in interaction with the entire column family, rather than the empty column qualifier.
Users of the HBase Shell, MapReduce, REST, and Thrift must use family instead of family: (note the omitted ":") to interact with an entire column family, rather than an empty column qualifier. Including the ":" is interpreted as an interaction with the empty qualifier in the column family named family.
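For the global default number of cell versions described earlier in this list, a minimal hbase-site.xml sketch might look like the following. It assumes CDH 5.2 or later, and the value 3 is only an illustration (it matches the pre-CDH 5 default), not a recommendation.

```xml
<!-- hbase-site.xml fragment; the value shown is an illustrative assumption -->
<property>
  <name>hbase.column.max.version</name>
  <value>3</value>
</property>
```

As noted above, this default applies only to newly created tables that do not otherwise specify a number of versions.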
API Removals
- HBASE-7315/HBASE-7263 - Row lock user API has been removed.
- HBASE-6706 - Removed total order partitioner. This is related to HBase no longer being supported on Hadoop 0.20.x.
- Many of the default configurations from CDH 4 in hbase-default.xml have been changed to new values in CDH 5. See HBASE-8450 for a complete list of changes.
- HBASE-6553 - Removed Avro Gateway. This feature was less robust and not used as much as the Thrift gateways. It has been removed upstream.
- HBase provides a metrics framework based on JMX beans. Between HBase 0.94 and 0.96, the metrics framework underwent many changes. Some beans were added and removed, some metrics were moved from one bean to another, and some metrics were renamed or removed. Click here to download the CSV spreadsheet which provides a mapping.
- The HBase User API (Get, Put, Result, Scanner, and so on; see the Apache HBase API documentation) has evolved, and attempts have been made to keep HBase clients source-code compatible so that they recompile without source code modifications. This cannot be guaranteed, however, because the conversion to ProtoBufs removed some relatively obscure APIs. Rudimentary efforts have also been made to preserve recompile compatibility with advanced APIs such as Filters and Coprocessors. These advanced APIs are still evolving, and our guarantees for API compatibility are weaker here.
- As of 0.96, the User API has been marked, and every attempt will be made to maintain compatibility in future versions. A version of the javadoc that only contains the User API can be found here.
- Other changes to CDH 5 HBase that require the upgrade include:
- HBASE-8015: The HBase Namespaces feature has changed HBase’s HDFS file layout.
- HBASE-4451: Renamed ZooKeeper nodes.
- HBASE-3171: The .META. table in CDH 4 has been renamed to hbase:meta. Similarly, the ACL table has been renamed to hbase:acl. The -ROOT- table has been removed.
- HBASE-8352: HBase snapshots are now saved to the /<hbase>/.hbase-snapshot directory instead of the /.snapshot directory. This should be handled before upgrading HDFS.
- HBASE-7660: Removed support for HFile v1. All internal HBase files in the HFile v1 format must be converted to the HFile v2 format.
- HBASE-6170/HBASE-8909 - The hbase.regionserver.lease.period configuration parameter has been deprecated. Use hbase.client.scanner.timeout.period instead.
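As a sketch of the replacement for the deprecated hbase.regionserver.lease.period parameter, the scanner timeout can be set in hbase-site.xml as follows. The value is in milliseconds, and 60000 is an assumed example rather than a tuned recommendation.

```xml
<!-- hbase-site.xml fragment; replaces the deprecated hbase.regionserver.lease.period -->
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <!-- Timeout in milliseconds; 60000 is an illustrative assumption -->
  <value>60000</value>
</property>
```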
Compatibility Notes for CDH 5.1
General Notes
- HBASE-8218 changes AggregationClient by replacing the byte[] tablename parameters with HTable table. This means that coprocessors compiled against CDH 5.0.x won't run or compile in CDH 5.1 and later.
- In CDH 5.1 and later, delete* methods of the Delete class of the HBase Client API use the timestamp from the constructor, the same behavior as the Put class. (In previous versions, the delete* methods ignored the constructor's timestamp, and used the value of HConstants.LATEST_TIMESTAMP. This behavior was different from the behavior of the add() methods of the Put class.) See HBASE-10964.
- In CDH 5.1 and newer, HBase introduces a new snapshot format (HBASE-7987). A snapshot created in HBase 0.98 cannot be read by HBase 0.96. HBase 0.98 can read snapshots produced in previous versions of HBase, and no conversion is necessary.
- In CDH 5.1, the default value for hbase.security.access.early_out was changed from true to false. A setting of true means that if a user is not granted access to a column family qualifier, the AccessController immediately throws an AccessDeniedException. This behavior change was reverted for CDH 5.2.
- HTablePool is no longer supported in CDH 5.1 and later. The HConnection object is the replacement. You create the connection once and pass it around, as with the old table pool.
  HConnection connection = HConnectionManager.createConnection(config);
  HTableInterface table = connection.getTable(tableName);
  table.put(put);
  table.close();
  connection.close();
You can set the hbase.hconnection.threads.max property in hbase-site.xml to control the pool size, or you can pass an ExecutorService to HConnectionManager.createConnection():
  ExecutorService pool = ...;
  HConnection connection = HConnectionManager.createConnection(conf, pool);
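The two configuration properties named in this section can both be set in hbase-site.xml. The fragment below is a sketch only: the thread count of 256 is an assumed example value, and setting hbase.security.access.early_out to true is shown only to illustrate how a CDH 5.1 deployment could restore the earlier behavior explicitly.

```xml
<!-- hbase-site.xml fragment; both values are illustrative assumptions -->
<property>
  <name>hbase.hconnection.threads.max</name>
  <!-- Upper bound on the connection's thread pool; pick a size suited to your workload -->
  <value>256</value>
</property>
<property>
  <name>hbase.security.access.early_out</name>
  <!-- true: AccessController throws AccessDeniedException immediately when access is not granted -->
  <value>true</value>
</property>
```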
Compatibility Notes for CDH 5 Beta Releases
CDH 5 Beta 1 and Beta 2 are not intended for production use, and have been superseded by official releases in the CDH 5 family.
The HBase client from CDH 5 Beta 1 is not wire compatible with CDH 5 Beta 2 because of changes introduced in HBASE-9612. As a consequence, CDH 5 Beta 1 users will not be able to execute a rolling upgrade to CDH 5 Beta 2 (or later). This patch unifies the way the HBase clients make requests and simplifies the internals, but breaks wire compatibility. Developers may need to recompile applications built upon the CDH 5 Beta 1 API.
As of CDH 5 Beta 1 (HBase 0.95), the value of hbase.regionserver.checksum.verify defaults to true; in earlier releases the default is false. For more information, see Checksums in the HBase section of the CDH 5 Installation Guide.
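If you prefer to state the checksum setting explicitly rather than rely on the release default, a minimal hbase-site.xml sketch could look like this. The value shown simply mirrors the default described above.

```xml
<!-- hbase-site.xml fragment; true matches the default as of CDH 5 Beta 1 -->
<property>
  <name>hbase.regionserver.checksum.verify</name>
  <value>true</value>
</property>
```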
Compatibility between CDH Beta and Apache HBase Releases
- Apache HBase 0.95.2 is not wire compatible with CDH 5 Beta 1 HBase 0.95.2.
- Apache HBase 0.96.x should be wire compatible with CDH 5 Beta 2 HBase 0.96.1.1.