Using Apache HBase to store and access data

What's New in Apache HBase

HBase in Hortonworks Data Platform (HDP) 3.0 includes the following new features:

  • Procedure V2

    Procedure V2 (procv2) is an updated framework for executing multi-step HBase administrative operations so that they can recover when a failure occurs. The goal is to implement all master operations using procv2, which removes the need for tools like hbck in the future. Procv2 is used for creating, modifying, and deleting tables. Other systems, such as the new AssignmentManager, are also implemented using procv2.

  • Fully off-heap read/write path

    When you write data into HBase through a Put operation, the cell objects do not enter the JVM heap until the data is flushed to disk in an HFile. This reduces the total heap usage of a RegionServer and copies less data, which makes the write path more efficient.

  • Use of Netty for RPC layer and Async API

    The old Java NIO RPC server is replaced with a Netty RPC server. Netty also makes it easier to provide an asynchronous Java client API.

  • In-memory compactions

    Periodically reorganizing the data in the MemStore can reduce overall I/O, that is, the data written to and read from HDFS. Net performance improves because more data is kept in memory for a longer period of time.

  • Better dependency management

    HBase now internally shades commonly incompatible dependencies to prevent issues for downstream users. You can use the shaded client JARs to reduce the burden on existing applications.

  • Coprocessor and Observer API rewrite

    Minor changes were made to the API to remove ambiguous, misleading, and dangerous calls.

  • Backup/restore

    You can use the built-in tooling in HBase to create full and incremental backups of the HBase data.
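
To make the in-memory compactions item above more concrete, the following is a minimal toy sketch in plain Java. It is not HBase code: the class InMemoryCompactionSketch, the Version type, and the "row/family:qualifier" string keys are all hypothetical names chosen for illustration, and the sketch assumes that only the newest version of each cell needs to be kept. It shows why merging in-memory segments before a flush can reduce the amount of data written to disk.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: these types are hypothetical and are NOT HBase classes.
final class InMemoryCompactionSketch {

    // One version of a cell: a write timestamp plus the stored value.
    static final class Version {
        final long timestamp;
        final String value;

        Version(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Merges several sorted in-memory segments into one segment,
    // keeping only the newest version of each cell key (an assumption
    // of this sketch; a real store may retain several versions).
    static TreeMap<String, Version> compact(List<TreeMap<String, Version>> segments) {
        TreeMap<String, Version> merged = new TreeMap<>();
        for (TreeMap<String, Version> segment : segments) {
            for (Map.Entry<String, Version> e : segment.entrySet()) {
                merged.merge(e.getKey(), e.getValue(),
                        (oldV, newV) -> newV.timestamp >= oldV.timestamp ? newV : oldV);
            }
        }
        return merged;
    }

    // Counts the cells held across all segments.
    static int cellCount(List<TreeMap<String, Version>> segments) {
        int count = 0;
        for (TreeMap<String, Version> segment : segments) {
            count += segment.size();
        }
        return count;
    }

    public static void main(String[] args) {
        TreeMap<String, Version> older = new TreeMap<>();
        older.put("row1/cf:a", new Version(1L, "v1"));
        older.put("row2/cf:a", new Version(1L, "x1"));

        TreeMap<String, Version> newer = new TreeMap<>();
        newer.put("row1/cf:a", new Version(2L, "v2")); // overwrites row1
        newer.put("row3/cf:a", new Version(2L, "y1"));

        List<TreeMap<String, Version>> segments = Arrays.asList(older, newer);
        TreeMap<String, Version> compacted = compact(segments);

        System.out.println("cells before compaction: " + cellCount(segments)); // 4
        System.out.println("cells after compaction: " + compacted.size());     // 3
    }
}
```

In this toy run, the superseded version of row1 is dropped in memory, so a later flush would write three cells instead of four. The real feature operates on MemStore segments inside the RegionServer and is considerably more involved than this sketch.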