Writing Data to HBase
The storm-hbase connector supports the following key features:
Apache HBase 0.96 and above
Incrementing counter columns
Tuple failure if an update to an HBase table fails
Ability to group puts in a single batch
Writing to Kerberized HBase clusters (for more information, see Configuring Connectors for a Secure Cluster)
The storm-hbase connector enables Storm developers to collect several PUTs in a single operation and write to multiple HBase column families and counter columns. A PUT is an HBase operation that inserts data into a single HBase cell. To batch PUTs automatically, use the HBase client's write buffer, which is configured with the hbase.client.write.buffer property. The primary interface in the storm-hbase connector is the org.apache.storm.hbase.bolt.mapper.HBaseMapper interface. However, the default implementation, SimpleHBaseMapper, writes to a single column family. Storm developers can implement the HBaseMapper interface themselves or extend SimpleHBaseMapper if they want to change or override this behavior.
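The write-buffer batching described above can be pictured with a small, standard-library-only sketch. This is a simplified model under stated assumptions, not HBase client code: the class WriteBufferSketch and its methods are hypothetical names, the threshold stands in for the byte size set by hbase.client.write.buffer, and the exact flush condition in the real client may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a size-bounded client write buffer; not an HBase class.
final class WriteBufferSketch {
    private final long maxBytes;                       // stands in for hbase.client.write.buffer
    private final List<byte[]> pending = new ArrayList<>();
    private long pendingBytes = 0;
    private int flushCount = 0;

    WriteBufferSketch(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Queue one cell value; flush once the buffered bytes reach the threshold.
    void put(byte[] cellValue) {
        pending.add(cellValue);
        pendingBytes += cellValue.length;
        if (pendingBytes >= maxBytes) {
            flush();
        }
    }

    // A real client would send the pending puts as one batched request here.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        pending.clear();
        pendingBytes = 0;
        flushCount++;
    }

    int flushCount() { return flushCount; }

    int pendingPuts() { return pending.size(); }

    public static void main(String[] args) {
        WriteBufferSketch buffer = new WriteBufferSketch(10);
        for (int i = 0; i < 5; i++) {
            buffer.put(new byte[4]);                   // five 4-byte puts
        }
        // The third put crosses the 10-byte threshold and triggers one flush.
        System.out.println("flushes=" + buffer.flushCount() + " pending=" + buffer.pendingPuts());
        // prints "flushes=1 pending=2"
    }
}
```

In this model a larger threshold means fewer, larger batches, at the cost of more unflushed data held on the client.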
Table 7.1. SimpleHBaseMapper Methods
| SimpleHBaseMapper Method | Description |
|---|---|
| withRowKeyField | Specifies the row key for the target HBase row. A row key uniquely identifies a row in HBase. |
| withColumnFields | Specifies the target HBase column. |
| withCounterFields | Specifies the target HBase counter. |
| withColumnFamily | Specifies the target HBase column family. |
Example
The following example specifies the tuple's 'word' field as the row key, adds an HBase column for the tuple 'word' field, adds an HBase counter column for the tuple 'count' field, and writes data to the 'cf' column family.
SimpleHBaseMapper mapper = new SimpleHBaseMapper()
    .withRowKeyField("word")
    .withColumnFields(new Fields("word"))
    .withCounterFields(new Fields("count"))
    .withColumnFamily("cf");
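To make the mapping concrete, the following standard-library-only sketch models what the configuration in the example would produce for a single tuple. It is an illustration under stated assumptions, not the storm-hbase implementation: MappedRow and MappingSketch are hypothetical names, a tuple is represented as a plain Map, and cells are keyed as "family:qualifier" strings rather than byte arrays.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the mapper's output for one tuple; not a storm-hbase class.
final class MappedRow {
    final String rowKey;
    final Map<String, Object> columns = new LinkedHashMap<>();  // "family:qualifier" -> value
    final Map<String, Long> counters = new LinkedHashMap<>();   // "family:qualifier" -> increment

    MappedRow(String rowKey) {
        this.rowKey = rowKey;
    }
}

// Hypothetical model of the field-to-cell mapping; not the real SimpleHBaseMapper.
final class MappingSketch {
    static MappedRow map(Map<String, Object> tuple, String rowKeyField,
                         List<String> columnFields, List<String> counterFields,
                         String columnFamily) {
        MappedRow row = new MappedRow(String.valueOf(tuple.get(rowKeyField)));
        // Each column field becomes a cell whose qualifier is the tuple field name.
        for (String field : columnFields) {
            row.columns.put(columnFamily + ":" + field, tuple.get(field));
        }
        // Each counter field becomes an increment of the counter with the same name.
        for (String field : counterFields) {
            row.counters.put(columnFamily + ":" + field, ((Number) tuple.get(field)).longValue());
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("word", "storm");
        tuple.put("count", 3L);
        MappedRow row = map(tuple, "word", Arrays.asList("word"), Arrays.asList("count"), "cf");
        System.out.println(row.rowKey + " " + row.columns + " " + row.counters);
        // prints "storm {cf:word=storm} {cf:count=3}"
    }
}
```

Because the qualifier is taken directly from the tuple field name in this model, it also shows why tuple field names and HBase column names have to match.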
The storm-hbase connector supports the following versions of HBase:
0.96
0.98
Limitations
The current version of the storm-hbase connector has the following limitations:
HBase table must be predefined
Cannot dynamically add new HBase columns; can write to only one column family at a time
Assumes that hbase-site.xml is in the $CLASSPATH environment variable
Tuple field names must match HBase column names
Does not support the Trident API
Supports writes but not lookups