Using the Lily HBase NRT Indexer Service
To index column families of tables in an HBase cluster:
- Enable replication on HBase column families
- Create collections and configurations
- Register a Lily HBase Indexer configuration with the Lily HBase Indexer Service
- Verify that indexing is working
Enabling Replication on HBase Column Families
Ensure that cluster-wide HBase replication is enabled. Use the HBase shell to define column-family replication settings.
For every existing table, set the REPLICATION_SCOPE on every column family that needs to be indexed by issuing a command of the form:
$ hbase shell
hbase shell> disable 'record'
hbase shell> alter 'record', {NAME => 'data', REPLICATION_SCOPE => 1}
hbase shell> enable 'record'
For every new table, set the REPLICATION_SCOPE on every column family that needs to be indexed by issuing a command of the form:
$ hbase shell
hbase shell> create 'record', {NAME => 'data', REPLICATION_SCOPE => 1}
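When several column families of an existing table need replication, the disable, alter, and enable commands can be generated rather than typed by hand. The following is a minimal sketch, not part of the product: the function name replication_commands and the second column family name meta are hypothetical, and it assumes the table already exists and that hbase shell reads commands from standard input.

```shell
# Print the HBase shell commands that enable replication on the given
# column families of an existing table.
# replication_commands is a hypothetical helper, not an HBase tool.
replication_commands() {
  table="$1"
  shift
  printf "disable '%s'\n" "$table"
  for family in "$@"; do
    printf "alter '%s', {NAME => '%s', REPLICATION_SCOPE => 1}\n" "$table" "$family"
  done
  printf "enable '%s'\n" "$table"
}

# Example (review the output first, then pipe it to the HBase shell):
#   replication_commands record data meta
#   replication_commands record data meta | hbase shell
```

Printing the commands before running them makes it easy to review exactly what will be applied to the table.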
Creating Collections and Configurations
The tasks required for the Lily HBase NRT Indexer Service are the same as those described for the Lily HBase Batch Indexer. Follow the steps described in the corresponding sections of the Lily HBase Batch Indexer documentation.
Registering a Lily HBase Indexer Configuration with the Lily HBase Indexer Service
When the content of the Lily HBase Indexer configuration XML file is satisfactory, register it with the Lily HBase Indexer Service. Registration associates the configuration with a given SolrCloud collection by uploading the Lily HBase Indexer configuration XML file to ZooKeeper. For example:
$ hbase-indexer add-indexer \
    --name myIndexer \
    --indexer-conf $HOME/morphline-hbase-mapper.xml \
    --connection-param solr.zk=solr-cloude-zk1,solr-cloude-zk2/solr \
    --connection-param solr.collection=hbase-collection1 \
    --zookeeper hbase-cluster-zookeeper:2181
Verify that the indexer was successfully created as follows:
$ hbase-indexer list-indexers
Number of indexes: 1
myIndexer
  + Lifecycle state: ACTIVE
  + Incremental indexing state: SUBSCRIBE_AND_CONSUME
  + Batch indexing state: INACTIVE
  + SEP subscription ID: Indexer_myIndexer
  + SEP subscription timestamp: 2013-06-12T11:23:35.635-07:00
  + Connection type: solr
  + Connection params:
    + solr.collection = hbase-collection1
    + solr.zk = localhost/solr
  + Indexer config:
      110 bytes, use -dump to see content
  + Batch index config:
      (none)
  + Default batch index config:
      (none)
  + Processes
    + 1 running processes
    + 0 failed processes
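The same check can be scripted by scanning the list-indexers output for the states shown above. This is only a sketch: indexer_is_healthy is a hypothetical helper, it assumes the output format matches the example above, and it does not distinguish between indexers when more than one is registered.

```shell
# Succeed only if the report read from standard input shows an active
# indexer that is consuming events and has no failed processes.
# indexer_is_healthy is a hypothetical helper, not part of hbase-indexer.
indexer_is_healthy() {
  report="$(cat)"
  printf '%s\n' "$report" | grep -q 'Lifecycle state: ACTIVE' || return 1
  printf '%s\n' "$report" | grep -q 'Incremental indexing state: SUBSCRIBE_AND_CONSUME' || return 1
  # Require exactly "0 failed processes" (not, for example, "10 failed processes").
  printf '%s\n' "$report" | grep -Eq '(^|[^0-9])0 failed processes' || return 1
}

# Example:
#   hbase-indexer list-indexers | indexer_is_healthy && echo "indexer OK"
```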
Use the update-indexer and delete-indexer commands of the hbase-indexer utility to modify or remove existing Lily HBase Indexers.
For more help, use the following commands:
$ hbase-indexer add-indexer --help
$ hbase-indexer list-indexers --help
$ hbase-indexer update-indexer --help
$ hbase-indexer delete-indexer --help
Verifying that Indexing Works
Add rows to the indexed HBase table. For example:
$ hbase shell
hbase(main):001:0> put 'record', 'row1', 'data', 'value'
hbase(main):002:0> put 'record', 'row2', 'data', 'value2'
If the put operation succeeds, wait a few seconds, navigate to the SolrCloud UI query page, and query the data. Note the updated rows in Solr.
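As an alternative to the UI, the collection can be queried over HTTP through Solr's select handler. The sketch below only builds the query URL. The helper name solr_query_url and the host name solr-host are hypothetical, port 8983 is assumed to be the Solr port in use, and the query string is inserted without URL encoding, so it suits simple queries such as *:* only.

```shell
# Print a Solr select URL for the given host, collection, and query.
# solr_query_url is a hypothetical helper; the port (8983) is an assumption
# and the query is not URL-encoded.
solr_query_url() {
  host="$1"
  collection="$2"
  query="${3:-*:*}"
  printf 'http://%s:8983/solr/%s/select?q=%s&wt=json&rows=10\n' \
    "$host" "$collection" "$query"
}

# Example (solr-host is a placeholder for one of your Solr servers):
#   curl -s "$(solr_query_url solr-host hbase-collection1)"
```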
To print diagnostic information, such as the content of records as they pass through the morphline commands, enable the TRACE log level. For example, you might add two lines to your log4j.properties file:
log4j.logger.org.kitesdk.morphline=TRACE
log4j.logger.com.ngdata=TRACE
In Cloudera Manager 5, add these lines to the logging configuration of the Lily HBase Indexer Service, and then restart the Lily HBase Indexer Service. Examine the log files in /var/log/hbase-solr/lily-hbase-indexer-* for details.
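With TRACE enabled the logs grow quickly, so it can help to filter them down to the trace entries. A minimal sketch, assuming the configured log layout prints the level name TRACE on each entry; trace_lines is a hypothetical helper:

```shell
# Print only the TRACE-level lines from the given log files,
# or from standard input when no files are named.
# trace_lines is a hypothetical helper; it assumes the log layout
# includes the literal level name "TRACE" on each entry.
trace_lines() {
  grep -h 'TRACE' "$@"
}

# Example:
#   trace_lines /var/log/hbase-solr/lily-hbase-indexer-*
```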