Lily HBase Batch Indexing for Cloudera Search

With Cloudera Search, you can batch index HBase tables using the Lily HBase batch indexer MapReduce job, also known as HBaseMapReduceIndexerTool. Batch indexing with this tool does not require any of the following:

  • HBase replication
  • The Lily HBase Indexer Service
  • Registering a Lily HBase Indexer configuration with the Lily HBase Indexer Service

The indexer supports flexible, custom, application-specific rules to extract, transform, and load HBase data into Solr. Solr search results can contain columnFamily:qualifier links back to the data stored in HBase. This way, applications can use the search result set to directly access matching raw HBase cells.
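As a rough illustration of how an application might follow such a link, the sketch below splits a columnFamily:qualifier string taken from a search result into the parts needed to address a single HBase cell. The names CellRef and parse_cell_link, and the idea that the row key arrives alongside the link, are assumptions made for this example; they are not part of the Lily HBase indexer or of Cloudera Search.

```python
from typing import NamedTuple


class CellRef(NamedTuple):
    """Coordinates of one HBase cell (hypothetical helper type)."""
    row_key: str
    family: str
    qualifier: str


def parse_cell_link(row_key: str, link: str) -> CellRef:
    """Split a 'columnFamily:qualifier' link into family and qualifier.

    HBase column family names cannot contain ':', so the first colon
    separates the family from the qualifier. The qualifier itself may
    contain further colons or be empty.
    """
    family, sep, qualifier = link.partition(":")
    if not sep or not family:
        raise ValueError(f"expected 'columnFamily:qualifier', got {link!r}")
    return CellRef(row_key, family, qualifier)


# Example: a search hit that carries a row key and a cell link
# (both values are made up for illustration).
ref = parse_cell_link("row1", "cf1:q:x")
print(ref.family, ref.qualifier)
```

The resulting row key, family, and qualifier can then be passed to whichever HBase client the application already uses to fetch the raw cell.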

The following procedures demonstrate how to create a small HBase table and use the HBaseMapReduceIndexerTool to index the table into a Solr collection: