Lily HBase batch indexing for Cloudera Search
You can batch index HBase tables using the Lily HBase batch indexer MapReduce job (HBaseMapReduceIndexerTool). This batch indexing does not require HBase replication or the Lily HBase Indexer Service. Consequently, you do not need to register a Lily HBase Indexer configuration with the Lily HBase Indexer Service.
The indexer supports flexible, custom, application-specific rules to extract, transform, and load HBase data into Solr. Solr search results can contain columnFamily:qualifier links back to the data stored in HBase. This way, applications can use the search result set to directly access matching raw HBase cells.
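As a rough illustration of how an application might follow such a link, the sketch below turns one search result document into HBase cell coordinates. The field names `id` and `source_column`, and the helper `cell_ref_from_result`, are hypothetical. The actual field names depend on how the indexer configuration maps HBase data to Solr fields, and the sketch makes no HBase or Solr client calls.

```python
from typing import NamedTuple


class CellRef(NamedTuple):
    """Coordinates of one HBase cell: row key, column family, qualifier."""
    row_key: str
    family: str
    qualifier: str


def cell_ref_from_result(doc: dict,
                         row_field: str = "id",
                         column_field: str = "source_column") -> CellRef:
    """Build cell coordinates from a search result document.

    Assumption: the document stores the row key in `row_field` and a
    "columnFamily:qualifier" link in `column_field`. Both field names
    are placeholders chosen for this example.
    """
    # Split at the first colon only: a column family name cannot contain
    # ':', but a qualifier can.
    family, sep, qualifier = doc[column_field].partition(":")
    if not sep or not family:
        raise ValueError(f"not a columnFamily:qualifier link: {doc[column_field]!r}")
    return CellRef(doc[row_field], family, qualifier)


# Example with a hand-written result document (not real query output).
ref = cell_ref_from_result({"id": "row1", "source_column": "cf1:data"})
print(ref.row_key, ref.family, ref.qualifier)
```

The resulting row key, family, and qualifier could then be passed to whichever HBase client the application already uses to fetch the raw cell.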
The following procedures demonstrate creating a small HBase table and using the HBaseMapReduceIndexerTool to index the table into a collection: