Configuring the Flume Solr Sink
The examples in this tutorial assume an environment set up with a package-based installation. If you installed Cloudera Search using parcels, adjust file paths accordingly.
- Edit /etc/flume-ng/conf/flume.conf to specify the Flume source details and set up the flow. You must set the relative or absolute path to the morphline
configuration file:
agent.sinks.solrSink.morphlineFile = /etc/flume-ng/conf/morphline.conf
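A wrong morphlineFile path is easy to miss until the agent starts, so a quick pre-flight check can help. The following is a minimal sketch, not part of the product: the function names morphline_path and check_morphline are hypothetical, and the check resolves a relative path against the current shell directory, which may differ from the directory the Flume agent runs in.

```shell
#!/bin/sh
# Hypothetical helper: print the first uncommented morphlineFile value found in
# a Flume properties file, or nothing if the key is absent.
morphline_path() {
    conf_file=$1
    # Matches "<agent>.sinks.<sink>.morphlineFile = <value>" and keeps <value>.
    sed -n 's/^[[:space:]]*[^#[:space:]][^=]*\.morphlineFile[[:space:]]*=[[:space:]]*//p' \
        "$conf_file" | head -n 1
}

# Hypothetical helper: succeed only if the configured morphline file is readable.
# Assumption: a relative path is resolved against the current directory.
check_morphline() {
    path=$(morphline_path "$1")
    if [ -z "$path" ]; then
        echo "morphlineFile is not set in $1" >&2
        return 1
    fi
    if [ ! -r "$path" ]; then
        echo "morphline file not readable: $path" >&2
        return 1
    fi
    echo "morphline file OK: $path"
}

# Example call (run after editing flume.conf):
#   check_morphline /etc/flume-ng/conf/flume.conf
```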
- Edit /etc/flume-ng/conf/morphline.conf to specify the Solr location details using a SOLR_LOCATOR. The snippet that
includes the SOLR_LOCATOR might appear as follows:
SOLR_LOCATOR : {
  # Name of solr collection
  collection : collection
  # ZooKeeper ensemble
  zkHost : "$ZK_HOST"
}
morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
    commands : [
      { generateUUID { field : id } }
      { # Remove record fields that are unknown to Solr schema.xml.
        # Recall that Solr throws an exception on any attempt to load a document that
        # contains a field that isn't specified in schema.xml.
        sanitizeUnknownSolrFields {
          solrLocator : ${SOLR_LOCATOR} # Location from which to fetch Solr schema
        }
      }
      { logDebug { format : "output record: {}", args : ["@{}"] } }
      {
        loadSolr {
          solrLocator : ${SOLR_LOCATOR}
        }
      }
    ]
  }
]
- Copy flume-env.sh.template to flume-env.sh:
$ sudo cp /etc/flume-ng/conf/flume-env.sh.template \
  /etc/flume-ng/conf/flume-env.sh
- Edit /etc/flume-ng/conf/flume-env.sh, inserting or replacing JAVA_OPTS as follows:
JAVA_OPTS="-Xmx500m"
- (Optional) Modify Flume logging settings to facilitate monitoring and debugging:
$ sudo bash -c 'echo "log4j.logger.org.apache.flume.sink.solr=DEBUG" >> \
  /etc/flume-ng/conf/log4j.properties'
$ sudo bash -c 'echo "log4j.logger.org.kitesdk.morphline=TRACE" >> \
  /etc/flume-ng/conf/log4j.properties'
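Because the commands above append with >>, running them twice leaves duplicate logger entries in log4j.properties. The following is a minimal sketch of an idempotent alternative; append_once is a hypothetical helper name, and the sketch assumes it runs with write access to the target file (for example, as root).

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line is not
# already present, so the step can be repeated safely.
append_once() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example calls (assumes write access to the file):
#   append_once "log4j.logger.org.apache.flume.sink.solr=DEBUG" /etc/flume-ng/conf/log4j.properties
#   append_once "log4j.logger.org.kitesdk.morphline=TRACE" /etc/flume-ng/conf/log4j.properties
```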
- (Optional) Set SEARCH_HOME to tell Flume where to find the Cloudera Search dependencies for the Flume Solr Sink. This is useful, for example, if you installed Flume from a tarball package. To set SEARCH_HOME, use a command of the form:
$ export SEARCH_HOME=/usr/lib/search
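An export applies only to the current shell and the processes started from it, so it can be worth confirming the value in the same shell that launches the agent. The following is a minimal sketch; check_search_home is a hypothetical helper name, and the check only verifies that the directory exists, not that it contains the required files.

```shell
#!/bin/sh
# Hypothetical check: succeed only if SEARCH_HOME is set and names an existing
# directory. It does not inspect the directory contents.
check_search_home() {
    if [ -z "${SEARCH_HOME:-}" ]; then
        echo "SEARCH_HOME is not set" >&2
        return 1
    fi
    if [ ! -d "$SEARCH_HOME" ]; then
        echo "SEARCH_HOME is not a directory: $SEARCH_HOME" >&2
        return 1
    fi
    echo "SEARCH_HOME=$SEARCH_HOME"
}

# Example call (run in the shell that will start the Flume agent):
#   check_search_home
```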