This is the documentation for CDH 5.0.x. Documentation for other versions is available at Cloudera Documentation.

Configuring Flume Solr Sink

  1. Edit /etc/flume-ng/conf/flume.conf to specify the Flume Source details and set up the flow. You must set the relative or absolute path to the morphline configuration file:
    agent.sinks.solrSink.morphlineFile = /etc/flume-ng/conf/morphline.conf
  2. Edit /etc/flume-ng/conf/morphline.conf to specify the Solr location details.
    1. Specify the collection configuration parameter to identify the name of the Solr collection to use:
      collection : collection3
    2. Point the zkHost configuration parameter to the address of the SolrCloud ZooKeeper ensemble of the Solr collection. The format is the same as for MapReduceIndexerTool --zk-host. Substitute the corresponding host name for 127.0.0.1, if necessary:
      zkHost : "127.0.0.1:2181/solr"
  3. Copy flume-env.sh.template to flume-env.sh:
    $ sudo cp /etc/flume-ng/conf/flume-env.sh.template \
    /etc/flume-ng/conf/flume-env.sh
  4. Edit /etc/flume-ng/conf/flume-env.sh, inserting or replacing JAVA_OPTS as follows:
    JAVA_OPTS="-Xmx500m"
  5. (Optional) Modify Flume's logging settings to facilitate monitoring and debugging:
    $ sudo bash -c 'echo "log4j.logger.org.apache.flume.sink.solr=DEBUG" >> \
    /etc/flume-ng/conf/log4j.properties'
    $ sudo bash -c 'echo "log4j.logger.org.kitesdk.morphline=TRACE" >> \
    /etc/flume-ng/conf/log4j.properties'
  6. (Optional) Use SEARCH_HOME to configure the location where Flume finds the Cloudera Search dependencies for the Flume Solr Sink. For example, if you installed Flume from a tarball package, set SEARCH_HOME so that Flume can find the required files. To set SEARCH_HOME, use a command of the form:
    $ export SEARCH_HOME=/usr/lib/search
      Note: Alternatively, you can add the same setting to flume-env.sh.
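
The settings from steps 1 and 2 can be checked before the agent is started. The following is a minimal sketch and not part of the Cloudera tooling. get_property and check_solr_sink_config are hypothetical helper names, and the default sink prefix agent.sinks.solrSink is taken from the example in step 1. The check assumes that the morphline file writes collection and zkHost in the "name : value" form shown in step 2. A relative morphlineFile path is resolved against the shell's current directory, which may differ from the directory Flume runs in.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with Flume or Cloudera Search).

# Print the value of "key = value" from a properties-style file such as
# flume.conf. If the key appears more than once, the last value is printed.
get_property() {
  local file="$1" key="$2"
  awk -v key="$key" '
    index($0, "=") > 0 {
      k = substr($0, 1, index($0, "=") - 1)
      v = substr($0, index($0, "=") + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim whitespace around the key
      gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim whitespace around the value
      if (k == key) val = v
    }
    END { if (val != "") print val }
  ' "$file"
}

# Check that flume.conf names a readable morphline file and that the file
# sets both collection and zkHost. The second argument is the sink prefix;
# it defaults to the name used in step 1 (an assumption about your agent).
check_solr_sink_config() {
  local flume_conf="$1" sink_prefix="${2:-agent.sinks.solrSink}"
  local morphline_file param
  morphline_file="$(get_property "$flume_conf" "${sink_prefix}.morphlineFile")"
  if [ -z "$morphline_file" ]; then
    echo "morphlineFile is not set for ${sink_prefix} in ${flume_conf}" >&2
    return 1
  fi
  if [ ! -r "$morphline_file" ]; then
    echo "cannot read morphline file: ${morphline_file}" >&2
    return 1
  fi
  for param in collection zkHost; do
    # Only the "name : value" form from step 2 is recognized here.
    if ! grep -Eq "^[[:space:]]*${param}[[:space:]]*:" "$morphline_file"; then
      echo "${param} is not set in ${morphline_file}" >&2
      return 1
    fi
  done
  echo "OK: ${morphline_file}"
}
```

After the functions are loaded into a shell, a call such as check_solr_sink_config /etc/flume-ng/conf/flume.conf prints OK followed by the morphline path, or reports the first missing setting on standard error and returns a non-zero status.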
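
Steps 4 and 6 both come down to inserting or replacing a variable assignment in flume-env.sh. The following sketch shows one way to make that edit repeatable. set_env_var is a hypothetical helper name, and the function assumes that the value contains no double quotes and that the caller has write access to the file (under /etc this usually means running as root). It writes the assignment with export, which is harmless for JAVA_OPTS and matches the form used for SEARCH_HOME in step 6.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set NAME="VALUE" in an env file such as flume-env.sh,
# replacing any existing uncommented assignment of NAME instead of adding a
# duplicate. Commented-out lines are left untouched.
set_env_var() {
  local file="$1" name="$2" value="$3" tmp
  if [ ! -f "$file" ]; then
    echo "no such file: ${file}" >&2
    return 1
  fi
  tmp="$(mktemp)" || return 1
  # Keep every line except existing assignments of NAME (with or without
  # a leading "export"). grep exits with status 1 when nothing is kept,
  # which is not an error here.
  grep -Ev "^[[:space:]]*(export[[:space:]]+)?${name}=" "$file" > "$tmp" || true
  # Append the new assignment. Assumes VALUE contains no double quotes.
  printf 'export %s="%s"\n' "$name" "$value" >> "$tmp"
  # Overwrite in place so the original file's owner and mode are kept.
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

With the function loaded, set_env_var /etc/flume-ng/conf/flume-env.sh JAVA_OPTS "-Xmx500m" covers step 4, and set_env_var /etc/flume-ng/conf/flume-env.sh SEARCH_HOME /usr/lib/search covers the alternative mentioned in the note to step 6. Running either call twice leaves a single assignment in the file.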
Page generated September 3, 2015.