This is the documentation for CDH 5.1.x. Documentation for other versions is available at Cloudera Documentation.

Starting Flume Agent

  1. Delete all existing documents in Solr:
    $ solrctl collection --deletedocs collection3
  2. Check the status of the Flume Agent to determine whether it is running:
    $ sudo /etc/init.d/flume-ng-agent status
  3. Use the start or restart functions. For example, to restart a running Flume Agent:
    $ sudo /etc/init.d/flume-ng-agent restart
  4. Monitor progress in the Flume log file and watch for any errors:
    $ tail -f /var/log/flume-ng/flume.log
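
Steps 2 and 3 can be combined into a small script. The following is a minimal sketch, not part of the Cloudera tooling: the function names choose_action and bounce_agent are hypothetical, and it assumes the init script follows the LSB convention of returning exit code 0 from status when the agent is running and a nonzero code otherwise.

```shell
#!/bin/sh
# Sketch: pick "start" or "restart" from the init script's status exit code.
# Assumption: status exits 0 when the agent is running (LSB convention).

FLUME_INIT=/etc/init.d/flume-ng-agent   # path taken from the steps above

# choose_action STATUS_CODE
# Prints "restart" if the code says the agent is running, otherwise "start".
choose_action() {
    if [ "$1" -eq 0 ]; then
        echo restart
    else
        echo start
    fi
}

# bounce_agent (hypothetical helper)
# Checks the agent status, then starts or restarts it accordingly.
bounce_agent() {
    sudo "$FLUME_INIT" status >/dev/null 2>&1
    rc=$?
    action=$(choose_action "$rc")
    sudo "$FLUME_INIT" "$action"
}
```

After sourcing the file, running bounce_agent and then the tail command from step 4 covers steps 2 through 4 in one pass.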

After restarting the Flume agent, use the Cloudera Search GUI to verify that new tweets have been ingested into Solr. For example, on localhost, use http://localhost:8983/solr/collection3/select?q=*%3A*&sort=created_at+desc&wt=json&indent=true as the query URL. The query sorts the result set by the created_at timestamp in descending order, so the most recently ingested tweets are at the top. If you rerun the query, new tweets show up at the top of the result set.
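
The same query can also be issued from the command line. The following is a minimal sketch that assumes curl is installed and that Solr listens on port 8983 as in the example URL; the function names solr_select_url and newest_tweets are hypothetical.

```shell
#!/bin/sh
# Sketch: build the select URL from the example above and fetch it with curl.

# solr_select_url HOST COLLECTION
# Prints the query URL; "%%" in the printf format yields a literal "%".
solr_select_url() {
    printf 'http://%s:8983/solr/%s/select?q=*%%3A*&sort=created_at+desc&wt=json&indent=true\n' "$1" "$2"
}

# newest_tweets HOST COLLECTION (hypothetical helper)
# Fetches the JSON result set, newest tweets first.
newest_tweets() {
    curl -s "$(solr_select_url "$1" "$2")"
}
```

Calling newest_tweets localhost collection3 a few times in a row should show newer created_at values at the top of the result set while the agent is ingesting.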

To print diagnostic information, such as the content of records as they pass through the morphline commands, consider enabling TRACE log level. For example, you can enable TRACE log level diagnostics by adding the following to your log4j.properties file:

log4j.logger.org.kitesdk.morphline=TRACE
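
If you edit log4j.properties by hand, it can help to add the line only when it is not already present, so that repeated runs do not create duplicates. The following is a minimal sketch; the function name enable_trace is hypothetical, and the location of log4j.properties depends on your installation, so the file path is passed as an argument rather than assumed.

```shell
#!/bin/sh
# Sketch: append the morphline TRACE setting to a log4j.properties file once.

TRACE_LINE='log4j.logger.org.kitesdk.morphline=TRACE'

# enable_trace FILE (hypothetical helper)
# Appends TRACE_LINE unless an identical whole line already exists.
# grep -x matches whole lines only, -F treats the pattern as a fixed string.
enable_trace() {
    grep -qxF "$TRACE_LINE" "$1" 2>/dev/null || printf '%s\n' "$TRACE_LINE" >> "$1"
}
```

The sketch assumes the file ends with a newline. Writing to a system configuration file typically requires root privileges, and the agent has to be restarted before the new log level takes effect.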

In Cloudera Manager, you can use the safety valve to enable TRACE log level.

Navigate to Menu Services > Flume > Configuration > View and Edit > Agent > Advanced > Agent Logging Safety Valve, add the line log4j.logger.org.kitesdk.morphline=TRACE, and then restart the service.

Page generated September 3, 2015.