Starting Flume Agent
- Delete all existing documents in Solr:
$ solrctl --zk collection --deletedocs collection3
- Check the status of the Flume Agent to determine whether it is running:
$ sudo /etc/init.d/flume-ng-agent status
- Use the start or restart action of the init script. For example, to restart a running Flume Agent:
$ sudo /etc/init.d/flume-ng-agent restart
- Monitor progress in the Flume log file and watch for any errors:
$ tail -f /var/log/flume-ng/flume.log
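The status check and the start or restart step above can be combined into one small helper. The following is a minimal sketch, not part of Flume or Cloudera Search: the function name restart_or_start_flume and the SUDO variable are hypothetical, and it assumes that the init script's status action exits with 0 only when the agent is running (the LSB convention), which you should confirm on your system.

```shell
# Hypothetical helper: restart the agent if it is running, start it otherwise.
# Assumption: "status" exits 0 only when the agent is running (LSB convention).
restart_or_start_flume() {
  # Path to the init script; defaults to the one used in this section.
  init_script="${1:-/etc/init.d/flume-ng-agent}"
  # SUDO defaults to "sudo"; set SUDO="" when already running as root.
  sudo_cmd="${SUDO-sudo}"
  if $sudo_cmd "$init_script" status >/dev/null 2>&1; then
    $sudo_cmd "$init_script" restart
  else
    $sudo_cmd "$init_script" start
  fi
}
```

After sourcing the function, run restart_or_start_flume with no arguments to act on /etc/init.d/flume-ng-agent, then follow the log file as described above.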
After restarting the Flume agent, use the Cloudera Search GUI to verify that new tweets have been ingested into Solr. For example, on localhost, use http://localhost:8983/solr/collection3/select?q=*%3A*&sort=created_at+desc&wt=json&indent=true. The query sorts the result set by the created_at timestamp in descending order, so the most recently ingested tweets are at the top. If you rerun the query, new tweets show up at the top of the result set.
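The same verification query can also be issued from a terminal. The sketch below is only an illustration: the function names solr_select_url and show_latest_tweets are hypothetical, the host and collection defaults are taken from the example URL above, and it assumes curl is installed and Solr listens on port 8983.

```shell
# Hypothetical helper: print the select URL from this section for a host and collection.
solr_select_url() {
  host="${1:-localhost}"
  collection="${2:-collection3}"
  # %% in the printf format prints a literal %, so the query stays q=*%3A*.
  printf 'http://%s:8983/solr/%s/select?q=*%%3A*&sort=created_at+desc&wt=json&indent=true\n' \
    "$host" "$collection"
}

# Hypothetical helper: fetch the JSON result set, newest tweets first (assumes curl).
show_latest_tweets() {
  curl -sS "$(solr_select_url "$@")"
}
```

Running show_latest_tweets twice, a few seconds apart, should show different documents at the top of the result set while the agent is ingesting.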
To print diagnostic information, such as the content of records as they pass through the morphline commands, enable the TRACE log level. For example, add the following line to your log4j.properties file:
log4j.logger.org.kitesdk.morphline=TRACE
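If you edit log4j.properties by hand or from a script, it helps to add the line only once. The following is a minimal sketch under stated assumptions: the function name enable_morphline_trace is hypothetical, the file path is supplied by you, and the file is assumed to end with a newline and to be writable by the current user.

```shell
# Hypothetical helper: append the TRACE logger line only if it is not already present.
enable_morphline_trace() {
  props_file="$1"
  line='log4j.logger.org.kitesdk.morphline=TRACE'
  # -x matches whole lines only, -F treats the pattern as a fixed string.
  grep -qxF "$line" "$props_file" 2>/dev/null || printf '%s\n' "$line" >> "$props_file"
}
```

For example, enable_morphline_trace /etc/flume-ng/conf/log4j.properties, where that path is an assumption about a package-based installation. Restart the agent afterwards so that the new log level takes effect.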
In Cloudera Manager, you can use the safety valve to enable the TRACE log level. Set the logger that applies to your installation: log4j.logger.org.kitesdk.morphline=TRACE for Kite, or log4j.logger.com.cloudera.cdk.morphline=TRACE for CDK.
Navigate to
. After setting this value, restart the service.