This is the documentation for CDH 5.1.x. Documentation for other versions is available at Cloudera Documentation.

Configuring Flume Solr Sink to Sip from the Twitter Firehose

Edit /etc/flume-ng/conf/flume.conf and replace the values of the following properties with credentials from a valid Twitter account. The Flume TwitterSource uses the Twitter 1.1 API, which requires authentication of both the consumer (application) and the user (you).

agent.sources.twitterSrc.consumerKey = YOUR_TWITTER_CONSUMER_KEY
agent.sources.twitterSrc.consumerSecret = YOUR_TWITTER_CONSUMER_SECRET
agent.sources.twitterSrc.accessToken = YOUR_TWITTER_ACCESS_TOKEN
agent.sources.twitterSrc.accessTokenSecret = YOUR_TWITTER_ACCESS_TOKEN_SECRET

Generate these four codes using the Twitter developer site by completing the following steps:

  1. Sign in to https://dev.twitter.com with a Twitter account.
  2. Select My applications from the drop-down menu in the top-right corner, and then click Create a new application.
  3. Fill in the form to represent the Search installation. This can represent multiple clusters, and does not require the callback URL. Because this will not be a publicly distributed application, the name, description, and website (required fields) do not matter much except to the owner.
  4. Click Create my access token at the bottom of the page. You may have to refresh to see the access token.

Substitute the consumer key, consumer secret, access token, and access token secret into flume.conf. Consider this information confidential, just like your regular Twitter credentials.
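
As a rough illustration of the substitution step, the following shell sketch sets the four properties by name. It is only a sketch under stated assumptions: it works on a temporary copy rather than on /etc/flume-ng/conf/flume.conf, the set_prop helper is hypothetical, and the EXAMPLE_ values are made-up placeholders, not real credentials.

```shell
#!/bin/sh
# Sketch: fill in the four Twitter credential properties in a copy of flume.conf.
set -eu

# Temporary stand-in for /etc/flume-ng/conf/flume.conf (assumption for this example).
conf=$(mktemp)
cat > "$conf" <<'EOF'
agent.sources.twitterSrc.consumerKey = YOUR_TWITTER_CONSUMER_KEY
agent.sources.twitterSrc.consumerSecret = YOUR_TWITTER_CONSUMER_SECRET
agent.sources.twitterSrc.accessToken = YOUR_TWITTER_ACCESS_TOKEN
agent.sources.twitterSrc.accessTokenSecret = YOUR_TWITTER_ACCESS_TOKEN_SECRET
EOF

# set_prop FILE KEY VALUE (hypothetical helper)
# Rewrites the line whose property name is exactly KEY; other lines pass through.
# Note: awk -v interprets backslash escapes, so VALUE is assumed to contain none.
set_prop() {
  tmp=$(mktemp)
  awk -v k="$2" -v v="$3" -F ' = ' \
    '$1 == k { print k " = " v; next } { print }' "$1" > "$tmp"
  mv "$tmp" "$1"
}

# Placeholder values; substitute the codes generated on the Twitter developer site.
set_prop "$conf" agent.sources.twitterSrc.consumerKey       EXAMPLE_CONSUMER_KEY
set_prop "$conf" agent.sources.twitterSrc.consumerSecret    EXAMPLE_CONSUMER_SECRET
set_prop "$conf" agent.sources.twitterSrc.accessToken       EXAMPLE_ACCESS_TOKEN
set_prop "$conf" agent.sources.twitterSrc.accessTokenSecret EXAMPLE_ACCESS_TOKEN_SECRET

cat "$conf"
```

Matching on the property name rather than on the placeholder text avoids a subtle problem: YOUR_TWITTER_ACCESS_TOKEN is a prefix of YOUR_TWITTER_ACCESS_TOKEN_SECRET, so a plain search-and-replace on placeholders can corrupt the second value if applied in the wrong order.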

Authentication requires that the system clock be set correctly on every node where Flume connects to Twitter. To keep the clock accurate, either install NTP and keep the host synchronized by running the ntpd service, or synchronize manually using the command sudo ntpdate pool.ntp.org. Confirm that the time is set correctly by checking that the output of the command date --utc matches the time shown at http://www.time.gov/timezone.cgi?UTC/s/0/java. You can also set the time manually using the date command.
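
The clock check can be sketched in shell as follows. This is a minimal example, assuming GNU date and, for the optional offset query, that ntpdate is installed and the host can reach pool.ntp.org. The utc_now and check_offset function names are hypothetical. The -q option makes ntpdate query the server and report the offset without setting the clock.

```shell
#!/bin/sh
# Sketch: inspect the local clock before starting the Flume agent.

# Print the current time in UTC, for comparison against a trusted reference.
utc_now() {
  date --utc '+%Y-%m-%d %H:%M:%S'
}

# Query an NTP server and report the offset without changing the clock.
# Requires ntpdate and network access (assumptions; not run by default here).
check_offset() {
  ntpdate -q "${1:-pool.ntp.org}"
}

echo "Local clock (UTC): $(utc_now)"

# Uncomment on a host with ntpdate installed and network access:
# check_offset
```

If the reported offset is large, synchronize the clock with one of the options described above before connecting to Twitter.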

Page generated September 3, 2015.