Command Line Upgrade

Configure, Start, and Validate Apache Flume

  1. If you have not already done so, upgrade Apache Flume. On the Flume host machine, run the command for your operating system:

    • For RHEL/CentOS/Oracle Linux:

      yum upgrade flume

    • For SLES:

      zypper update flume

      zypper remove flume

      zypper se -s flume

      You should see Flume in the output.

      Install Flume:

      zypper install flume

    • For Ubuntu/Debian:

      apt-get install flume
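    The per-platform choice above can be sketched as a small shell helper. This is only an illustration: the function name flume_upgrade_cmd is hypothetical, it prints a command rather than running it, and for SLES it returns only the first of the zypper commands listed above.

```shell
# Print (do not run) the first Flume upgrade command for a package manager.
# flume_upgrade_cmd is a hypothetical helper, not part of Flume or HDP.
flume_upgrade_cmd() {
  case "$1" in
    yum)     echo "yum upgrade flume" ;;      # RHEL/CentOS/Oracle Linux
    zypper)  echo "zypper update flume" ;;    # SLES (first step only)
    apt-get) echo "apt-get install flume" ;;  # Ubuntu/Debian
    *)       echo "unsupported package manager: $1" >&2; return 1 ;;
  esac
}

# Suggest a command for the first package manager found on this host.
for pm in yum zypper apt-get; do
  if command -v "$pm" >/dev/null 2>&1; then
    echo "Suggested command: $(flume_upgrade_cmd "$pm")"
    break
  fi
done
```

    Printing the command instead of executing it keeps the sketch safe to run on any host; the actual upgrade still needs root privileges.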

  2. To confirm that Flume is working correctly, create an example configuration file, such as example.conf. The following sample configuration, written in properties-file format, defines a single agent named agent. For more detailed information, see the “Flume User Guide.”

    agent.sources = pstream
    agent.channels = memoryChannel
    agent.channels.memoryChannel.type = memory

    agent.sources.pstream.channels = memoryChannel
    agent.sources.pstream.type = exec
    agent.sources.pstream.command = tail -f /etc/passwd

    agent.sinks = hdfsSink
    agent.sinks.hdfsSink.type = hdfs
    agent.sinks.hdfsSink.channel = memoryChannel
    agent.sinks.hdfsSink.hdfs.path = hdfs://tmp/flumetest
    agent.sinks.hdfsSink.hdfs.fileType = SequenceFile
    agent.sinks.hdfsSink.hdfs.writeFormat = Text

    The source here is defined as an exec source. The agent runs a given command on startup, which streams data to stdout, where the source gets it. The channel is defined as an in-memory channel and the sink is an HDFS sink.
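    Every property in the sample starts with the agent name, and the value passed to --name when starting Flume has to match that prefix. The following sketch, which is an illustration and not a Flume tool, writes the sample to a temporary file, extracts the agent name, and checks that the sink reads from a declared channel. The variable names and the temporary file are assumptions made for the example.

```shell
# Write the sample configuration to a temporary file (illustrative path).
conf_file="$(mktemp)"
cat > "$conf_file" <<'EOF'
agent.sources = pstream
agent.channels = memoryChannel
agent.channels.memoryChannel.type = memory
agent.sources.pstream.channels = memoryChannel
agent.sources.pstream.type = exec
agent.sources.pstream.command = tail -f /etc/passwd
agent.sinks = hdfsSink
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = memoryChannel
agent.sinks.hdfsSink.hdfs.path = hdfs://tmp/flumetest
EOF

# The agent name is the prefix of the line that declares the sources.
agent_name="$(sed -n 's/^\([A-Za-z0-9_]*\)\.sources[[:space:]]*=.*/\1/p' "$conf_file" | head -n 1)"

# Channel list declared for the agent, and the channel the sink reads from.
declared_channels="$(sed -n "s/^${agent_name}\.channels[[:space:]]*=[[:space:]]*//p" "$conf_file")"
sink_channel="$(sed -n "s/^${agent_name}\.sinks\.hdfsSink\.channel[[:space:]]*=[[:space:]]*//p" "$conf_file")"

echo "agent name: $agent_name"
if [ "$declared_channels" = "$sink_channel" ]; then
  echo "sink channel '$sink_channel' is declared"
else
  echo "sink channel '$sink_channel' is not in: $declared_channels" >&2
fi

rm -f "$conf_file"
```

    The comparison assumes a single declared channel, as in the sample; a configuration with several channels would need a membership test instead of string equality.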

  3. Given this configuration, you can start Flume by navigating to FLUME_HOME and executing the following command:

    $ bin/flume-ng agent --conf ./conf --conf-file example.conf --name agent -Dflume.root.logger=INFO,console

    Note

    The directory specified by the --conf argument should contain the flume-env.sh shell script and, optionally, a log4j properties file. This example passes a Java option that forces Flume to log to the console, and it does not use a custom environment script.

  4. After validating data in hdfs://tmp/flumetest, stop Flume and restore any backup files. Copy /etc/flume/conf to the conf directory on the Flume hosts.
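
    One way to carry out the validation is to list the sink directory with the HDFS client. The sketch below assumes the hdfs command is on the PATH of the host where it runs; the variable names are illustrative. By default the HDFS sink writes files whose names start with FlumeData, so entries with that prefix suggest that events reached HDFS.

```shell
# Path taken from the hdfs.path property in the sample configuration.
flume_out="hdfs://tmp/flumetest"

# List the sink directory if an HDFS client is available; record the outcome.
if ! command -v hdfs >/dev/null 2>&1; then
  check_status="skipped"   # no HDFS client on this host
elif hdfs dfs -ls "$flume_out"; then
  check_status="ok"        # listing succeeded; inspect the file names
else
  check_status="failed"    # path missing or cluster unreachable
fi
echo "HDFS check: $check_status"
```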