Setting Up Apache Flume Using the Command Line

Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of log data from many different sources to a centralized data store.

Installing the Flume RPM or Debian Packages

Flume is distributed as three RPM or Debian packages:

  • flume-ng — Everything you need to run Flume
  • flume-ng-agent — Handles starting and stopping the Flume agent as a service
  • flume-ng-doc — Flume documentation

All Flume installations require the common code provided by flume-ng.

To install Flume on Ubuntu and other Debian systems:

$ sudo apt-get install flume-ng

To install Flume on RHEL-compatible systems:

$ sudo yum install flume-ng

To install Flume on SLES systems:

$ sudo zypper install flume-ng

To have Flume start automatically when the system boots, install the flume-ng-agent package.

To install the Flume agent so Flume starts automatically on Ubuntu and other Debian systems:

$ sudo apt-get install flume-ng-agent

To install the Flume agent so Flume starts automatically on RHEL-compatible systems:

$ sudo yum install flume-ng-agent

To install the Flume agent so Flume starts automatically on SLES systems:

$ sudo zypper install flume-ng-agent

To install the documentation on Ubuntu and other Debian systems:

$ sudo apt-get install flume-ng-doc

To install the documentation on RHEL-compatible systems:

$ sudo yum install flume-ng-doc

To install the documentation on SLES systems:

$ sudo zypper install flume-ng-doc
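
The per-platform commands above follow one pattern: the package manager changes with the distribution family, and the package name stays the same. As a rough sketch only, the mapping could be captured in a small shell helper. The function name flume_install_cmd and the family labels debian, rhel, and sles are hypothetical and are not part of Flume or its packages. The helper only prints the command, it does not run it.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Flume): print the install command
# for a distribution family and one of the three Flume packages.
# Usage: flume_install_cmd <debian|rhel|sles> <package>
flume_install_cmd() {
  family="$1"
  package="$2"
  case "$family" in
    debian) echo "sudo apt-get install $package" ;;
    rhel)   echo "sudo yum install $package" ;;
    sles)   echo "sudo zypper install $package" ;;
    *)
      # Unknown family: report the problem and signal failure
      echo "unsupported distribution family: $family" >&2
      return 1
      ;;
  esac
}

# Example: print the command that installs the agent on a RHEL-compatible system
flume_install_cmd rhel flume-ng-agent
```

Printing rather than executing keeps the helper safe to try on any machine. Review the printed command before running it yourself.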

Verifying the Flume Installation

At this point, you should have everything necessary to run Flume, and the flume-ng command should be in your $PATH. You can test this by running:

$ flume-ng help

You should see something similar to this:

Usage: /usr/bin/flume-ng <command> [options]...

commands:
  help                  display this help text
  agent                 run a Flume agent
  avro-client           run an avro Flume client
  version               show Flume version info

global options:
  --conf,-c <conf>      use configs in <conf> directory
  --classpath,-C <cp>   append to the classpath
  --dryrun,-d           do not actually start Flume, just print the command
  --Dproperty=value     sets a JDK system property value

agent options:
  --conf-file,-f <file> specify a config file (required)
  --name,-n <name>      the name of this agent (required)
  --help,-h             display help text

avro-client options:
  --rpcProps,-P <file>  RPC client properties file with server connection params
  --host,-H <host>      hostname to which events will be sent (required)
  --port,-p <port>      port of the avro source (required)
  --dirname <dir>       directory to stream to avro source
  --filename,-F <file>  text file to stream to avro source [default: std input]
  --headerFile,-R <file> headerFile containing headers as key/value pairs on each new line
  --help,-h             display help text

  Either --rpcProps or both --host and --port must be specified.

Note that if <conf> directory is specified, then it is always included first
in the classpath.
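
The verification step can also be scripted. The sketch below first checks that flume-ng resolves on the PATH, then combines the version command with the agent options from the help text above. The configuration directory, configuration file, and agent name are assumed placeholder values, so substitute the ones used by your installation. According to the help text, --dryrun prints the command instead of starting Flume, so this should be a low-risk way to check the options.

```shell
#!/bin/sh
# Assumed placeholder values; adjust them to match your installation.
conf_dir="/etc/flume-ng/conf"        # assumed configuration directory
conf_file="$conf_dir/flume.conf"     # assumed agent configuration file
agent_name="agent"                   # assumed agent name

# Succeeds only if the named command can be resolved on the PATH.
on_path() {
  command -v "$1" >/dev/null 2>&1
}

if on_path flume-ng; then
  # Show the installed Flume version
  flume-ng version
  # Per the help text, --dryrun prints the command without starting Flume
  flume-ng agent --conf "$conf_dir" --conf-file "$conf_file" \
    --name "$agent_name" --dryrun
else
  echo "flume-ng not found on PATH" >&2
fi
```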