Administrators edit three configuration files to configure a Flume agent:
flume.conf
flume-env.sh
log4j.properties
flume.conf
Configure each Flume agent by defining properties in a configuration file at /etc/flume/conf/flume.conf. The init scripts installed by the flume-agent package read the contents of this file to start a single Flume agent on any host. At a minimum, the Flume configuration file must specify the required sources, channels, and sinks for your Flume topology. For example, the following sample Flume configuration file defines a Netcat source, a Memory channel, and a Logger sink:

    # example.conf: A single-node Flume configuration

    # Name the components on this agent
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1

    # Describe/configure the source
    a1.sources.r1.type = netcat
    a1.sources.r1.bind = localhost
    a1.sources.r1.port = 44444

    # Describe the sink
    a1.sinks.k1.type = logger

    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 1000
    a1.channels.c1.transactionCapacity = 100

    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

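As a rough sanity check before starting an agent, the sketch below verifies that a configuration file names at least one source, channel, and sink for a given agent. The function name check_flume_conf is hypothetical and not part of Flume; it only looks for the three top-level list properties and does not validate component types or channel bindings.

```shell
# check_flume_conf AGENT FILE
# Hypothetical helper (not part of Flume): succeeds only if FILE defines
# non-empty AGENT.sources, AGENT.channels, and AGENT.sinks properties.
check_flume_conf() {
    agent=$1
    file=$2
    status=0
    for kind in sources channels sinks; do
        # Match lines such as "a1.sources = r1", allowing spaces around "=".
        if ! grep -Eq "^[[:space:]]*${agent}\.${kind}[[:space:]]*=[[:space:]]*[^[:space:]]" "$file"; then
            echo "missing: ${agent}.${kind}" >&2
            status=1
        fi
    done
    return $status
}
```

For the sample above, `check_flume_conf a1 /etc/flume/conf/flume.conf` would be expected to succeed, assuming the sample has been saved to that path.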
See the Apache Flume 1.4.0 User Guide for a complete list of all available Flume components. To see which configuration properties you can adjust, consult the template for this file, which is installed in the configuration directory at /etc/flume/conf/flume.conf.properties.template. A second template file, /etc/flume/conf/flume-env.sh.template, exists for setting environment variables automatically at start-up.
Note: Make sure to specify a target folder in HDFS (the sink's hdfs.path property) if you use an HDFS sink.
flume-env.sh
Set environment options for a Flume agent in /etc/flume/conf/flume-env.sh:
To enable JMX monitoring, add the following options to the JAVA_OPTS variable:

    JAVA_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=4159 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"

To enable Ganglia monitoring, add the following options to the JAVA_OPTS variable:

    JAVA_OPTS="-Dflume.monitoring.type=ganglia -Dflume.monitoring.hosts=<ganglia-server>:8660"

where <ganglia-server> is the name of the Ganglia server host.

To customize the heap size, add the following options to the JAVA_OPTS variable:

    JAVA_OPTS="-Xms100m -Xmx4000m"

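Each assignment above sets JAVA_OPTS from scratch, so placing more than one of them in flume-env.sh as written would leave only the last one in effect. A minimal sketch of one way to combine the heap and JMX settings, assuming the file is read by a POSIX-compatible shell, appends to the variable instead:

```shell
# Sketch of a flume-env.sh fragment that combines several option groups.
# Start with the heap settings, then append the JMX options one by one.
JAVA_OPTS="-Xms100m -Xmx4000m"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.port=4159"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.ssl=false"
export JAVA_OPTS
```

The Ganglia options can be appended the same way. Note that there must be no space after the equals sign in a shell assignment.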
log4j.properties
Set the log directory for log4j in /etc/flume/conf/log4j.properties:

    flume.log.dir=/var/log/flume
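
To confirm which directory an agent will log to, the value can be read back from the properties file. The helper below is a hypothetical sketch (flume_log_dir is not a Flume command); it prints the last flume.log.dir value it finds and assumes a simple key=value line with no line continuations.

```shell
# flume_log_dir FILE
# Hypothetical helper: print the value of flume.log.dir from a log4j
# properties file. Commented-out lines (starting with "#") do not match.
flume_log_dir() {
    sed -n 's/^[[:space:]]*flume\.log\.dir[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}
```

For example, `ls -ld "$(flume_log_dir /etc/flume/conf/log4j.properties)"` shows whether the directory exists and who owns it.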