Configuring Flume
To configure a Flume agent, edit the following three configuration files:
flume.conf
flume-env.sh
log4j.properties
flume.conf
Configure each Flume agent by defining properties in a configuration file at /etc/flume/conf/flume.conf. The init scripts installed by the flume-agent package read the contents of this file when starting a Flume agent on any host. At a minimum, the Flume configuration file must specify the required sources, channels, and sinks for your Flume topology.
For example, the following sample Flume configuration file defines a NetCat Source, a Memory Channel and a Logger Sink. This configuration lets a user generate events and subsequently logs them to the console.
# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel that buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
This configuration defines a single agent named a1. a1 has a source that listens for data on port 44444, a channel that buffers event data in memory, and a sink that logs event data to the console. The configuration file names the various components, and describes their types and configuration parameters. A given configuration file might define several named agents.
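To illustrate how one file can hold several named agents, here is a minimal sketch that defines two agents side by side. The second agent name (a2) and its port (44445) are assumptions chosen for illustration; each agent reads only the properties that begin with its own name.

```properties
# Sketch: two independent agents in one flume.conf.
# The names a1/a2 and port 44445 are illustrative, not required values.

# Agent a1: NetCat source -> memory channel -> logger sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sources.r1.channels = c1
a1.channels.c1.type = memory
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

# Agent a2: same topology, listening on a different port
a2.sources = r1
a2.channels = c1
a2.sinks = k1
a2.sources.r1.type = netcat
a2.sources.r1.bind = localhost
a2.sources.r1.port = 44445
a2.sources.r1.channels = c1
a2.channels.c1.type = memory
a2.sinks.k1.type = logger
a2.sinks.k1.channel = c1
```

Component names such as r1, c1, and k1 can repeat across agents because each one is scoped by the agent name prefix.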
See the Apache Flume User Guide for a complete list of all available Flume components.
A template for this file, which shows the configuration properties you can adjust, is installed in the configuration directory at /etc/flume/conf/flume.conf.properties.template.
A second template file exists for setting environment variables automatically at startup: /etc/flume/conf/flume-env.sh.template.
Note: If you use an HDFS Sink, be sure to specify a target folder in HDFS.
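As a rough sketch of what specifying a target folder looks like, the fragment below swaps the Logger Sink from the sample configuration for an HDFS Sink. The path /flume/events is a hypothetical placeholder; substitute a directory that exists, or can be created, in your cluster.

```properties
# Sketch: HDFS Sink for agent a1, reusing channel c1 from the sample above.
# /flume/events is a placeholder target folder, not a required location.
a1.sinks = k1
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events
```

The HDFS Sink accepts further hdfs.* properties for file naming and rolling; see the Apache Flume User Guide for the full list.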
flume-env.sh
Set environment options for a Flume agent in /etc/flume/conf/flume-env.sh:
To enable JMX monitoring, add the following options to the JAVA_OPTS variable:
JAVA_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=4159 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
To customize the heap size, add the following options to the JAVA_OPTS variable:
JAVA_OPTS="-Xms100m -Xmx4000m"
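Each of the JAVA_OPTS examples above assigns the variable outright, so if both lines are placed in flume-env.sh as written, the later assignment replaces the earlier one. One way to keep both sets of options, sketched here with the same values as above, is to build the variable up incrementally. The export at the end is an assumption about how your startup scripts consume the variable.

```shell
# Sketch of a flume-env.sh fragment combining heap and JMX options.
# Values are copied from the examples above; adjust them for your environment.

# Heap size: initial 100 MB, maximum 4000 MB
JAVA_OPTS="-Xms100m -Xmx4000m"

# JMX monitoring on port 4159, without authentication or SSL
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.port=4159"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.ssl=false"

# Make the variable visible to child processes (assumed to be harmless here)
export JAVA_OPTS
```

Note that there must be no space on either side of the equals sign in a shell assignment.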
log4j.properties
Set the log directory for log4j in /etc/flume/conf/log4j.properties:
flume.log.dir=/var/log/flume
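For context, the sketch below shows how a file appender could consume flume.log.dir. The appender name LOGFILE and the flume.log.file and flume.root.logger properties are assumptions modeled on the log4j.properties that commonly ships with Flume; check your installed file before copying anything.

```properties
# Sketch: a log4j 1.x file appender that writes into flume.log.dir.
# LOGFILE, flume.log.file, and flume.root.logger are assumed names.
flume.root.logger=INFO,LOGFILE
flume.log.dir=/var/log/flume
flume.log.file=flume.log

log4j.rootLogger=${flume.root.logger}

log4j.appender.LOGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.LOGFILE.File=${flume.log.dir}/${flume.log.file}
log4j.appender.LOGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.LOGFILE.layout.ConversionPattern=%d{ISO8601} %-5p [%t] (%C.%M:%L) %m%n
```

With these values the agent would write to /var/log/flume/flume.log, provided the user running the agent can write to that directory.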