Configure, Start, and Validate Apache Flume
Before you can upgrade Apache Flume, you must have first upgraded your HDP components to the latest version (in this case, 2.4.2). This section assumes that you have already upgraded your components for HDP 2.4.2. If you have not already completed these steps, return to Getting Ready to Upgrade and Upgrade 2.1 Components for instructions on how to upgrade your HDP components to 2.4.2.
If you have not already done so, upgrade Flume. On the Flume host machine, run the following command:
For RHEL/CentOS/Oracle Linux:
yum upgrade flume
For SLES:
zypper update flume
zypper remove flume
zypper se -s flume
You should see Flume in the output.
Install Flume:
zypper install flume
For Ubuntu/Debian:
HDP support for Debian 6 is deprecated with HDP 2.4.2. Future versions of HDP will no longer be supported on Debian 6.
apt-get install flume
To confirm that Flume is working correctly, create an example configuration file. The start command later in this section expects the file to be named example.conf. The following snippet is a sample configuration in the Flume properties-file format. For more detailed information, see the “Flume User Guide.”
agent.sources = pstream
agent.channels = memoryChannel
agent.channels.memoryChannel.type = memory
agent.sources.pstream.channels = memoryChannel
agent.sources.pstream.type = exec
agent.sources.pstream.command = tail -f /etc/passwd
agent.sinks = hdfsSink
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = memoryChannel
agent.sinks.hdfsSink.hdfs.path = hdfs://tmp/flumetest
agent.sinks.hdfsSink.hdfs.fileType = SequenceFile
agent.sinks.hdfsSink.hdfs.writeFormat = Text
The source here is defined as an exec source. The agent runs a given command on startup, which streams data to stdout, where the source gets it. The channel is defined as an in-memory channel and the sink is an HDFS sink.
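Because a source and a sink only exchange data through a channel that the agent declares, a quick consistency check on the properties file can catch a misspelled channel name before the agent is started. The following is a minimal sketch, not a Flume tool: check_flume_channels is a hypothetical helper name, and it assumes a simple agent name without regular-expression metacharacters.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not part of Flume): report channels that a
# source or sink refers to but that are missing from "<agent>.channels".
# Prints nothing when the wiring is consistent.
check_flume_channels() {
    conf_file=$1
    agent=$2
    awk -v agent="$agent" '
        # Skip comment lines and lines without a key/value separator.
        /^[ \t]*#/ || !/=/ { next }
        {
            # Split on the first "=" and trim surrounding whitespace.
            eq = index($0, "=")
            key = substr($0, 1, eq - 1)
            val = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (key == agent ".channels") {
                # Declared channels: space-separated list.
                n = split(val, names, /[ \t]+/)
                for (i = 1; i <= n; i++) declared[names[i]] = 1
            } else if (key ~ ("^" agent "\\.sources\\.[^.]+\\.channels$")) {
                # A source may list several channels.
                n = split(val, names, /[ \t]+/)
                for (i = 1; i <= n; i++) used[names[i]] = key
            } else if (key ~ ("^" agent "\\.sinks\\.[^.]+\\.channel$")) {
                # A sink reads from exactly one channel.
                used[val] = key
            }
        }
        END {
            for (c in used)
                if (!(c in declared))
                    print "undeclared channel: " c " (" used[c] ")"
        }
    ' "$conf_file"
}
```

For the sample configuration, running check_flume_channels example.conf agent should print nothing, because both pstream and hdfsSink refer to memoryChannel.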
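Given this configuration, you can start Flume by navigating to FLUME_HOME and executing the following command. The value passed to --name must match the agent name used as the property prefix in the configuration file (agent in this example):
bin/flume-ng agent --conf ./conf --conf-file example.conf --name agent -Dflume.root.logger=INFO,console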
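The name given to --name selects which property prefix in the configuration file the agent loads; when the two differ, the agent typically starts without any sources, channels, or sinks. The sketch below reads the agent name out of the file instead of typing it by hand. list_flume_agents is a hypothetical helper, not something Flume provides, and the commented usage assumes the file defines exactly one agent.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print the distinct agent names found in a
# Flume properties file, based on its top-level declaration lines
# "<name>.sources = ...", "<name>.channels = ...", "<name>.sinks = ...".
list_flume_agents() {
    awk '
        /^[ \t]*[^#. \t=]+\.(sources|channels|sinks)[ \t]*=/ {
            line = $0
            sub(/^[ \t]+/, "", line)
            # The agent name is everything before the first dot.
            print substr(line, 1, index(line, ".") - 1)
        }
    ' "$1" | sort -u
}

# Example use, assuming example.conf defines a single agent:
#   bin/flume-ng agent --conf ./conf --conf-file example.conf \
#       --name "$(list_flume_agents example.conf)" \
#       -Dflume.root.logger=INFO,console
```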
Note: The directory specified by the --conf argument would include a shell script, flume-env.sh, and potentially a log4j properties file. In this example, we pass a Java option to force Flume to log to the console, and we go without a custom environment script.
After validating data in hdfs://tmp/flumetest, stop Flume and restore any backup files. Copy /etc/flume/conf to the conf directory on the Flume hosts.
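The validation step can be scripted by listing the sink's target directory and checking that at least one file has been written. This is a rough sketch under stated assumptions: flume_output_present and the HDFS_CMD override are hypothetical names introduced here, the hdfs client is assumed to be on the PATH, and the check relies on hdfs dfs -ls printing file entries with a permission string that starts with "-".

```shell
#!/bin/sh
# Sketch (hypothetical helper): succeed if the given HDFS directory contains
# at least one regular file. HDFS_CMD is an assumed override that lets you
# point at a different client binary; it defaults to "hdfs".
flume_output_present() {
    target=$1
    # File entries in "hdfs dfs -ls" output begin with "-" (for example
    # "-rw-r--r--"); directories begin with "d".
    ${HDFS_CMD:-hdfs} dfs -ls "$target" 2>/dev/null | grep -q '^-'
}

# Example use:
#   if flume_output_present hdfs://tmp/flumetest; then
#       echo "Flume wrote data; safe to stop the agent"
#   fi
```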