Apache Flume: Moving Tomcat Logs to HDFS

This section shows how to move Apache Tomcat logs into HDFS using Flume.

Step 1 – Change the directory to /usr/local/hadoop/sbin

$ cd /usr/local/hadoop/sbin

Step 2 – Start all Hadoop daemons.

$ ./start-all.sh

Step 3 – Run jps (the Java Virtual Machine Process Status Tool) to confirm that the Hadoop daemons are running. Note that jps only reports on JVMs for which it has access permissions.

$ jps
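Reading the jps output by eye is easy to get wrong, so here is a minimal sketch of a check. The function name check_hadoop_daemons is hypothetical, and the daemon list is an assumption for a single-node Hadoop 2.x setup started with start-all.sh; adjust it to your cluster.

```shell
# Check that the expected Hadoop daemons appear in jps output.
# Assumption: single-node Hadoop 2.x, so these five daemons are expected.
check_hadoop_daemons() {
    # Reads jps output ("<pid> <name>" per line) from stdin.
    jps_output=$(cat)
    missing=0
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare against the process-name column only, matching whole names.
        if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx "$daemon"; then
            echo "missing: $daemon"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: jps | check_hadoop_daemons && echo "all daemons running"
```

If a daemon is reported as missing, look at its log under the Hadoop logs directory before continuing.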

Step 4 – Create a /flumedata folder in HDFS.

$ hdfs dfs -mkdir hdfs://localhost:9000/flumedata
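Running the mkdir command a second time fails because the folder already exists. A small sketch that makes the step repeatable is shown below; the function name ensure_hdfs_dir is hypothetical, and it relies only on the standard hdfs dfs -test -d and -mkdir -p options.

```shell
# Create an HDFS directory only if it does not exist yet.
ensure_hdfs_dir() {
    dir="$1"
    if hdfs dfs -test -d "$dir"; then
        echo "exists: $dir"
    else
        # -p also creates missing parent directories.
        hdfs dfs -mkdir -p "$dir" && echo "created: $dir"
    fi
}

# Usage: ensure_hdfs_dir hdfs://localhost:9000/flumedata
```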

Step 5 – Change the directory to /usr/local/tomcat/bin

$ cd /usr/local/tomcat/bin

Step 6 – Start the Tomcat web server.

$ ./startup.sh

Step 7 – Verify that Tomcat is running by opening your Tomcat server's URL in a browser.

Step 8 – Change the directory to /usr/local/flume

$ cd /usr/local/flume

Step 9 – Configuration File

Given below is an example configuration file. Copy this content and save it as flume.conf. In my case, flume.conf is in the /usr/local/flume/conf/ folder.

Don't forget to change this line to match your Tomcat log file name:

agent.sources.tail-source.command = tail -F /usr/local/tomcat/logs/access_log.2015-12-26.txt
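Because the access log file name in the example above contains a date, a hard-coded name goes stale quickly. As a rough sketch, the path can be built from a date instead; the function name tomcat_access_log is hypothetical, and the directory and access_log.<date>.txt pattern are simply copied from the example, so check them against your own Tomcat logs folder.

```shell
# Build the path of a Tomcat access log for a given date (default: today).
# Assumption: logs live in /usr/local/tomcat/logs and are named
# access_log.<YYYY-MM-DD>.txt, as in the example line above.
tomcat_access_log() {
    day="${1:-$(date +%Y-%m-%d)}"
    echo "/usr/local/tomcat/logs/access_log.${day}.txt"
}

# Usage: ls -l "$(tomcat_access_log)"
```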


# Name the components of this agent
agent.sources = tail-source
agent.channels = memoryChannel
agent.sinks = hdfs-sink

# Source: follow the Tomcat access log with tail -F
agent.sources.tail-source.type = exec
agent.sources.tail-source.command = tail -F /usr/local/tomcat/logs/access_log.2015-12-26.txt
#agent.sources.tail-source.batchSize = 10
agent.sources.tail-source.channels = memoryChannel

# Channel: buffer events in memory between source and sink
agent.channels.memoryChannel.type = memory
#agent.channels.memoryChannel.capacity = 100000
#agent.channels.memoryChannel.transactionCapacity = 10000

# Sink: write events to HDFS as plain text
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.channel = memoryChannel
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:9000/flumedata/
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.writeFormat = Text

#agent.sinks.hdfs-sink.hdfs.batchSize = 10
#agent.sinks.hdfs-sink.hdfs.rollSize = 0
#agent.sinks.hdfs-sink.hdfs.rollCount = 10
#agent.sinks.hdfs-sink.hdfs.rollInterval = 30
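A common mistake in a file like the one above is a source or sink that points at a channel name the agent never declares. The following is a minimal sketch of a check for that; the function name check_flume_channels is hypothetical, and it assumes the agent is named "agent" as in this configuration. It is not a full Flume config validator.

```shell
# Report sources/sinks bound to a channel that agent.channels does not declare.
# Assumption: the agent is named "agent", as in the configuration above.
check_flume_channels() {
    conf="$1"
    awk -F'=' '
        /^[ \t]*#/ { next }                       # skip comment lines
        {
            gsub(/[ \t]/, "", $1)                 # normalise the key
            gsub(/^[ \t]+|[ \t]+$/, "", $2)       # trim the value
        }
        $1 == "agent.channels" {
            n = split($2, names, /[ \t]+/)
            for (i = 1; i <= n; i++) declared[names[i]] = 1
            next
        }
        $1 ~ /^agent\.sources\..*\.channels$/ || $1 ~ /^agent\.sinks\..*\.channel$/ {
            used[$1] = $2
        }
        END {
            bad = 0
            for (key in used) {
                n = split(used[key], refs, /[ \t]+/)
                for (i = 1; i <= n; i++) {
                    if (!(refs[i] in declared)) {
                        print "undeclared channel in " key ": " refs[i]
                        bad = 1
                    }
                }
            }
            exit bad
        }
    ' "$conf"
}

# Usage: check_flume_channels conf/flume.conf && echo "channel wiring looks consistent"
```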

Step 10 – Execution

$ bin/flume-ng agent -c ./conf -f conf/flume.conf --name agent -Dflume.root.logger=INFO,console
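Once the agent is running and Tomcat has served a few requests, you can check from a second terminal that data is arriving in HDFS. The sketch below wraps the standard hdfs dfs -ls and -cat commands; the function name show_flume_output is hypothetical, and the default path is the hdfs.path value from the configuration.

```shell
# List the files Flume has written and print the first lines of their
# combined content.
show_flume_output() {
    dir="${1:-hdfs://localhost:9000/flumedata}"
    hdfs dfs -ls "$dir" || return 1
    # The glob is quoted so that HDFS expands it, not the local shell.
    hdfs dfs -cat "$dir/*" | head -n "${2:-10}"
}

# Usage: show_flume_output hdfs://localhost:9000/flumedata 20
```

If the listing stays empty, check the agent's console output for errors from the HDFS sink.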

