2015-04-05 77 views
0

I am using the following configuration to push data from a log file to HDFS, but no data ever makes it from Flume into HDFS (nothing shows up in the Hadoop logs either).

agent.channels.memory-channel.type = memory 
agent.channels.memory-channel.capacity=5000 
agent.sources.tail-source.type = exec 
agent.sources.tail-source.command = tail -F /home/training/Downloads/log.txt 
agent.sources.tail-source.channels = memory-channel 
agent.sinks.log-sink.channel = memory-channel 
agent.sinks.log-sink.type = logger 
agent.sinks.hdfs-sink.channel = memory-channel 
agent.sinks.hdfs-sink.type = hdfs 
agent.sinks.hdfs-sink.batchSize=10 
agent.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/user/flume/data/log.txt 
agent.sinks.hdfs-sink.hdfs.fileType = DataStream 
agent.sinks.hdfs-sink.hdfs.writeFormat = Text 
agent.channels = memory-channel 
agent.sources = tail-source 
agent.sinks = log-sink hdfs-sink 

I don't get any error messages, but I still can't find any output in HDFS. When I interrupt the agent I can see an InterruptedException from the sink, along with some of the log file's data. I am running the following command:

flume-ng agent --conf /etc/flume-ng/conf/ --conf-file /etc/flume-ng/conf/flume.conf -Dflume.root.logger=DEBUG,console -n agent
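One common cause of a silently idle agent is a typo in the source/sink-to-channel wiring (easy to introduce when, as above, the `agent.channels`/`agent.sources`/`agent.sinks` lines are pasted more than once). A minimal Python sketch of a binding check (a hypothetical helper, not part of Flume) that flags any source or sink referencing a channel the agent never declares:

```python
# Sanity-check a Flume properties file: every source/sink must reference
# a channel declared in <agent>.channels.
# (Hypothetical helper for illustration; not part of Flume itself.)

def check_flume_bindings(conf_text, agent="agent"):
    props = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()

    declared = set(props.get(agent + ".channels", "").split())
    problems = []
    for key, value in props.items():
        parts = key.split(".")
        # Match agent.sources.<name>.channels and agent.sinks.<name>.channel
        if len(parts) == 4 and parts[0] == agent and parts[3] in ("channel", "channels"):
            for ch in value.split():
                if ch not in declared:
                    problems.append("%s references undeclared channel %r" % (key, ch))
    return problems
```

Running it over the conf file before starting the agent catches misspelled channel names that Flume itself may only surface as a sink that never receives events.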

Answers

1

I had a similar problem.

In my case it works now. Below is the conf file:

#Exec Source 
execAgent.sources=e 
execAgent.channels=memchannel 
execAgent.sinks=HDFS 
#channels 
execAgent.channels.memchannel.type=file 
execAgent.channels.memchannel.capacity = 20000 
execAgent.channels.memchannel.transactionCapacity = 1000 
#Define Source 
execAgent.sources.e.type=org.apache.flume.source.ExecSource 
execAgent.sources.e.channels=memchannel 
execAgent.sources.e.shell=/bin/bash -c 
execAgent.sources.e.fileHeader=false 
execAgent.sources.e.fileSuffix=.txt 
execAgent.sources.e.command=cat /home/sample.txt 
#Define Sink 
execAgent.sinks.HDFS.type=hdfs 
execAgent.sinks.HDFS.hdfs.path=hdfs://localhost:8020/user/flume/ 
execAgent.sinks.HDFS.hdfs.fileType=DataStream 
execAgent.sinks.HDFS.hdfs.writeFormat=Text 
execAgent.sinks.HDFS.hdfs.batchSize=1000 
execAgent.sinks.HDFS.hdfs.rollSize=268435 
execAgent.sinks.HDFS.hdfs.rollInterval=0 
#Bind Source Sink Channel 
execAgent.sources.e.channels=memchannel 
execAgent.sinks.HDFS.channel=memchannel 
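One difference from the question's setup is that this conf uses a file channel rather than a memory channel. A file channel persists events to disk; if its directories are not configured, Flume falls back to defaults under `~/.flume/file-channel/`. A hedged fragment with illustrative paths (the directory values are assumptions, not from the answer):

```
# Optional: explicit file-channel directories (illustrative paths; if
# omitted, Flume defaults to locations under ~/.flume/file-channel/)
execAgent.channels.memchannel.checkpointDir = /var/flume/checkpoint
execAgent.channels.memchannel.dataDirs = /var/flume/data
```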

I hope this helps you.

0

When placing files into HDFS, I suggest using the prefix configuration:

agent.sinks.hdfs-sink.hdfs.filePrefix = log.out

0

@bhavesh: Are you sure the log file (agent.sources.tail-source.command = tail -F /home/training/Downloads/log.txt) keeps appending data? Since you are using the tail command with -F, only data that changes within the file will be dumped into HDFS.

+0

You didn't understand my question in the first place.. and by now the story is too old. – 2015-09-09 07:37:39