2014-10-06 50 views

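BitTorrent Sync log parsing using Python

I want to parse a debug log created by BitTorrent Sync, using Python. The sample log I'm working with is fairly large (~10 MB), so if anyone wants to take a look, it is here: http://www.datafilehost.com/d/01a0ae7c (7z-compressed; don't use their downloader).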

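I want to create a CSV file containing the file name, start time & end time (of receiving), number of chunks, average time & transfer speed, etc.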

I'm getting my ideas from the Python documentation... as a newbie, it will take some time.

Example from the log:

[20141006 12:10:42.290] SyncFilesController: Got file from remote (192.168.3.13:41740): AUD_30_3822029472_1442025768_20140923053708.out state: 1 type: file total:801 have:801 t:1412577403 mt:1412577403 7842B73B340FD81AB3B426CBB0822FE68FF156B7 
[20141006 12:10:43.684] /home/de/Desktop/Sync_test/AUD_16_1122404893_7156305832_20131013215115.out: Piece 3 complete 
[20141006 12:11:03.951] Finished downloading file AUD_16_1122404893_7156305832_20131013215115.out, writing file attributes mt:1412577397 

My first approach:

log = open("sync_de.log", 'r')
fn = open("fn.log", 'w')
st = open("st.log", 'w')
et = open("et.log", 'w')

for eachline in log:
    if 'Got file from remote' in eachline:
        fn.write(str(eachline[88:135]) + '\n')

    elif 'event = "IN_CREATE"' in eachline:
        st.write(str(eachline[43:90] + ': ' + eachline[10:22]) + '\n')

    elif 'Finished downloading file' in eachline:
        et.write(str(eachline[50:97] + ': ' + eachline[10:22]) + '\n')

How can I merge this data without storing it in separate files? Any help is appreciated.

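Answers

0

Tito

Lots of options. One of them:

Create a list, parse the lines (the way you are doing it is fine), and append the information for each file to that list as a list of its own. Then output everything with the csv module.

Example: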

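import csv

log = open("sync_de.log", 'r')
out = open("fn.csv", 'w')
writer = csv.writer(out)  # not named `csv`, which would shadow the module

# to store the rows
out_list = []

for eachline in log:
    # your parsing code goes here; for example, using the slices from the question:
    if 'Finished downloading file' in eachline:
        filename = eachline[50:97]
        end = eachline[10:22]
        out_list.append([filename, end])

# write the csv
writer.writerows(out_list)
out.close()
log.close()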

For more options, check the csv module: https://docs.python.org/2/library/csv.html


Oh, 10 MB is a piece of cake for Python. You can easily load 4 GB or 10 GB files, depending on your memory. – 2014-10-06 08:53:25


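Thanks! It looks promising, but the problem is that the 'if' condition is dynamic: if 'Got file from remote' in eachline is true, then eachline[88:135] gives the file name, whereas if 'event = "IN_CREATE"' in eachline is true, then eachline[10:22] gives the start time, and those two 'eachline's may be different lines. – Tito 2014-10-06 09:29:09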


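'in' will find it in the string no matter where it is. You could try regular expressions. They are the most powerful tool for this. It will blow your mind... – 2014-10-06 09:53:14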
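As a rough illustration of the regular-expression suggestion, the sketch below matches the sample lines from the question by pattern rather than by fixed column positions. The patterns, the parse_line helper and the assumption that file names contain no spaces are mine, inferred only from the three sample lines; they are not a specification of BitTorrent Sync's log format.

```python
import re

# Assumed layout, inferred only from the sample lines in the question.
TIMESTAMP = r"\[\d{8} (?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\] "
GOT_FILE = re.compile(
    TIMESTAMP + r"SyncFilesController: Got file from remote \([^)]*\): "
    r"(?P<name>\S+) state: \d+ type: \w+ total:(?P<total>\d+)"
)
FINISHED = re.compile(TIMESTAMP + r"Finished downloading file (?P<name>\S+),")


def parse_line(line):
    """Return (kind, fields) for a recognised line, or None otherwise."""
    m = GOT_FILE.match(line)
    if m:
        return "got", {"name": m.group("name"), "chunks": int(m.group("total"))}
    m = FINISHED.match(line)
    if m:
        return "finished", {"name": m.group("name"), "end": m.group("time")}
    return None
```

Because the file name is captured by pattern, the result no longer depends on how long the remote address or the path happens to be, which is where fixed slices such as eachline[88:135] would break.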

0

Let's take another approach. A snippet of fn_c.log (file names & no. of chunks):

AUD_16_1122404893_7156305832_20131013215115.out: 801 
AUD_30_3822029472_1442025768_20140923053708.out: 801 
AUD_59_3579916998_7069213690_20130110135656.out: 801 

st.log (start times, after sorting):

AUD_16_1122404893_7156305832_20131013215115.out: 12:10:43.117 
AUD_30_3822029472_1442025768_20140923053708.out: 12:11:03.951 
AUD_59_3579916998_7069213690_20130110135656.out: 12:12:10.933 

et.log (end times, after sorting):

AUD_16_1122404893_7156305832_20131013215115.out: 12:11:03.951 
AUD_30_3822029472_1442025768_20140923053708.out: 12:12:10.933 
AUD_59_3579916998_7069213690_20130110135656.out: 12:12:34.120 

And so on. Now how can I combine them like this:

1st column: File name 
2nd column: No of chunks 
3rd column: Start time 

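And so on.

Basically, I intend to read only the file names stored in one file and then pick up the corresponding values from the other files, like: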

file = open("fn.log", 'r')
log = open("sync_de.log", 'r')
cr = open("cr.log", 'w')
line = file.readline()

for u in line.split():
    for eachline in log:
        if 'event = "IN_CREATE"' in eachline and u in eachline:
            cr.write(str(u + ',' + eachline[10:22]) + '\n')

But only the first one gets written:

AUD_16_1122404893_7156305832_20131013215115.out,12:10:43.117 
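
Two things in the snippet above would each produce that result: file.readline() returns only the first line of fn.log, and a file object can be iterated only once, so log is exhausted after the first pass of the inner loop. A minimal sketch of one way around both, assuming the names and the log lines are first read into lists; find_create_times is a hypothetical helper name, not something from the original post.

```python
def find_create_times(names, log_lines):
    """Pair each file name with the timestamp of its IN_CREATE line."""
    rows = []
    for name in names:
        for line in log_lines:  # a list can be scanned repeatedly, unlike a file object
            if 'event = "IN_CREATE"' in line and name in line:
                # the timestamp sits at [10:22], as in the question's slices
                rows.append(name + ',' + line[10:22])
    return rows
```

It could be called with names = [l.strip() for l in open("fn.log")] and log_lines = open("sync_de.log").readlines(). Calling log.seek(0) before each inner loop would also rewind the file, at the cost of re-reading it once per name.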

Use dictionaries with the file name as the key: begin = {}; end = {}; begin['filename'] = time; end['filename'] = end_time; then for key in begin.keys(): write(key, begin[key], end[key])  # for all files as keys – 2014-10-06 13:52:53
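
The dictionary idea from the comment above could look roughly like this. merge_times is a hypothetical helper, and the two sample values are taken from the st.log and et.log snippets in the question; a file with no recorded end time gets an empty string, which is my own choice.

```python
def merge_times(begin, end):
    """Join two {filename: time} dicts into rows of (filename, start, end)."""
    rows = []
    for key in sorted(begin):  # every file that has a start time
        rows.append((key, begin[key], end.get(key, '')))
    return rows


begin = {}
end = {}
begin['AUD_16_1122404893_7156305832_20131013215115.out'] = '12:10:43.117'
end['AUD_16_1122404893_7156305832_20131013215115.out'] = '12:11:03.951'
rows = merge_times(begin, end)
```

Since everything is keyed by file name, the start and end times can be filled in from different log lines during a single pass, and the rows can then be handed to csv.writer(...).writerows(rows).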

0

OK, I've managed to get it done in a very simple way.

import os, math
from datetime import datetime

thor = raw_input("Enter the name of the BitTorrent Sync debug log file: ")
odin = raw_input("Enter the name of the folder where synced files are stored: ")

report = open("report.log", 'w')
sync_log = open(thor, 'r')
fn = open("fn.log", 'w')
pre_st = open("pre_st.log", 'w')
pre_et = open("pre_et.log", 'w')

report.write('Name of the files & corresponding no of chunks' + '\n') 
report.write('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~' + '\n') 

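i = 0  # number of files seen in the log
for eachline in sync_log:
    if 'Got file from remote' in eachline:
        i = i + 1
        # fixed-width slices: file name at [88:135], total chunk count at [162:165]
        fn.write(str(eachline[88:135]) + ' ' + str(eachline[162:165]) + '\n')
        report.write(str(eachline[88:135]) + ' ' + str(eachline[162:165]) + '\n')
    elif 'event = "IN_CREATE"' in eachline:
        # file created locally: the timestamp at [10:22] is used as the start time
        pre_st.write(str(eachline[43:90] + ' ' + eachline[10:22]) + '\n')
    elif 'Finished downloading file' in eachline:
        # download finished: the timestamp at [10:22] is used as the end time
        pre_et.write(str(eachline[50:97] + ' ' + eachline[10:22]) + '\n')
report.write('\n')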

report.write('Total no of files' + '\n') 
report.write('~~~~~~~~~~~~~~~~~' + '\n') 
report.write(str(i) + '\n' + '\n') 

report.write('Size of the files (in bytes)' + '\n') 
report.write('~~~~~~~~~~~~~~~~~~~~~~~~~~~~' + '\n') 
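sync_dir = odin  # renamed from `dir`, which shadows the built-in
fs = open("fs.log", 'w')
for f in os.listdir(sync_dir):
    path = os.path.join(sync_dir, f)
    # only regular files whose names are 47 characters long
    if os.path.isfile(path) and len(f) == 47:
        fs.write(str(os.path.getsize(path)) + '\n')
        report.write(f + ' ' + str(os.path.getsize(path)) + '\n')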

report.write('\n') 
report.write('Size of the chunks of the files (in bytes)' + '\n') 
report.write('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~' + '\n') 

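fs.close()  # flush fs.log and fn.log to disk before reading them back
fn.close()
fs = open("fs.log", 'r')
fn = open("fn.log", 'r')
for eachline1, eachline2 in zip(fs, fn):
    # fn.log lines are "<47-char name> <chunk count>"
    chunk_size = float(eachline1) / float(eachline2[48:51])
    report.write(eachline2[:47] + ' ' + "%.2f" % chunk_size + '\n')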

report.write('\n') 
report.write('Sync starting time' + '\n') 
report.write('~~~~~~~~~~~~~~~~~~' + '\n') 

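pre_st.close()  # flush before reading back
pre_st = open("pre_st.log", 'r')
st = open("st.log", 'w')
lineList = pre_st.readlines()
lineList.sort()  # lines start with the file name, so this sorts by file name
for line in lineList:
    st.write(line)
    report.write(line)
report.write('\n')
pre_st.close()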

report.write('Sync ending time' + '\n') 
report.write('~~~~~~~~~~~~~~~~' + '\n') 

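pre_et.close()  # flush before reading back
pre_et = open("pre_et.log", 'r')
et = open("et.log", 'w')
lineList = pre_et.readlines()
lineList.sort()  # lines start with the file name, so this sorts by file name
for line in lineList:
    et.write(line)
    report.write(line)
report.write('\n')
pre_et.close()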

report.write('Time required for syncing (in seconds)' + '\n') 
report.write('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~' + '\n') 

st.close()  # flush st.log and et.log before reading them back
et.close()
st = open("st.log", 'r')
et = open("et.log", 'r')
tt = open("tt.log", 'w')
for eachline1, eachline2 in zip(st, et):
    if eachline1[:47] == eachline2[:47]:
        s_t = datetime.strptime(eachline1[48:60], "%H:%M:%S.%f")
        e_t = datetime.strptime(eachline2[48:60], "%H:%M:%S.%f")
        diff = e_t - s_t
        # .seconds keeps whole seconds only; the fractional part is dropped
        tt.write(str(diff.seconds) + '\n')
        report.write(eachline1[:47] + ' ' + str(diff.seconds) + '\n')

report.write('\n') 
report.write('Average syncing speed (in KBPS)' + '\n') 
report.write('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~' + '\n') 

fs.close()
tt.close()  # flush tt.log before reading it back
fs = open("fs.log", 'r')
tt = open("tt.log", 'r')
ts = open("ts.log", 'w')
for eachline1, eachline2 in zip(fs, tt):
    # bytes / seconds / 1024 = kilobytes per second
    kbps = float(eachline1) / float(eachline2) / 1024
    ts.write("%.2f" % kbps + '\n')
    report.write("%.2f" % kbps + '\n')

report.write('\n') 
report.write('Some statistics regarding syncing speed' + '\n') 
report.write('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~' + '\n') 

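ts.close()  # flush ts.log before reading it back
ts = open("ts.log", 'r')
total = 0  # renamed from `sum`, which shadows the built-in
sumsq = 0
for eachline in ts:
    total = total + float(eachline)
    sumsq = sumsq + float(eachline) * float(eachline)

mean = total / i
# population standard deviation: sqrt(mean of squares - square of mean)
sd = math.sqrt((sumsq / i) - mean * mean)
report.write("Mean: %.3f" % mean + '\n')
report.write("SD: %.3f" % sd)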

report.close()  # make sure the report is fully written before telling the user
sync_log.close()
fs.close()
fn.close()
st.close()
et.close()
tt.close()
ts.close()
os.remove("pre_st.log")
os.remove("pre_et.log")
os.remove("fs.log")
os.remove("fn.log")
os.remove("st.log")
os.remove("et.log")
os.remove("tt.log")
os.remove("ts.log")

raw_input("\nThe log is analysed & the report is stored in report.log.")
raw_input("\nPress the enter key to exit.")

The output is:

Name of the files & corresponding no of chunks 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
AUD_16_1122404893_7156305832_20131013215115.out 801 
AUD_30_3822029472_1442025768_20140923053708.out 801 
AUD_59_3579916998_7069213690_20130110135656.out 801 
AUD_61_3949868720_1329085991_20140304201127.out 801 
AUD_69_4708795896_5381639942_20151207235120.out 801 
AUD_84_7923993614_7033456750_20140701194619.out 801 
AUD_91_1017552763_1580925238_20150504230140.out 801 
IMG_08_1835175262_0348482033_20151030013511.out 801 
IMG_58_1221213139_0317767252_20140224102055.out 801 
IMG_85_5493890382_1220034673_20150124103303.out 801 
TXT_39_6233284218_9061455668_20140311141700.out 801 
TXT_44_0723897111_3504488735_20130221025651.out 801 
TXT_44_6371464119_5534251140_20130627035614.out 801 
TXT_46_4298501313_8016306909_20130904123345.out 801 
TXT_58_8366996486_3506161308_20131016091717.out 801 
TXT_62_4001716298_7792106100_20130219205926.out 801 
VID_05_6236865421_4023459915_20141130065941.out 801 
VID_27_7325125621_7609682883_20151006221708.out 801 
VID_33_6052974216_0442981490_20130712161750.out 801 
VID_53_4632285308_2718774440_20140227033833.out 801 

Total no of files 
~~~~~~~~~~~~~~~~~ 
20 

Size of the files (in bytes) 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
AUD_16_1122404893_7156305832_20131013215115.out 52435165 
AUD_30_3822029472_1442025768_20140923053708.out 52435165 
AUD_59_3579916998_7069213690_20130110135656.out 52435165 
AUD_61_3949868720_1329085991_20140304201127.out 52435165 
AUD_69_4708795896_5381639942_20151207235120.out 52435165 
AUD_84_7923993614_7033456750_20140701194619.out 52435165 
AUD_91_1017552763_1580925238_20150504230140.out 52435165 
IMG_08_1835175262_0348482033_20151030013511.out 52435165 
IMG_58_1221213139_0317767252_20140224102055.out 52435165 
IMG_85_5493890382_1220034673_20150124103303.out 52435165 
TXT_39_6233284218_9061455668_20140311141700.out 52435165 
TXT_44_0723897111_3504488735_20130221025651.out 52435165 
TXT_44_6371464119_5534251140_20130627035614.out 52435165 
TXT_46_4298501313_8016306909_20130904123345.out 52435165 
TXT_58_8366996486_3506161308_20131016091717.out 52435165 
TXT_62_4001716298_7792106100_20130219205926.out 52435165 
VID_05_6236865421_4023459915_20141130065941.out 52435165 
VID_27_7325125621_7609682883_20151006221708.out 52435165 
VID_33_6052974216_0442981490_20130712161750.out 52435165 
VID_53_4632285308_2718774440_20140227033833.out 52435165 

Size of the chunks of the files (in bytes) 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
AUD_16_1122404893_7156305832_20131013215115.out 65462.13 
AUD_30_3822029472_1442025768_20140923053708.out 65462.13 
AUD_59_3579916998_7069213690_20130110135656.out 65462.13 
AUD_61_3949868720_1329085991_20140304201127.out 65462.13 
AUD_69_4708795896_5381639942_20151207235120.out 65462.13 
AUD_84_7923993614_7033456750_20140701194619.out 65462.13 
AUD_91_1017552763_1580925238_20150504230140.out 65462.13 
IMG_08_1835175262_0348482033_20151030013511.out 65462.13 
IMG_58_1221213139_0317767252_20140224102055.out 65462.13 
IMG_85_5493890382_1220034673_20150124103303.out 65462.13 
TXT_39_6233284218_9061455668_20140311141700.out 65462.13 
TXT_44_0723897111_3504488735_20130221025651.out 65462.13 
TXT_44_6371464119_5534251140_20130627035614.out 65462.13 
TXT_46_4298501313_8016306909_20130904123345.out 65462.13 
TXT_58_8366996486_3506161308_20131016091717.out 65462.13 
TXT_62_4001716298_7792106100_20130219205926.out 65462.13 
VID_05_6236865421_4023459915_20141130065941.out 65462.13 
VID_27_7325125621_7609682883_20151006221708.out 65462.13 
VID_33_6052974216_0442981490_20130712161750.out 65462.13 
VID_53_4632285308_2718774440_20140227033833.out 65462.13 

Sync starting time 
~~~~~~~~~~~~~~~~~~ 
AUD_16_1122404893_7156305832_20131013215115.out 12:10:43.117 
AUD_30_3822029472_1442025768_20140923053708.out 12:11:03.951 
AUD_59_3579916998_7069213690_20130110135656.out 12:12:10.933 
AUD_61_3949868720_1329085991_20140304201127.out 12:12:34.120 
AUD_69_4708795896_5381639942_20151207235120.out 12:12:56.050 
AUD_84_7923993614_7033456750_20140701194619.out 12:13:20.291 
AUD_91_1017552763_1580925238_20150504230140.out 12:13:41.095 
IMG_08_1835175262_0348482033_20151030013511.out 12:14:12.536 
IMG_58_1221213139_0317767252_20140224102055.out 12:14:22.594 
IMG_85_5493890382_1220034673_20150124103303.out 12:14:22.619 
TXT_39_6233284218_9061455668_20140311141700.out 12:14:22.644 
TXT_44_0723897111_3504488735_20130221025651.out 12:14:22.670 
TXT_44_6371464119_5534251140_20130627035614.out 12:14:22.695 
TXT_46_4298501313_8016306909_20130904123345.out 12:14:22.720 
TXT_58_8366996486_3506161308_20131016091717.out 12:14:22.745 
TXT_62_4001716298_7792106100_20130219205926.out 12:14:22.770 
VID_05_6236865421_4023459915_20141130065941.out 12:14:22.795 
VID_27_7325125621_7609682883_20151006221708.out 12:18:16.565 
VID_33_6052974216_0442981490_20130712161750.out 12:18:36.599 
VID_53_4632285308_2718774440_20140227033833.out 12:18:56.971 

Sync ending time 
~~~~~~~~~~~~~~~~ 
AUD_16_1122404893_7156305832_20131013215115.out 12:11:03.951 
AUD_30_3822029472_1442025768_20140923053708.out 12:12:10.933 
AUD_59_3579916998_7069213690_20130110135656.out 12:12:34.120 
AUD_61_3949868720_1329085991_20140304201127.out 12:12:56.050 
AUD_69_4708795896_5381639942_20151207235120.out 12:13:20.291 
AUD_84_7923993614_7033456750_20140701194619.out 12:13:41.095 
AUD_91_1017552763_1580925238_20150504230140.out 12:15:45.028 
IMG_08_1835175262_0348482033_20151030013511.out 12:14:46.928 
IMG_58_1221213139_0317767252_20140224102055.out 12:15:08.705 
IMG_85_5493890382_1220034673_20150124103303.out 12:15:30.388 
TXT_39_6233284218_9061455668_20140311141700.out 12:16:06.094 
TXT_44_0723897111_3504488735_20130221025651.out 12:16:27.998 
TXT_44_6371464119_5534251140_20130627035614.out 12:16:48.881 
TXT_46_4298501313_8016306909_20130904123345.out 12:17:09.655 
TXT_58_8366996486_3506161308_20131016091717.out 12:17:34.825 
TXT_62_4001716298_7792106100_20130219205926.out 12:17:56.456 
VID_05_6236865421_4023459915_20141130065941.out 12:18:16.565 
VID_27_7325125621_7609682883_20151006221708.out 12:18:36.599 
VID_33_6052974216_0442981490_20130712161750.out 12:18:56.972 
VID_53_4632285308_2718774440_20140227033833.out 12:19:19.166 

Time required for syncing (in seconds) 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
AUD_16_1122404893_7156305832_20131013215115.out 20 
AUD_30_3822029472_1442025768_20140923053708.out 66 
AUD_59_3579916998_7069213690_20130110135656.out 23 
AUD_61_3949868720_1329085991_20140304201127.out 21 
AUD_69_4708795896_5381639942_20151207235120.out 24 
AUD_84_7923993614_7033456750_20140701194619.out 20 
AUD_91_1017552763_1580925238_20150504230140.out 123 
IMG_08_1835175262_0348482033_20151030013511.out 34 
IMG_58_1221213139_0317767252_20140224102055.out 46 
IMG_85_5493890382_1220034673_20150124103303.out 67 
TXT_39_6233284218_9061455668_20140311141700.out 103 
TXT_44_0723897111_3504488735_20130221025651.out 125 
TXT_44_6371464119_5534251140_20130627035614.out 146 
TXT_46_4298501313_8016306909_20130904123345.out 166 
TXT_58_8366996486_3506161308_20131016091717.out 192 
TXT_62_4001716298_7792106100_20130219205926.out 213 
VID_05_6236865421_4023459915_20141130065941.out 233 
VID_27_7325125621_7609682883_20151006221708.out 20 
VID_33_6052974216_0442981490_20130712161750.out 20 
VID_53_4632285308_2718774440_20140227033833.out 22 

Average syncing speed (in KBPS) 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
2560.31 
775.85 
2226.36 
2438.39 
2133.59 
2560.31 
416.31 
1506.07 
1113.18 
764.27 
497.15 
409.65 
350.73 
308.47 
266.70 
240.40 
219.77 
2560.31 
2560.31 
2327.56 

Some statistics regarding syncing speed 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
Mean: 1311.785 
SD: 957.979 

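I know it is nowhere near an ideal approach, but I can manage with this for now. I am looking forward to improving it, though. Thanks for the help!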
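
As one possible direction for that improvement, the timing and statistics steps could be done in memory instead of through temporary files. This is only a sketch under the same assumptions as the script above (HH:MM:SS.mmm timestamps on the same day, 1 KB = 1024 bytes, population standard deviation); the function names are hypothetical.

```python
import math
from datetime import datetime


def seconds_between(start, end):
    """Elapsed seconds between two HH:MM:SS.mmm strings on the same day."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()  # keeps the fractional part, unlike .seconds


def kbps(size_bytes, seconds):
    """Average speed in kilobytes per second (1 KB = 1024 bytes)."""
    return size_bytes / float(seconds) / 1024


def mean_and_sd(values):
    """Mean and population standard deviation of a non-empty list."""
    n = len(values)
    mean = sum(values) / float(n)
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)
```

For the first file in the report, seconds_between("12:10:43.117", "12:11:03.951") gives about 20.834 rather than the truncated 20, so speeds computed from it come out slightly lower than the figures in the report; kbps(52435165, 20) reproduces the reported 2560.31.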