2017-12-18 254 views
-4

Counting certain values in a file in Python

I have a text file like this (this is a sample; the real file is very large):

[52639 - 2017-12-08 11:56:58,680] INFO __main__.master 251 Finished pre-smap protein tag ('4h02', [], 35000, 665, '67') 
[52639 - 2017-12-08 11:57:37,686] INFO __main__.master 251 Finished pre-smap protein tag ('4nqk', [], 35000, 223, '18') 
[52639 - 2017-12-08 11:58:46,984] INFO __main__.master 251 Finished pre-smap protein tag ('3j60', [], 3500, 1052, '65') 
[52639 - 2017-12-08 12:01:10,073] INFO __main__.master 251 Finished pre-smap protein tag ('4ddg', [], 35000, 541, '38') 
[52639 - 2017-12-08 12:03:37,570] INFO __main__.master 251 Finished pre-smap protein tag ('4ksl', [], 35000, 1303, '68') 

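I want to sum the values that come just before the last comma. The result would be 665 + 223 + 1052 + 541 + 1303 = 3784.

I can't figure out how to do this. Any help would be appreciated.

Answers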

0

Here, you can try this:

summation = 0

with open("test.txt", "r") as infile:
    for line in infile:
        # Split on ", "; the piece at index 3 is the value before the last comma.
        fields = line.split(", ")
        summation += int(fields[3])

print(summation)

Output:

3784 

The contents of test.txt are structured like this:

[52639 - 2017-12-08 11:56:58,680] INFO __main__.master 251 Finished pre-smap protein tag ('4h02', [], 35000, 665, '67')
[52639 - 2017-12-08 11:57:37,686] INFO __main__.master 251 Finished pre-smap protein tag ('4nqk', [], 35000, 223, '18')
[52639 - 2017-12-08 11:58:46,984] INFO __main__.master 251 Finished pre-smap protein tag ('3j60', [], 3500, 1052, '65')
[52639 - 2017-12-08 12:01:10,073] INFO __main__.master 251 Finished pre-smap protein tag ('4ddg', [], 35000, 541, '38')
[52639 - 2017-12-08 12:03:37,570] INFO __main__.master 251 Finished pre-smap protein tag ('4ksl', [], 35000, 1303, '68')
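
To see why index 3 picks out the right value, it may help to look at what split(", ") produces for a single line. This is only a minimal sketch using one line from the sample; the variable names are illustrative:

```python
# One sample line, with the trailing newline a file iterator would yield.
line = (
    "[52639 - 2017-12-08 11:56:58,680] INFO __main__.master 251 "
    "Finished pre-smap protein tag ('4h02', [], 35000, 665, '67')\n"
)

# The separator is comma + space, so the timestamp's "58,680" is not split:
# that comma is not followed by a space.
parts = line.split(", ")

# Show each piece with its index.
for index, part in enumerate(parts):
    print(index, repr(part))

# The piece at index 3 is the value before the last comma.
value = int(parts[3])
print(value)
```

Note that this relies on every line having exactly the same layout; a line with a non-empty list in the second field would shift the indices.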

If you want to print all the numbers that make up the sum, you can use a list to store each number:

summation = 0
coefficients = []

with open("test.txt", "r") as infile:
    for line in infile:
        fields = line.split(", ")
        # Keep the raw string for printing and add its integer value to the total.
        coefficients.append(fields[3])
        summation += int(fields[3])

print("+".join(coefficients), end="=")
print(summation)

Output:

665+223+1052+541+1303=3784 
+0

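Thanks vasilis. They voted the question down to -4, probably wanting me to add 2 lines of code like "open" or something similar. This is why most of us hate stackoverflow. They pretend they aren't. Anyway, thank you. – Antonis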

+0

Hi vasilis. Suppose my file is more complicated than this, with lines like: [52639 - 2017-12-08 11:43:44,850] INFO __main__.master 251 Finished pre-smap protein tag ('4py6', ['R78', 'EDO'], 35000, 33.207404136657715, '16') or [52639 - 2017-12-08 11:43:48,014] INFO __main__.master 251 Finished pre-smap protein tag ('1nw4', ['IMH', 'IPA', 'SO4'], 3500, 153.33520197868347, '64'). Do you have a solution? – Antonis

+0

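@Antonis, the community tends to appreciate questions that show research effort and that challenge its members to find the best solution. Also, Stack Overflow is a broad community, ranging from beginners to highly skilled members who want to go a step further. So questions that show a lack of effort, or that are possible duplicates, are not attractive. But, anyway, thank you for accepting my answer. –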
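
For the more complicated lines Antonis describes in the comments, where the list field holds several items and the value is a decimal, counting fields from the left no longer works. One possible approach, sketched here and not taken from the thread, is to split from the right instead. The function name value_before_last_comma is hypothetical, and the sketch assumes the fields are separated by ", " and that the last field contains no ", " of its own:

```python
def value_before_last_comma(line):
    """Return the number just before the last ", " of a log line, as a float."""
    # Splitting at most twice from the right gives
    # [everything before, the value, the final field such as "'16')"].
    _, value, _ = line.rstrip().rsplit(", ", 2)
    return float(value)


# The two lines quoted in the comment (assumed to use ", " as the separator).
lines = [
    "[52639 - 2017-12-08 11:43:44,850] INFO __main__.master 251 Finished "
    "pre-smap protein tag ('4py6', ['R78', 'EDO'], 35000, 33.207404136657715, '16')",
    "[52639 - 2017-12-08 11:43:48,014] INFO __main__.master 251 Finished "
    "pre-smap protein tag ('1nw4', ['IMH', 'IPA', 'SO4'], 3500, 153.33520197868347, '64')",
]

total = sum(value_before_last_comma(line) for line in lines)
print(total)
```

Because the split starts at the end of the line, the number of items inside the list field does not matter. Using float also covers the integer values of the original sample.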

0
import re

s = """
[52639 - 2017-12-08 11:56:58,680] INFO __main__.master 251 Finished pre-smap protein tag ('4h02', [], 35000, 665, '67')
[52639 - 2017-12-08 11:57:37,686] INFO __main__.master 251 Finished pre-smap protein tag ('4nqk', [], 35000, 223, '18')
[52639 - 2017-12-08 11:58:46,984] INFO __main__.master 251 Finished pre-smap protein tag ('3j60', [], 3500, 1052, '65')
[52639 - 2017-12-08 12:01:10,073] INFO __main__.master 251 Finished pre-smap protein tag ('4ddg', [], 35000, 541, '38')
[52639 - 2017-12-08 12:03:37,570] INFO __main__.master 251 Finished pre-smap protein tag ('4ksl', [], 35000, 1303, '68')
"""

# Capture the number that sits just before the final quoted number and ")".
# A raw string avoids invalid-escape warnings; "+" requires at least one digit.
pattern = r", ([0-9]+), '[0-9]+'\)"

print(sum(int(i) for i in re.findall(pattern, s)))

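Have you tried using the regular expression library? By building a pattern that matches "the number before the number that is closed by a parenthesis", you can capture all of those numbers, then build a generator that converts them to integers, and sum them.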
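
Since the question mentions that the real file is very large, the same idea can be applied one line at a time instead of to a single big string. The following is only a sketch under assumptions: sum_values and PATTERN are hypothetical names, io.StringIO stands in for an opened file, and the pattern is widened to accept an optional decimal part:

```python
import io
import re

# Matches ", <number>, '<digits>')" at the end of a line.
# The captured number may have a decimal part.
PATTERN = re.compile(r", (\d+(?:\.\d+)?), '\d+'\)\s*$")


def sum_values(lines):
    """Sum the captured value of every line that matches PATTERN."""
    total = 0.0
    for line in lines:
        match = PATTERN.search(line)
        if match:  # lines that do not fit the pattern are skipped
            total += float(match.group(1))
    return total


sample_text = (
    "[52639 - 2017-12-08 11:56:58,680] INFO __main__.master 251 Finished pre-smap protein tag ('4h02', [], 35000, 665, '67')\n"
    "[52639 - 2017-12-08 11:57:37,686] INFO __main__.master 251 Finished pre-smap protein tag ('4nqk', [], 35000, 223, '18')\n"
    "[52639 - 2017-12-08 11:58:46,984] INFO __main__.master 251 Finished pre-smap protein tag ('3j60', [], 3500, 1052, '65')\n"
    "[52639 - 2017-12-08 12:01:10,073] INFO __main__.master 251 Finished pre-smap protein tag ('4ddg', [], 35000, 541, '38')\n"
    "[52639 - 2017-12-08 12:03:37,570] INFO __main__.master 251 Finished pre-smap protein tag ('4ksl', [], 35000, 1303, '68')\n"
)

# io.StringIO yields lines like a real file object would.
print(sum_values(io.StringIO(sample_text)))
```

With a real file, the call would be wrapped in a with open(...) block and the file object passed to sum_values, so only one line is held in memory at a time.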