2017-08-28 82 views
0

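Modify a Python script to run on multiple input files

I am very new to Python. I have a Python script that processes a specific file (input1.txt) and produces an output (output1.fasta), but I want to run the script on multiple files, for example input2.txt, input3.txt, ..., and produce the corresponding outputs: output2.fasta, output3.fasta.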

from Bio import SeqIO

fasta_file = "sequences.txt"
wanted_file = "input1.txt"
result_file = "output1.fasta"

wanted = set()
with open(wanted_file) as f:
    for line in f:
        line = line.strip()
        if line != "":
            wanted.add(line)
fasta_sequences = SeqIO.parse(open(fasta_file), 'fasta')
with open(result_file, "w") as f:
    for seq in fasta_sequences:
        if seq.id in wanted:
            SeqIO.write([seq], f, "fasta")

I tried adding the glob function, but I don't know how to handle the output file names.

from Bio import SeqIO
import glob

fasta_file = "sequences.txt"

for filename in glob.glob('*.txt'):

    wanted = set()
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if line != "":
                wanted.add(line)

    fasta_sequences = SeqIO.parse(open(fasta_file), 'fasta')
    with open(result_file, "w") as f:
        for seq in fasta_sequences:
            if seq.id in wanted:
                SeqIO.write([seq], f, "fasta")

The error message is: NameError: name 'result_file' is not defined

+1

What exactly is "not working"? Can you show your code after the attempt with glob? – Verv

+1

What doesn't work with glob? Be specific so that we can help. – kabanus

+0

Sorry, I have updated the question with the error message. – Paul

Answers

2

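glob is pulling in your "sequences" file as well as the inputs, because *.txt also matches sequences.txt. If the FASTA file is always the same and you only need to iterate over the input files, then you need: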

for filename in glob.glob('input*.txt'): 

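Also, since you are iterating over your whole process, you may want to put it in a function. If the output file name always corresponds to the input file name, you can create it dynamically.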

from Bio import SeqIO
import glob

def create_fasta_outputs(fasta_file, wanted_file):
    # Derive the output name from the input name: input1.txt -> output1.fasta
    result_file = wanted_file.replace("input", "output").replace(".txt", ".fasta")

    # Collect the wanted sequence IDs, skipping blank lines
    wanted = set()
    with open(wanted_file) as f:
        for line in f:
            line = line.strip()
            if line != "":
                wanted.add(line)

    # Write every record whose ID is in the wanted set
    fasta_sequences = SeqIO.parse(open(fasta_file), 'fasta')
    with open(result_file, "w") as f:
        for seq in fasta_sequences:
            if seq.id in wanted:
                SeqIO.write([seq], f, "fasta")

fasta_file = "sequences.txt"
for wanted_file in glob.glob('input*.txt'):
    create_fasta_outputs(fasta_file, wanted_file)
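
Note that the chained replace() calls rewrite every occurrence of "input" and ".txt" anywhere in the path, including directory names. If that is a concern, the name could be derived with pathlib instead. This is only a minimal sketch, assuming the files follow the inputN.txt naming from the question; the helper name output_name_for is hypothetical and not part of the code above.

```python
from pathlib import Path


def output_name_for(wanted_file):
    """Map 'input2.txt' to 'output2.fasta' (hypothetical helper)."""
    path = Path(wanted_file)
    stem = path.stem  # file name without its extension, e.g. 'input2'
    if not stem.startswith("input"):
        raise ValueError(f"unexpected input file name: {wanted_file}")
    # Replace only the leading 'input' prefix and swap the extension.
    new_name = "output" + stem[len("input"):] + ".fasta"
    # with_name keeps the directory part of the path unchanged.
    return str(path.with_name(new_name))


print(output_name_for("input2.txt"))  # prints output2.fasta
```

Inside create_fasta_outputs, the result_file line could then call this helper in place of the two replace() calls.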
+0

Yes, my fasta_file = "sequences.txt" is the same for all input files. Your code runs without any errors, but it does not create any output. – Paul

+0

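Do you have sample data for the 'sequences.txt' and 'input1.txt' files? Since the script runs, some of the logic inside it may be getting skipped, for example 'if line != "":' or 'if seq.id in wanted:', which would lead to missing output. – jack6e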
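
One way to check this is to compare the IDs in the wanted file against the IDs in the FASTA file. The following is only a rough sketch using the standard library; the function names are hypothetical, and it assumes the record ID is the first whitespace-delimited token after '>' on each header line, which is how Biopython's FASTA parser sets seq.id.

```python
def read_wanted_ids(wanted_file):
    """Return the non-empty, stripped lines of the wanted file as a set."""
    with open(wanted_file) as handle:
        return {line.strip() for line in handle if line.strip()}


def read_fasta_ids(fasta_file):
    """Return the set of record IDs found on FASTA header lines."""
    ids = set()
    with open(fasta_file) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:  # skip a bare '>' header with no ID
                    ids.add(fields[0])
    return ids


def report_missing_ids(fasta_file, wanted_file):
    """Return the wanted IDs that never appear in the FASTA file, sorted."""
    return sorted(read_wanted_ids(wanted_file) - read_fasta_ids(fasta_file))
```

If report_missing_ids("sequences.txt", "input1.txt") returns every wanted ID, the entries in the input file probably do not match the FASTA headers exactly, for example because of a leading '>' or extra description text.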

+0

You are right, my mistake. The command is working! Thank you very much. – Paul