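How to hold line data from a file until a condition is met later in the file (Python)

I suspect this is a duplicate question, but I have been searching for a while and can't seem to find the right wording to turn up an answer. Apologies in advance if it is a duplicate!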

I'm trying to print the following information from a file that I'm reading line by line.

gene-1 gene-2 gene0* gene1 gene2

*code

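I have been able to get gene0, gene1, and gene2, where gene0 is the one annotated as a non-coding RNA gene, but I'm having trouble figuring out how to buffer gene-1 and gene-2 until the condition for gene0 is met (data[2] == 'ncRNA').

In other words, I need information from earlier lines, but only once a condition is met on the current line. I sketched an idea in the commented-out section below, but it seems there must be a better way to do this (mine would turn into a nested mess). The file I'm going through is a GFF file.

I don't know how to make a placeholder for the 'previous information' until the condition is met.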

import sys 
import re 
gff3 = sys.argv[1] 
f = open(gff3, 'r') 

ncRNAgene= False 
fgene_count=0 

while True: 
    line = f.readline() 
    if not line.startswith('#'): 
     data = line.strip().split("\t") 
     ### this is not important to the question, just me dealing with the file format 
     try: 
      #my mis-guided attempts to get at this issue 
      #if data[2] == gene: 
      #line0 = f.readline() 
      #data0 = line.strip().split("\t") 
      #if data0[2] == gene 


     ### the relevant information is in the third column of the line 
      if data[2] == 'ncRNA': 
       ncRNAgene = True 

       print "ncRNA gene:", line 

       while fgene_count <= 1 and ncRNAgene: 
        line = f.readline() 
        data2 = line.strip().split("\t") 
        if data2[2] == 'gene': 
         fgene_count = fgene_count + 1 

         print "this is gene %s : %s" %(fgene_count, line) 

      if fgene_count > 1: 
       fgene_count = 0 
       ncRNAgene= False 

      else: 
       continue 

    except IndexError: 
      if line.startswith('>'): 
       break 
    if not line: 
     break 

f.close()

This is what part of the file looks like. The fields I'm interested in are in square brackets:

211000022279165 FlyBase [exon] 14 1118 . - . Parent=FBtr0300167;parent_type=ncRNA

211000022279165 FlyBase [gene] 14 1118 . - . ID=FBgn0259870;Name=Su(Ste):CR42439;fullname=Su(Ste):CR42439;Alias=CR42439;Ontology_term=SO:0000011,SO:0000087;Dbxref=FlyBase_Annotation_IDs:CR42439,EntrezGene:7354392,GenomeRNAi:7354392

211000022279165 FlyBase [ncRNA] 14 1118 . - . ID=FBtr0300167;Name=Su(Ste):CR42439-RA;Parent=FBgn0259870;Alias=CR42439-RA;Dbxref=FlyBase_Annotation_IDs:CR42439-RA,REFSEQ:NR_026633;score_text=Weakly Supported;score=0


2014-12-05 wubbina

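Is this your actual indentation? As posted, it raises a lot of problems that you'll have to fix first. You have code that tries to use 'data' even when the line starts with '#', which means you'll be reusing 'data' from the previous line. You have an 'else' that looks like it is meant to match the 'try' rather than the 'if', which doesn't mean anything. And so on. – abarnert 2014-12-05 19:39:53

Meanwhile, it isn't entirely clear from the description what you want. What are "gene-1" and "gene-2" in your description? – abarnert 2014-12-05 19:40:37

Thanks for the comments. The commented-out section is not my actual indentation; I'll see whether I can fix the rest. Sorry that it isn't clear what gene-1 and gene-2 are. I'm basically trying to find the 'gene' information in the region surrounding the ncRNA (or gene0). – wubbina 2014-12-05 19:55:59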

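It's hard to tell exactly what you mean here, but the general idea for a problem like this is pretty simple: store gene1 and gene2 in local variables, update those variables whenever you find a gene1 or gene2 line, and then use them when you find a gene0 line.

For example: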

gene1, gene2 = None, None
for line in file:
    if matches_gene1(line):
        gene1 = parse_gene1(line)
    elif matches_gene2(line):
        gene2 = parse_gene2(line)
    elif matches_gene0(line):
        gene0 = parse_gene0(line)
        do_stuff_with(gene0, gene1, gene2)
        gene1, gene2 = None, None

Or, if there can be multiple gene1 and gene2 lines before each gene0, just use lists of them:

gene1, gene2 = [], []
for line in file:
    if matches_gene1(line):
        gene1.append(parse_gene1(line))
    elif matches_gene2(line):
        gene2.append(parse_gene2(line))
    elif matches_gene0(line):
        gene0 = parse_gene0(line)
        do_stuff_with(gene0, gene1, gene2)
        gene1, gene2 = [], []
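
As a more concrete sketch for the GFF case (written for Python 3, unlike the Python 2 code in the question), the version below keeps the most recent 'gene' lines in a bounded collections.deque and, once an 'ncRNA' line shows up, pairs them with the 'gene' lines that follow. The function name genes_around_ncrna, the flank parameter, and the choice to count every 'gene' line (including the ncRNA's own parent gene) are assumptions made for illustration; the question does not pin those details down.

```python
from collections import deque


def genes_around_ncrna(lines, flank=2):
    """Yield (before, ncrna, after) for each ncRNA line.

    `before` and `after` are lists holding up to `flank` 'gene' lines
    seen before and after the ncRNA line (hypothetical helper).
    """
    recent = deque(maxlen=flank)  # rolling buffer of the latest gene lines
    pending = []                  # ncRNA records still waiting for later genes
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">"):  # FASTA section begins, no more features
            break
        if not line or line.startswith("#"):
            continue              # skip blank lines and comments/directives
        cols = line.split("\t")
        if len(cols) < 3:
            continue              # not a feature line
        kind = cols[2]            # feature type is the third column
        if kind == "gene":
            # Every waiting ncRNA record gets this gene as a follower.
            for record in pending:
                record[2].append(line)
            # The oldest record always has the most followers, so check it first.
            while pending and len(pending[0][2]) >= flank:
                yield tuple(pending.pop(0))
            recent.append(line)
        elif kind == "ncRNA":
            # Snapshot the buffered genes; the following genes arrive later.
            pending.append([list(recent), line, []])
    # The file ended before some records collected `flank` following genes.
    for record in pending:
        yield tuple(record)
```

Because the function only iterates over its argument, an open file object can be passed directly, for example genes_around_ncrna(open(gff3)), and each yielded tuple printed as needed. The deque with maxlen discards the oldest gene line automatically, so no nested reads of the file are required.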


2014-12-05 19:43:23 abarnert
