2017-06-19 99 views

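Parsing text with Python

I have data in a text file that looks like the example data below. What I want to do is search through the text file and return everything between "SpecialStuff" and the next ";", as I did in the example output. I am very new to Python, so any tips are much appreciated. Would something like .split() work?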

Example Data: 

stuff: 
    1 
    1 
    1 
    23 

]; 

otherstuff: 
    do something 
    23 
    4 
    1 

]; 

SpecialStuff 
    select 
     numbers 
     ,othernumbers 
     words 
; 

MoreOtherStuff 
randomstuff 
@#123 


Example Output:

select 
     numbers 
     ,othernumbers 
     words 

Answers

You could try this:

# Open the original file for reading and a new file to write to.
# The with statement closes both files even if an error occurs.
with open("filename.txt", "r") as file, open("result.txt", "w") as output:
    seenSpecialStuff = 0  # Tracks whether the 'SpecialStuff' line has been seen.
    for line in file:
        if ";" in line:
            seenSpecialStuff = 0  # Set tracker to 0 if it sees a semicolon.
        if seenSpecialStuff == 1:
            output.write(line)  # Write the line while the tracker is active.
        if "SpecialStuff" in line:
            seenSpecialStuff = 1  # Set tracker to 1 when SpecialStuff is seen.

This produces a file named result.txt containing:

select 
    numbers 
    ,othernumbers 
    words 

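This code can be improved! Since this may be a homework assignment, you may want to look further into how to make it more efficient. I hope it serves as a useful starting point!

Cheers!

Edit

If you want the code to look specifically for the line "SpecialStuff" (rather than any line that contains "SpecialStuff"), you can easily change the `if` statements to make them more specific: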

with open("my.txt", "r") as file, open("result.txt", "w") as output:
    seenSpecialStuff = 0
    for line in file:
        # strip() removes the newline and any surrounding spaces, so a line
        # such as "SpecialStuff " with trailing whitespace still matches.
        if line.strip() == ";":
            seenSpecialStuff = 0
        if seenSpecialStuff == 1:
            output.write(line)
        if line.strip() == "SpecialStuff":
            seenSpecialStuff = 1
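
Thanks, this is really close to what I am looking for. The only problem is that some parts of the file contain strings like "abcSpecialStuffpdq", so the code grabs everything after those as well. How can I change the code so that it only grabs what follows the exact string "SpecialStuff"? – user3476463

You could try making the if statement read `if line.replace("\n", "") == "SpecialStuff":`, so that only a line consisting of exactly SpecialStuff sets the tracker to 1! You can do the same for other lines if you only want to match specific occurrences. – cosinepenguin

I edited the answer to reflect this! If you later also need the information contained in "abcSpecialStuffpdq", you will have to add a separate `if` statement so the code can recognize it. – cosinepenguin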
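
To make the difference discussed in these comments concrete, here is a small sketch comparing the substring test with the exact-line test. The three sample lines are invented for illustration and only mimic the shape of the data in the question.

```python
# Hypothetical sample lines, modeled on the data in the question.
lines = ["abcSpecialStuffpdq\n", "SpecialStuff \n", "    select \n"]

# Substring test: also fires on a line that merely contains the marker.
substring_hits = [line for line in lines if "SpecialStuff" in line]

# Exact test: fires only when the stripped line is exactly the marker.
exact_hits = [line for line in lines if line.strip() == "SpecialStuff"]

print(len(substring_hits))  # 2
print(len(exact_hits))      # 1
```

The substring test is what causes "abcSpecialStuffpdq" to start a block; comparing the whole stripped line avoids that.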

with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:  # open the input and output files
    wanted = False  # do we want the current line in the output?
    for line in infile:
        if line.strip() == "SpecialStuff":  # marks the beginning of a wanted block
            wanted = True
            continue
        if line.strip() == ";" and wanted:  # marks the end of a wanted block
            wanted = False
            continue

        if wanted:
            outfile.write(line)
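
The same flag-based idea can be tried without touching the disk. The sketch below wraps it in a function, `extract_block`, which is a hypothetical helper name and not part of the answer above, and feeds it an in-memory sample via `io.StringIO`. Unlike the loop above, it stops at the first terminator instead of scanning for further blocks, which is an assumption about what is wanted.

```python
import io

def extract_block(lines, start="SpecialStuff", end=";"):
    """Return the lines strictly between a `start` line and the next `end` line."""
    wanted = False
    collected = []
    for line in lines:
        stripped = line.strip()
        if stripped == start:
            wanted = True  # the marker line itself is not collected
            continue
        if wanted and stripped == end:
            break  # stop at the first terminator after the marker
        if wanted:
            collected.append(line)
    return collected

# In-memory stand-in for the input file (invented sample, shaped like the question's data).
sample = io.StringIO(
    "otherstuff:\n"
    "    23\n"
    "];\n"
    "abcSpecialStuffpdq\n"
    "SpecialStuff\n"
    "    select\n"
    "     numbers\n"
    "     ,othernumbers\n"
    "     words\n"
    ";\n"
    "MoreOtherStuff\n"
)
block = extract_block(sample)
print("".join(block), end="")
```

Because the function accepts any iterable of lines, an open file object can be passed in the same way as the `StringIO` sample.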

Don't use str.split() for this; str.find() is more than enough:

marker = "SpecialStuff"  # text that introduces the block
parsed = None
with open("example.dat", "r") as f:
    data = f.read()  # load the file into memory for convenience
    start_index = data.find(marker)  # find the beginning of your block
    if start_index != -1:
        end_index = data.find(";", start_index)  # find the end of the block
        if end_index != -1:
            # grab everything in between, skipping the marker itself
            parsed = data[start_index + len(marker):end_index]
if parsed is None:
    print("`SpecialStuff` Block not found")
else:
    print(parsed)

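Keep in mind that this captures everything between the two, including newlines and other whitespace. You can also call parsed.strip() to remove the leading and trailing whitespace if you don't want it.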
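
One caveat: str.find() matches the marker anywhere, so it would also start at a string such as "abcSpecialStuffpdq". As a possible alternative, and a different technique from the one in this answer, a regular expression can require the marker and the terminator to sit on lines of their own. The sample text and the pattern below are illustrative assumptions, not something taken from the original data file.

```python
import re

# Invented sample: the first block starts with a look-alike marker and should be skipped.
text = (
    "abcSpecialStuffpdq\n"
    "    not wanted\n"
    ";\n"
    "SpecialStuff \n"
    "    select \n"
    "     numbers \n"
    ";\n"
    "MoreOtherStuff\n"
)

# ^ anchors to the start of a line (MULTILINE); .*? may span lines (DOTALL)
# and stops at the first line that begins with optional blanks and a ';'.
pattern = re.compile(r"^SpecialStuff[ \t]*\n(.*?)^[ \t]*;", re.MULTILINE | re.DOTALL)

match = pattern.search(text)
parsed = match.group(1) if match else None
print(parsed)

# For comparison, str.find() locates the marker inside the look-alike line.
print(text.find("SpecialStuff"))  # 3
```

As with the str.find() version, the captured text keeps its original indentation and line breaks, so parsed.strip() still applies if they are not wanted.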