Parsing text with Python

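I have data like the example below in a text file. What I want to do is search through the text file and return everything between "SpecialStuff" and the next ";", as shown in the example output. I'm very new to Python, so any tips are much appreciated. Would something like .split() work?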

Example Data: 

stuff: 
    1 
    1 
    1 
    23 

]; 

otherstuff: 
    do something 
    23 
    4 
    1 

]; 

SpecialStuff 
    select 
     numbers 
     ,othernumbers 
     words 
; 

MoreOtherStuff 
randomstuff 
@#123 


Example Output:

select 
     numbers 
     ,othernumbers 
     words


2017-06-19 user3476463

You can try this:

file = open("filename.txt", "r")  # Open the original file
output = open("result.txt", "w")  # Open a new file to write to
seenSpecialStuff = 0  # Tracks whether the 'SpecialStuff' line has been seen
for line in file:
    if ";" in line:
        seenSpecialStuff = 0  # Reset the tracker when a semicolon is seen
    if seenSpecialStuff == 1:
        output.write(line)  # Write the line while the tracker is active
    if "SpecialStuff" in line:
        seenSpecialStuff = 1  # Activate the tracker once SpecialStuff is seen
file.close()
output.close()  # Closing the file flushes the written lines to disk

This produces a file named result.txt containing:

select 
    numbers 
    ,othernumbers 
    words

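This code can be improved! Since this may be a homework assignment, you may want to look further into how to make it more efficient. I hope it's a useful starting point!

Cheers!

Edit

If you want the code to look specifically for the line "SpecialStuff" (rather than any line that contains "SpecialStuff"), you can easily change the if statements to make them more specific: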

file = open("my.txt", "r")
output = open("result.txt", "w")
seenSpecialStuff = 0
for line in file:
    if line.replace("\n", "") == ";":
        seenSpecialStuff = 0
    if seenSpecialStuff == 1:
        output.write(line)
    if line.replace("\n", "") == "SpecialStuff":
        seenSpecialStuff = 1
file.close()
output.close()
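
To make the difference between the two checks concrete, here is a small sketch that compares a substring test with an exact-line test. The helper names `contains_marker` and `is_marker_line` are made up for illustration, and the exact test uses `str.strip()` instead of `replace("\n", "")` so that stray spaces around the marker are tolerated; that is a variation on the code above, not the same code.

```python
def contains_marker(line, marker="SpecialStuff"):
    # Substring test: also true for lines like "abcSpecialStuffpdq"
    return marker in line


def is_marker_line(line, marker="SpecialStuff"):
    # Exact test: true only when the line holds the marker and nothing else
    # (ignoring leading/trailing whitespace and the newline)
    return line.strip() == marker


for line in ["abcSpecialStuffpdq\n", "SpecialStuff\n", "SpecialStuff \n"]:
    print(repr(line), contains_marker(line), is_marker_line(line))
```

Only the exact test rejects "abcSpecialStuffpdq", so it is probably the one you want if such strings appear in the file.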


2017-06-19 18:23:49 cosinepenguin

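Thanks, this is really close to what I'm looking for. The only problem is that some parts of the file contain strings like "abcSpecialStuffpdq", so it grabs everything after those as well. How can I change the code so that it only grabs what follows the exact string "SpecialStuff"? – user3476463

You could try making the if condition something like 'if line.replace("\n", "") == "SpecialStuff":', so that only a line consisting solely of SpecialStuff sets the tracker to 1! The same can be done for the other lines if you only want to match specific occurrences. – cosinepenguin

I've edited the answer to reflect this! If you later also need the information in lines like "abcSpecialStuffpdq", you'll have to add a separate if statement so the code can recognize them. – cosinepenguin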

with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:  # open the input and output files
    wanted = False  # do we want the current line in the output?
    for line in infile:
        if line.strip() == "SpecialStuff":  # marks the beginning of a wanted block
            wanted = True
            continue
        if line.strip() == ";" and wanted:  # marks the end of a wanted block
            wanted = False
            continue

        if wanted:
            outfile.write(line)
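
The same flag-based loop can also be wrapped in a generator that accepts any iterable of lines, which makes it easy to try out without touching real files. This is only a sketch: `extract_block` and its `start`/`end` parameters are hypothetical names, and the sample text is a shortened version of the data in the question.

```python
import io


def extract_block(lines, start="SpecialStuff", end=";"):
    """Yield the lines between a `start` line and the next `end` line."""
    wanted = False
    for line in lines:
        stripped = line.strip()
        if stripped == start:  # beginning of a wanted block
            wanted = True
            continue
        if wanted and stripped == end:  # end of a wanted block
            wanted = False
            continue
        if wanted:
            yield line


# An in-memory stand-in for the input file (shortened sample data)
sample = io.StringIO(
    "stuff:\n    1\n];\n\nSpecialStuff\n    select\n     numbers\n;\n\nMoreOtherStuff\n"
)
result = list(extract_block(sample))
print("".join(result), end="")
```

Because an open file is itself an iterable of lines, `extract_block(infile)` should work unchanged inside the `with` block above.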


2017-06-19 18:25:16 inspectorG4dget

There's no need for str.split() here; str.find() is more than enough:

parsed = None
with open("example.dat", "r") as f:
    data = f.read()  # load the file into memory for convenience
    start_index = data.find("SpecialStuff")  # find the beginning of your block
    if start_index != -1:
        end_index = data.find(";", start_index)  # find the end of the block
        if end_index != -1:
            # grab everything in between, skipping over the marker itself
            parsed = data[start_index + len("SpecialStuff"):end_index]
if parsed is None:
    print("`SpecialStuff` Block not found")
else:
    print(parsed)

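Keep in mind that this captures everything between the two markers, including newlines and other whitespace. You can also call parsed.strip() to remove the leading and trailing whitespace if you don't want it.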
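
As a rough illustration of the whitespace point, the sketch below runs the same `str.find()` approach on an in-memory string and shows the result before and after `strip()`. The function name `extract_between` and the `SAMPLE` text are assumptions made for the example, not part of the original data file.

```python
# Shortened stand-in for the file contents (an assumption for this example)
SAMPLE = """otherstuff:
    23
];

SpecialStuff
    select
     numbers
     ,othernumbers
     words
;

MoreOtherStuff
"""


def extract_between(data, start_marker="SpecialStuff", end_marker=";"):
    """Return the text between the first start_marker and the next end_marker."""
    start = data.find(start_marker)
    if start == -1:
        return None  # start marker not present
    content_start = start + len(start_marker)
    end = data.find(end_marker, content_start)
    if end == -1:
        return None  # no end marker after the start marker
    return data[content_start:end]


raw = extract_between(SAMPLE)
print(repr(raw))    # includes the surrounding newlines and indentation
print(raw.strip())  # leading and trailing whitespace removed
```

Note that `strip()` only trims the two ends, so the indentation of the inner lines is kept. Also, `find()` matches substrings, so a string like "abcSpecialStuffpdq" earlier in the file would be picked up as the start marker.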


2017-06-19 18:33:16 zwer
