Copying a section

I need to extract values from the text file below:

fdsjhgjhg 
fdshkjhk 
Start 
Good Morning 
Hello World 
End 
dashjkhjk 
dsfjkhk

The values I need to extract run from Start to End.

with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Start":
            copy = True
        elif line.strip() == "End":
            copy = False
        elif copy:
            outfile.write(line)

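The code I am using above comes from this question: Extract Values between two strings in a text file using python

That code does not include the strings "Start" and "End", only what is between them. How would you include the surrounding strings?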


2016-03-02 johnnydrama

I would use a multiline regex for this; the code would also look simpler. – MaxU

@en_Knight has it almost right. Here is a fix to meet the OP's requirement that the delimiters ARE included in the output:

with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Start":
            copy = True
        if copy:
            outfile.write(line)
        # move this AFTER the "if copy"
        if line.strip() == "End":
            copy = False

Or simply add the write() call in the cases where it applies:

with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Start":
            outfile.write(line)  # add this
            copy = True
        elif line.strip() == "End":
            outfile.write(line)  # add this
            copy = False
        elif copy:
            outfile.write(line)

Update: to answer the question in the comments ("use only the first occurrence of 'End' after 'Start'"), change the last elif line.strip() == "End" to:

        elif line.strip() == "End" and copy:
            outfile.write(line)  # add this
            copy = False

This works if there is only one "Start" but multiple "End" lines... which sounds odd, but that is what the asker asked for.
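
For anyone who wants to try the updated loop without touching files, here is a minimal sketch of the same logic as a function over an in-memory list of lines. The name `extract_block`, its `start`/`end` parameters, and the sample data are assumptions made for illustration; they are not part of the original answer.

```python
def extract_block(lines, start="Start", end="End"):
    """Collect lines from a start marker through the first end marker after it."""
    out = []
    copy = False
    for line in lines:
        stripped = line.strip()
        if stripped == start:
            out.append(line)   # keep the opening delimiter
            copy = True
        elif stripped == end and copy:
            out.append(line)   # keep the closing delimiter
            copy = False       # later "End" lines are ignored until the next "Start"
        elif copy:
            out.append(line)
    return out


# Hypothetical input with stray "End" lines before and after the block
sample = [
    "junk\n", "End\n", "Start\n", "Good Morning\n",
    "Hello World\n", "End\n", "more junk\n", "End\n",
]
result = extract_block(sample)
print("".join(result), end="")
```

The `and copy` guard is what makes the stray "End" lines harmless: they only count when a block is currently open.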


2016-03-02 21:33:38

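This makes a lot of sense. Is it possible to end the copying selectively, using only the first occurrence of 'End' after 'Start'? My file contains multiple 'End' strings. – johnnydrama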

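"elif" means "do this only if the other cases fail". It is syntactically equivalent to "else if", if you're coming from a different C-like language. Without it, the fall-through should take care of including "Start" and "End":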

with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Start":
            copy = True
        if copy:  # flipped to include end, as Dan H pointed out
            outfile.write(line)
        if line.strip() == "End":
            copy = False
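
To make the elif-versus-if difference concrete, here is a small side-by-side sketch run on a list of lines instead of files. The function names `between_exclusive` and `between_inclusive` and the sample list are hypothetical, chosen only for this comparison.

```python
def between_exclusive(lines):
    # if/elif chain: a marker line is consumed by its own branch,
    # so it never reaches the branch that collects lines
    out, copy = [], False
    for line in lines:
        if line.strip() == "Start":
            copy = True
        elif line.strip() == "End":
            copy = False
        elif copy:
            out.append(line)
    return out


def between_inclusive(lines):
    # independent ifs: every line is tested against all three conditions
    out, copy = [], False
    for line in lines:
        if line.strip() == "Start":
            copy = True
        if copy:
            out.append(line)
        if line.strip() == "End":
            copy = False
    return out


sample = ["fdshkjhk\n", "Start\n", "Good Morning\n", "Hello World\n", "End\n", "dashjkhjk\n"]
exclusive = between_exclusive(sample)
inclusive = between_inclusive(sample)
```

With this input, `exclusive` holds only the two inner lines, while `inclusive` also holds the "Start" and "End" lines.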


2016-03-02 21:30:22

The regex approach:

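import re

with open('input.txt') as f:
    data = f.read()

# re.S lets "." span newlines; ".*?" is lazy, so the match stops at the first "End".
# The pattern needs a newline before "Start" and after "End", so a block that
# begins on the very first line of the file is not matched.
match = re.search(r'\n(Start\n.*?\nEnd)\n', data, re.M | re.S)
if match:
    with open('output.txt', 'w') as f:
        f.write(match.group(1))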


2016-03-02 21:44:33 MaxU

This may be the more robust solution, but for people who are unclear on elif vs. if, maybe include some textual description? –

This is better: '(^Start[\s\S]+^End)' [Demo](https://regex101.com/r/gT0eR6/1) (or '(^Start[\s\S]+?^End)' if there is more than one 'End'...) – dawg
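
A short sketch of how the two patterns from this comment differ when the text has more than one 'End'. It assumes the patterns are meant to be used with `re.M` so that `^` matches at the start of every line; the sample string is made up for illustration.

```python
import re

# Hypothetical input with a second "End" line after the block
data = "junk\nStart\nGood Morning\nHello World\nEnd\nmore junk\nEnd\ntail\n"

# Greedy: [\s\S]+ extends to the LAST line that begins with "End"
greedy = re.search(r'(^Start[\s\S]+^End)', data, re.M)

# Lazy: [\s\S]+? stops at the FIRST line that begins with "End"
lazy = re.search(r'(^Start[\s\S]+?^End)', data, re.M)

greedy_text = greedy.group(1)
lazy_text = lazy.group(1)
print(lazy_text)
```

Note that `^End` also matches a line that merely begins with "End"; appending `$` to the pattern would be one way to require the whole line to be the marker.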
