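Python regex findall

I want to use a regular expression in Python 2.7.2 to extract all occurrences of tagged words from a string. Put simply, I want to extract every piece of text inside the [P][/P] tags. Here is my attempt: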

regex = ur"[\u005B1P\u005D.+?\u005B\u002FP\u005D]+?" 
line = "President [P] Barack Obama [/P] met Microsoft founder [P] Bill Gates [/P], yesterday." 
person = re.findall(regex, line)

Printing person produces ['President [P]', '[/P]', '[P] Bill Gates [/P]']

What is the correct regex to get ['[P] Barack Obama [/P]', '[P] Bill Gates [/P]'] or ['Barack Obama', 'Bill Gates']?

Thanks. :)

Source

2011-10-13 Ignatius

import re 
regex = ur"\[P\] (.+?) \[/P\]+?" 
line = "President [P] Barack Obama [/P] met Microsoft founder [P] Bill Gates [/P], yesterday." 
person = re.findall(regex, line) 
print(person)

yields

['Barack Obama', 'Bill Gates']

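The regex ur"[\u005B1P\u005D.+?\u005B\u002FP\u005D]+?" is exactly the same unicode string as u'[[1P].+?[/P]]+?', only harder to read.

The first bracketed group, [[1P], tells re to match any single character from the list ['[', '1', 'P'], and the second bracketed group, [/P]], behaves similarly. That is not what you want at all. So:

Remove the outer enclosing square brackets. (Also remove the stray 1 in front of the P.)
To protect the literal brackets in [P], escape them with backslashes: \[P\].
To return only the words inside the tags, put grouping parentheses around .+?.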
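
The three fixes above can be seen side by side in the short sketch below. It is written for Python 3, so it uses plain raw strings rather than the ur"" prefix from the question (an assumption on my part, since that prefix only exists in Python 2). The broken pattern is the decoded form of the asker's regex, with the first bracket escaped inside the character class so that it means the same thing without triggering a nested-set warning.

```python
import re

line = ("President [P] Barack Obama [/P] met Microsoft founder "
        "[P] Bill Gates [/P], yesterday.")

# Decoded form of the original attempt: [\[1P] is a character class that
# matches ONE character out of '[', '1', 'P'; [/P] matches ONE of '/', 'P'.
broken = r"[\[1P].+?[/P]]+?"
print(re.findall(broken, line))
# ['President [P]', '[/P]', '[P] Bill Gates [/P]']

# Fixed: literal brackets are escaped, the stray 1 is gone, and the
# parentheses make findall return only the text between the tags.
fixed = r"\[P\] (.+?) \[/P\]"
print(re.findall(fixed, line))
# ['Barack Obama', 'Bill Gates']
```

Note that the fixed pattern expects exactly one space after [P] and before [/P], as in the sample line.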

Source

2011-10-13 10:20:25 unutbu

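Try this:

import re
subject = "President [P] Barack Obama [/P] met Microsoft founder [P] Bill Gates [/P], yesterday."
for match in re.finditer(r"\[P[^\]]*\](.*?)\[/P\]", subject):
    # match start: match.start()
    # match end (exclusive): match.end()
    # matched text: match.group(); text between the tags: match.group(1)
    print(match.group())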

Source

2011-10-13 10:21:12 FailedDev

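I really like this answer. If you only want to process the matches, this does it without having to 1) save a list and 2) process that list, unlike something like: str = '… blah dishwasher' ## Here re.findall() returns a list of all the found email strings emails = re.findall(r'[\w\.-]+@[\w\.-]+', str) ## ['[email protected]', 'bob@abc.com'] for email in emails: # do something with each found email string print email – kkron

Your question is not 100% clear, but I assume you want to find every piece of text inside the [P][/P] tags: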

>>> import re 
>>> line = "President [P] Barack Obama [/P] met Microsoft founder [P] Bill Gates [/P], yesterday." 
>>> re.findall(r'\[P\]\s?(.+?)\s?\[\/P\]', line) 
['Barack Obama', 'Bill Gates']

Source

2011-10-13 10:24:22 Blair

You can replace your pattern with:

regex = ur"\[P\]([\w\s]+)\[\/P\]"

Source

2011-10-13 10:31:59 pram

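Watch your formatting; *use the preview area*. Because you did not format it correctly, the backslashes got mangled (Markdown is bad like that). –

Why are you using '[\w\s]+' instead of the '.*?' that he used? To me, '.*?' is more likely what he wants anyway. '[\w\s]' is horribly restrictive. –

The restriction is intentional. I used [\w\s]+ because the asker apparently wants to extract names, which rarely contain digits. Note also that the asker wants to extract words, not numbers. Just my opinion, though; cmiiw – pram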
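
To make the trade-off in these comments concrete, here is a small sketch comparing the two styles of pattern. The sample text and the name in it are made up for illustration, and the lazy variant with \s* is my own adaptation rather than either commenter's exact pattern. Written for Python 3.

```python
import re

# Hypothetical sample: one name contains an apostrophe.
text = "[P] Conan O'Brien [/P] interviewed [P] Bill Gates [/P]"

# Restrictive class: only word characters and whitespace may appear
# between the tags, so the name with an apostrophe is skipped entirely.
restrictive = re.findall(r"\[P\]([\w\s]+)\[/P\]", text)
print(restrictive)   # [' Bill Gates ']

# Lazy "anything" match, with optional whitespace kept outside the group.
lazy = re.findall(r"\[P\]\s*(.+?)\s*\[/P\]", text)
print(lazy)          # ["Conan O'Brien", 'Bill Gates']
```

The restrictive class also captures the spaces next to the tags, so its results would need a strip() afterwards. It is worth knowing too that \w matches digits and underscores, so [\w\s]+ does not actually exclude numbers.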

Use this pattern:

pattern = r'\[P\].+?\[\/P\]'

Check it here

Source
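
For what it is worth, a pattern like this has no capturing group, so re.findall returns each whole match with the tags included, which is the first of the two outputs the question asked for. A quick sketch in Python 3, using the sample line from the question:

```python
import re

line = ("President [P] Barack Obama [/P] met Microsoft founder "
        "[P] Bill Gates [/P], yesterday.")

# No parentheses in the pattern: findall returns the full matched text.
with_tags = re.findall(r"\[P\].+?\[/P\]", line)
print(with_tags)
# ['[P] Barack Obama [/P]', '[P] Bill Gates [/P]']
```

Adding parentheses around .+? would switch the result to just the text between the tags.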

2016-07-18 06:16:44 Sohn
