Strange behavior of Python re.findall

>>> import re
>>> text =\
... """xyxyxy testmatch0 
... xyxyxy testmatch1 
... xyxyxy 
... whyisthismatched1 
... xyxyxy testmatch2 
... xyxyxy testmatch3 
... xyxyxy 
... whyisthismatched2 
... """ 
>>> re.findall("^\s*xyxyxy\s+([a-z0-9]+).*$", text, re.MULTILINE) 
[u'testmatch0', u'testmatch1', u'whyisthismatched1', u'testmatch2', u'testmatch3', u'whyisthismatched2']

So my expectation is that the lines containing "whyisthismatched" should not be matched.

The Python re documentation states the following:

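(Dot.) In the default mode, this matches any character except a newline. If the DOTALL flag has been specified, this matches any character including a newline.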
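To make the quoted passage concrete, here is a minimal sketch of how the dot behaves with and without DOTALL. The `sample` string and the variable names are made up for illustration and are not part of the original question.

```python
import re

# Hypothetical two-line sample, used only to illustrate the quoted documentation.
sample = "ab\ncd"

# Default mode: "." does not match "\n", so ".*" stops at the end of the first line.
default_dot = re.findall(r"a.*", sample)

# With re.DOTALL, "." also matches "\n", so ".*" runs on to the end of the string.
dotall_dot = re.findall(r"a.*", sample, re.DOTALL)

print(default_dot)
print(dotall_dot)
```

Since the pattern in question is used without DOTALL, the `.*` part on its own should not be able to cross a line break.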

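My question is whether this is really the expected behavior or a bug. If it is expected, could someone please explain why these lines are matched, and how I should modify my pattern to get the result I expect: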

[u'testmatch0', u'testmatch1', u'testmatch2', u'testmatch3']

2013-04-09 ZergRush

Newlines can be matched by \s with re.MULTILINE ... at least I think so. – 2013-04-09 16:37:35

Newlines are whitespace too, as far as the \s character class is concerned. If you want to match spaces only, match [ ] instead:

>>> re.findall("^\s*xyxyxy[ ]+([a-z0-9]+).*$", text, re.MULTILINE) 
[u'testmatch0', u'testmatch1', u'testmatch2', u'testmatch3']
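As a rough sketch of why this works, the snippet below checks that `\s` accepts a newline and compares the original pattern with two alternatives. The `[ \t]` and `[^\S\n]` variants are suggestions beyond the answer above, for input that may also contain tabs; the variable names are illustrative, and the sample text omits trailing spaces.

```python
import re

text = """xyxyxy testmatch0
xyxyxy testmatch1
xyxyxy
whyisthismatched1
xyxyxy testmatch2
xyxyxy testmatch3
xyxyxy
whyisthismatched2
"""

# \s is a whitespace class that includes "\n", regardless of re.MULTILINE.
newline_is_whitespace = re.match(r"\s", "\n") is not None

# Original pattern: \s+ can consume the line break after a bare "xyxyxy",
# so the capture group picks up the word on the following line.
original = re.findall(r"^\s*xyxyxy\s+([a-z0-9]+).*$", text, re.MULTILINE)

# Variant 1 (assumption: only spaces and tabs should separate the fields).
spaces_tabs = re.findall(r"^[ \t]*xyxyxy[ \t]+([a-z0-9]+).*$", text, re.MULTILINE)

# Variant 2: any whitespace character except "\n".
no_newline_ws = re.findall(r"^[^\S\n]*xyxyxy[^\S\n]+([a-z0-9]+).*$", text, re.MULTILINE)

print(newline_is_whitespace)
print(original)
print(spaces_tabs)
print(no_newline_ws)
```

Both variants keep the separator on a single line, so a bare `xyxyxy` line can no longer borrow the word from the line below it.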

2013-04-09 16:37:12

Bah, you're fast :P As always (thanks regarding my answer :)) – 2013-04-09 16:39:08

I just realized that. Thanks for your quick help. – ZergRush 2013-04-09 16:39:57
