2014-09-25 45 views
2

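Problem statement: re.sub() produces incorrect output

I want to insert an extra space after a period if the period is directly followed by a letter.

Here is the code: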

import re

string = "This is very funny and cool.Indeed!"

re.sub(r"\.[a-zA-Z]", ". ", string)

And the output:

"This is very funny and cool. ndeed!" 

The first character after the '.' gets replaced as well.

Any solutions?

+0

Try using a capture group – jaap3 2014-09-25 13:41:08

Answers

3

You can use a positive lookahead assertion, which does not consume the part it matches:

>>> re.sub(r"\.(?=[a-zA-Z])", ". ", string) 
'This is very funny and cool. Indeed!' 
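To see why the lookahead helps, it can be useful to compare what each pattern actually consumes. This is a minimal sketch on the question's sample string; the variable names `consuming` and `lookahead` are illustrative only.

```python
import re

string = "This is very funny and cool.Indeed!"

# The original pattern consumes the period AND the following letter,
# so re.sub() replaces both of them with ". ".
consuming = re.search(r"\.[a-zA-Z]", string)
print(repr(consuming.group()))  # '.I'

# The lookahead only checks that a letter follows; the match itself
# is the period alone, so the letter survives the substitution.
lookahead = re.search(r"\.(?=[a-zA-Z])", string)
print(repr(lookahead.group()))  # '.'
```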

Alternatively, use a capturing group and a backreference:

>>> re.sub(r"\.([a-zA-Z])", r". \1", string) # NOTE - r"raw string literal" 
'This is very funny and cool. Indeed!' 

FYI, you can use \S instead of [a-zA-Z] to match any non-whitespace character.
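Note that \S is broader than [a-zA-Z], so the two variants can differ on other inputs. A small sketch of the difference, assuming a made-up sample string (`text` is not from the question) that contains a decimal number:

```python
import re

# Hypothetical sample, chosen to contain a period followed by a digit.
text = "Version 3.5 is out.Indeed!"

# Only periods followed by a letter get a space.
letters_only = re.sub(r"\.(?=[a-zA-Z])", ". ", text)

# Any period followed by a non-whitespace character gets a space,
# including the one inside "3.5".
any_non_space = re.sub(r"\.(?=\S)", ". ", text)

print(letters_only)   # Version 3.5 is out. Indeed!
print(any_non_space)  # Version 3. 5 is out. Indeed!
```

Which one is preferable depends on whether the text may contain numbers, abbreviations, or URLs.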

+1

Learned something new. +1 – 2014-09-25 13:49:19

0

You can also use both a lookahead and a lookbehind in your regex:

>>> import re 
>>> string="This is very funny and cool.Indeed!" 
>>> re.sub(r'(?<=\.)(?=[A-Za-z])', r' ', string) 
'This is very funny and cool. Indeed!' 

OR

You can also add \b:

>>> re.sub(r'(?<=\.)\b(?=[A-Za-z])', r' ', string) 
'This is very funny and cool. Indeed!' 

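Explanation:

  • (?<=\.) asserts that the current position comes right after a literal dot.
  • (?=[A-Za-z]) asserts that the matched boundary is followed by a letter.
  • If both hold, the (empty) boundary is replaced with a space.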
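The explanation above can be checked by inspecting the match object. The sketch below, again on the question's sample string, shows that the match is zero-width, and that on this input the \b variant gives the same result (a position between a dot and a letter is always a word boundary); the names `match`, `plain`, and `with_b` are illustrative.

```python
import re

string = "This is very funny and cool.Indeed!"

# Both assertions are zero-width, so the match is an empty string
# located between the "." (index 27) and the "I" (index 28).
match = re.search(r"(?<=\.)(?=[A-Za-z])", string)
print(match.span())         # (28, 28)
print(repr(match.group()))  # ''

# Substituting at that empty position inserts a space without
# removing any character.
plain = re.sub(r"(?<=\.)(?=[A-Za-z])", " ", string)
with_b = re.sub(r"(?<=\.)\b(?=[A-Za-z])", " ", string)
print(plain)            # This is very funny and cool. Indeed!
print(plain == with_b)  # True
```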