How does this word replacement function work?

import re

def multiwordReplace(text, wordDic):
    # Build one pattern that matches any key of wordDic
    rc = re.compile('|'.join(map(re.escape, wordDic)))
    def translate(match):
        # Look up the replacement for the matched key
        return wordDic[match.group(0)]
    return rc.sub(translate, text)

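This code was copied from another source, but I am not sure how it replaces the words in a passage of text, and I don't understand why the `re` module is used here.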

2016-05-13 pineapple854

You should read the [Regular Expression HOWTO](https://docs.python.org/2/howto/regex.html).

How should we handle questions like this? It is not a duplicate of [Reference - What does this regex mean?](http://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean), but it is similar.

Piece by piece...

# Our dictionary 
wordDic = {'hello': 'foo', 'hi': 'bar', 'hey': 'baz'} 

# Escape every key in the dictionary with re.escape.
# Escaping is required so that any special characters in the
# dictionary words won't be interpreted as regex syntax
map(re.escape, wordDic) 

# join all escaped key elements with pipe | to make a string 'hello|hi|hey' 
'|'.join(map(re.escape, wordDic)) 

# Make a regular expressions instance with given string. 
# the pipe in the string will be interpreted as "OR", 
# so our regex will now try to find "hello" or "hi" or "hey" 
rc = re.compile('|'.join(map(re.escape, wordDic)))

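So rc now matches any of the words in the dictionary, and rc.sub replaces those words in the given string. Whenever the regex finds a match, the translate function simply returns the value stored under that key.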
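To make the intermediate values concrete, here is a minimal sketch. The key 'a.b' is made up for illustration, to show what escaping buys you; the printed key order assumes Python 3.7+, where dicts keep insertion order.

```python
import re

# 'a.b' is a hypothetical key containing a regex special character
wordDic = {'hello': 'foo', 'hi': 'bar', 'a.b': 'baz'}

# Escaped keys joined with the alternation operator
pattern = '|'.join(map(re.escape, wordDic))
print(pattern)  # hello|hi|a\.b

rc = re.compile(pattern)

# Because the dot is escaped, 'axb' is not treated as a match for 'a.b'
matches = rc.findall('hello, hi, a.b, axb')
print(matches)  # ['hello', 'hi', 'a.b']
```

Without re.escape, the dot in 'a.b' would match any character, so 'axb' would match as well and the lookup in wordDic would then fail with a KeyError.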

2016-05-13 10:41:16

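re.compile() - compiles the expression string into a regular expression object. The string consists of the keys of wordDic joined with the separator |. Given wordDic = {'hello':'hi', 'goodbye': 'bye'}, the string will be 'hello|goodbye', which can be read as "hello or goodbye".
def translate(match): - defines a callback function that processes each match.
rc.sub(translate, text) - performs the string replacement. Whenever the regex matches part of the text, the callback looks up the matched text (which is a key of wordDic) in wordDic and returns its translation.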
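The callback mechanism described above can be observed by recording what rc.sub hands to translate. This is only a sketch; the `seen` list is an illustrative addition and not part of the original function.

```python
import re

wordDic = {'hello': 'hi', 'goodbye': 'bye'}
rc = re.compile('|'.join(map(re.escape, wordDic)))

seen = []  # illustrative: collects the text of every match passed to the callback

def translate(match):
    # rc.sub calls this once per non-overlapping match, left to right
    seen.append(match.group(0))
    return wordDic[match.group(0)]

result = rc.sub(translate, 'hello my friend, I just wanted to say goodbye')
print(seen)    # ['hello', 'goodbye']
print(result)  # hi my friend, I just wanted to say bye
```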

Example:

wordDic = {'hello':'hi', 'goodbye': 'bye'} 
text = 'hello my friend, I just wanted to say goodbye' 
translated = multiwordReplace(text, wordDic) 
print(translated)

The output is:

hi my friend, I just wanted to say bye

EDIT

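The main advantage of using re.compile(), though, is the performance gain when the same regex is used many times. Because multiwordReplace compiles the regex on every call, it gets no such gain. If a wordDic is used repeatedly, you can generate a multiwordReplace function for that wordDic so that compilation happens only once: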

import re

def generateMwR(wordDic):
    # Compiled once, then reused by every call to f
    rc = re.compile('|'.join(map(re.escape, wordDic)))
    def f(text):
        def translate(match):
            print(match.group(0))  # debug output: shows each matched word
            return wordDic[match.group(0)]
        return rc.sub(translate, text)
    return f

Usage looks like this:

wordDic = {'hello': 'hi', 'goodbye': 'bye'} 
text = 'hello my friend, I just wanted to say goodbye' 
f = generateMwR(wordDic) 
translated = f(text)
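Since the compiled pattern is kept in the closure, the same f can be applied to any number of strings without recompiling. A self-contained sketch of that reuse follows; generateMwR is repeated here without the debug print so the block runs on its own, and the sample texts are made up.

```python
import re

def generateMwR(wordDic):
    # The pattern is compiled once, when the replacer is generated
    rc = re.compile('|'.join(map(re.escape, wordDic)))
    def f(text):
        # Each call reuses the already compiled pattern
        return rc.sub(lambda m: wordDic[m.group(0)], text)
    return f

f = generateMwR({'hello': 'hi', 'goodbye': 'bye'})

# Hypothetical inputs, including one with nothing to replace
results = [f(t) for t in ['hello there', 'goodbye now', 'nothing to replace']]
print(results)  # ['hi there', 'bye now', 'nothing to replace']
```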

2016-05-13 10:37:06 dron22
