Python: translate/replace the words in a string that you don't want

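Basically, I have a bunch of phrases, and I'm only interested in the ones that contain certain words. What I want to do is 1) find out whether that word is present and, if it is, 2) remove all the other words. I could do this with a bunch of ifs and fors, but I was wondering whether there is a short/pythonic way to do it.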

2010-11-01 dms

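A suggested algorithm:

For each phrase:
1. Check whether the interesting word is present.
2. If it is, remove all the other words.
3. Otherwise, just continue to the next phrase.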

Yes, implementing this takes "a bunch of ifs and fors", but you'd be surprised how easily and cleanly logic like this translates into Python.
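The steps above can be sketched as a plain loop. This is only a minimal sketch: it assumes phrases are whitespace-separated and that there is a single interesting word, and the names `keep_only` and `interesting` are hypothetical, not from the question.

```python
def keep_only(phrases, interesting):
    """Reduce every phrase that contains `interesting` to just that word."""
    result = []
    for phrase in phrases:
        words = phrase.split()
        if interesting in words:
            # Step 2: drop every other word, keeping each occurrence.
            kept = [w for w in words if w == interesting]
            result.append(" ".join(kept))
        else:
            # Step 3: leave the phrase untouched and move on.
            result.append(phrase)
    return result


phrases = ["a lot of boring words", "nothing to see here"]
cleaned = keep_only(phrases, "boring")
```

Matching here is exact and case-sensitive; punctuation attached to a word (for example "boring,") would need extra handling.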

A more concise way to achieve this is to use a list comprehension, which flattens the logic somewhat. Given that phrases is a list of phrases:

phrases = [process(p) if isinteresting(p) else p for p in phrases]

for suitable definitions of the process and isinteresting functions.
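One possible pair of definitions, as a sketch rather than the answerer's own code: it assumes the interesting words live in a set called `INTERESTING` (a hypothetical name) and that words are separated by whitespace.

```python
# Assumed word list; the question does not say which words are interesting.
INTERESTING = {"interesting", "words"}


def isinteresting(phrase):
    """True if the phrase contains at least one interesting word."""
    return any(word in INTERESTING for word in phrase.split())


def process(phrase):
    """Keep only the interesting words, preserving their order."""
    return " ".join(w for w in phrase.split() if w in INTERESTING)


phrases = ["A lot of interesting and boring words", "Nothing here"]
phrases = [process(p) if isinteresting(p) else p for p in phrases]
```

With these definitions, the first phrase is reduced to its interesting words and the second is left unchanged.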

2010-11-01 04:55:10

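I was hoping to use translate together with a regex or some other method to make it more concise. Your solution is cleaner than mine, though, so thanks – dms 2010-11-01 04:59:39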

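@dms: 'translate' isn't designed for this purpose, and while we could wrestle a regex into making it theoretically possible, I don't think it would be any better. – 2010-11-01 05:03:15

If 'words' is a list of interesting words, 'isinteresting()' becomes an 'any(...)' test over 'words', which is closer to pythonic than what I proposed. – 2010-11-01 09:17:00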

A regex-based solution:

>>> import re 
>>> phrase = "A lot of interesting and boring words" 
>>> regex = re.compile(r"\b(?!(?:interesting|words)\b)\w+\W*") 
>>> clean = regex.sub("", phrase) 
>>> clean 
'interesting words'

The regex works as follows:

\b    # start the match at a word boundary 
(?!   # assert that it's not possible to match 
(?:   # one of the following: 
    interesting # "interesting" 
    |   # or 
    words  # "words" 
)    # add more words if desired... 
\b   # assert that there is a word boundary after our needle matches 
)    # end of lookahead 
\w+\W*   # match the word plus any non-word characters that follow.
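To avoid hand-editing the alternation when the word list changes, the same pattern can be built from a list. This is a sketch under assumptions: `build_cleaner` is a hypothetical helper name, and the list of words to keep is assumed to be non-empty.

```python
import re


def build_cleaner(keep):
    """Compile a regex that matches every word NOT in `keep`.

    `keep` is assumed to be a non-empty list of plain words.
    """
    # re.escape guards against regex metacharacters in the words.
    alternatives = "|".join(re.escape(word) for word in keep)
    # Same structure as the pattern above: a word boundary, a negative
    # lookahead for the kept words, then the word and trailing non-word chars.
    return re.compile(rf"\b(?!(?:{alternatives})\b)\w+\W*")


cleaner = build_cleaner(["interesting", "words"])
clean = cleaner.sub("", "A lot of interesting and boring words")
```

Because the pattern also consumes the non-word characters after each removed word, spacing around the kept words depends on what followed them in the input.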

2010-11-01 08:03:28
