Python regular expression question

What is the best way to search for matching words inside a string?

Right now I am doing it like this:

if re.search('([h][e][l][l][o])',file_name_tmp, re.IGNORECASE):

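This works, but it is slow, because I have roughly 100 separate regex statements searching for whole words. I would like to combine several of them with a | separator or something similar.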

Source

2010-10-18 Joe

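Putting a single character inside '[]' is pointless. Get a decent introduction to regular expressions; you seem to be missing at least one of its most basic parts... – delnan 2010-10-18 19:14:13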

FYI, you have greatly overcomplicated the regular expression. 'if re.search('hello', file_name_tmp, re.IGNORECASE)' does exactly the same thing. – 2010-10-18 19:59:59

>>> import re
>>> words = ('hello', r'good\-bye', 'red', 'blue')
>>> pattern = re.compile('(' + '|'.join(words) + ')', re.IGNORECASE)
>>> sentence = 'SAY HeLLo TO reD, good-bye to Blue.'
>>> print(pattern.findall(sentence))
['HeLLo', 'reD', 'good-bye', 'Blue']
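If the word list contains regex metacharacters, escaping each word by hand gets error-prone. Here is a minimal sketch, assuming Python 3, that lets `re.escape` do the escaping and compiles the alternation once. The helper name `build_pattern` and the extra word `c++` are made up for illustration.

```python
import re

def build_pattern(words):
    """Compile one case-insensitive alternation from a list of literal words.

    build_pattern is a hypothetical helper name, not part of the re module.
    """
    # Longest words first, so a longer word is tried before any shorter
    # word that happens to be its prefix.
    ordered = sorted(words, key=len, reverse=True)
    # re.escape makes characters such as "+" or "." match literally.
    alternation = "|".join(re.escape(word) for word in ordered)
    return re.compile("(?:" + alternation + ")", re.IGNORECASE)

pattern = build_pattern(["hello", "good-bye", "red", "blue", "c++"])
matches = pattern.findall("SAY HeLLo TO reD, good-bye to Blue and C++.")
print(matches)
```

Compiling once and reusing `pattern` for every file name is probably where most of the speed-up over 100 separate `re.search` calls would come from, though that is worth measuring on real data.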

Source

2010-10-18 19:52:56

+1, good answer. However, I think it is also important to point out the boundary conditions/options that are available. – 2010-10-18 21:32:50

You could try:

if 'hello' in longtext:

or

if 'HELLO' in longtext.upper():

to match hello/Hello/HELLO.
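The same idea extends to a whole list of words without any regex. This is only a sketch, assuming Python 3; `contains_any` is a hypothetical helper name.

```python
def contains_any(text, words):
    """Return the words that occur in text, ignoring case.

    contains_any is a hypothetical helper; it does plain substring checks,
    so "meet" would also be found inside "meeting".
    """
    lowered = text.lower()  # normalise the text once, not once per word
    return [word for word in words if word.lower() in lowered]

found = contains_any("Say hElLo to the red team", ["hello", "blue", "RED"])
print(found)
```

Because this is a substring test, it cannot tell whole words from fragments; that needs a regex or a tokenizer.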

Source

2010-10-18 18:40:36 eumiro

or hELLo or HElLO or .... ;) – KevinDTimm 2010-10-18 18:54:55

... hElLo or hellO or ... – slezica 2010-10-18 19:02:52

If you want to check whether "hello", or any complete word, occurs in a string, you can also do

if 'hello' in stringToMatch: 
    ... # Match found , do something

To find several different strings, you can also use findall

>>> import re
>>> toMatch = 'e3e3e3eeehellloqweweemeeeeefe'
>>> regex = re.compile("hello|me", re.IGNORECASE)
>>> print(regex.findall(toMatch))
['me']
>>> toMatch = 'e3e3e3eeehelloqweweemeeeeefe'
>>> print(regex.findall(toMatch))
['hello', 'me']
>>> toMatch = 'e3e3e3eeeHelLoqweweemeeeeefe'
>>> print(regex.findall(toMatch))
['HelLo', 'me']

Source

2010-10-18 18:41:23 pyfunc

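That works, but I still need the regex functionality that returns a set of matches, because the words in the string are sometimes uppercase and sometimes lowercase – Joe 2010-10-18 18:44:24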

@Joe: In that case you can use the regex | operator. See my edited answer – pyfunc 2010-10-18 18:58:02

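You say you want to search for words. What is your definition of a "word"? If you are looking for "meet", do you really want to match the "meet" in "meeting"? If not, you might want to try something like this: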

>>> import re 
>>> query = ("meet", "lot") 
>>> text = "I'll meet a lot of friends including Charlotte at the town meeting" 
>>> regex = r"\b(" + "|".join(query) + r")\b" 
>>> re.findall(regex, text, re.IGNORECASE) 
['meet', 'lot'] 
>>>

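The \b at each end forces it to match only at word boundaries, using re's definition of a "word": "isn't" is not one word, it is two words separated by an apostrophe. If you don't like that, take a look at the nltk package.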
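To get back the set of matched words regardless of case, and to see how `re` treats the apostrophe, something along these lines may help. It is a sketch assuming Python 3; `find_words` is a hypothetical helper name, not something from the answers above.

```python
import re

def find_words(text, words):
    """Return the set of whole words from `words` found in `text`, lowercased.

    find_words is a hypothetical helper, not part of the re module.
    """
    # \b on both sides restricts matches to word boundaries;
    # re.escape keeps any metacharacters in the words literal.
    alternation = "|".join(re.escape(word) for word in words)
    pattern = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return {match.lower() for match in pattern.findall(text)}

text = "I'll MEET a lot of friends including Charlotte at the town meeting"
found = find_words(text, ["meet", "lot"])
print(sorted(found))

# The apostrophe is not a word character, so "isn't" splits in two.
pieces = re.findall(r"\w+", "isn't")
print(pieces)
```

Note that \b only behaves as expected when each search word starts and ends with a letter, digit, or underscore; a word such as "c++" would need different boundary handling.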

Source

2010-10-18 21:29:40
