Python - Check if a word is in a string

210

What is wrong with:

if word in mystring:
    print('success')

Source

2011-03-16 01:13:09 fabrizioM

+54

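As a caution: if you have the string "paratyphoid is bad" and you test "typhoid" in "paratyphoid is bad", you will get True. – 2012-12-19 17:52:08

+1

Does anyone know how to overcome this problem? – user2567857 2014-08-19 09:36:16

+3

@user2567857, regular expressions - see Hugh Bothwell's answer. – 2014-08-21 19:23:08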

13

find returns an integer representing the index at which the search item was found. If it is not found, it returns -1.

haystack = 'asdf'
needle = 'a'  # example search term

haystack.find('a')  # result: 0
haystack.find('s')  # result: 1
haystack.find('g')  # result: -1

if haystack.find(needle) >= 0:
    print('Needle found.')
else:
    print('Needle not found.')
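For comparison, here is a small sketch of the three standard ways to test for a substring. The variable names are only illustrative; when you just need a yes/no answer, `in` is usually the more readable choice, while find() and index() are useful when you also need the position.

```python
haystack = 'asdf'

# `in` answers "is it there?" directly, without comparing against -1.
found = 's' in haystack

# find() gives the position, or -1 when the substring is absent.
position = haystack.find('s')
missing = haystack.find('g')

# index() behaves like find() but raises ValueError instead of returning -1.
try:
    haystack.index('g')
    raised = False
except ValueError:
    raised = True

print(found, position, missing, raised)
```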

Source

2011-03-16 01:13:14

109

if 'seek' in 'those who seek shall find': 
    print('Success!')

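but keep in mind that this matches a sequence of characters, not necessarily a whole word - for example, 'word' in 'swordsmith' is True. If you only want to match whole words, you should use regular expressions: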

import re 

def findWholeWord(w): 
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search 

findWholeWord('seek')('those who seek shall find') # -> <match object> 
findWholeWord('word')('swordsmith')     # -> None

Source

2011-03-16 01:52:56

+6

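You can also search for the word with a space on each side, but that only works when the word is surrounded by spaces :P – Deviljho 2014-02-27 19:20:07

+0

Is there a really fast way to search for multiple words, say a set of several thousand words, without having to construct a for loop over every word? I have a million sentences and a million terms to search for, and I want to see which sentence contains which terms. Currently it takes days to process, and I would like to know whether there is a faster way. – Tom 2016-12-27 19:49:39

+0

@Tom Try using grep instead of Python regular expressions – 2017-02-03 22:57:14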
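For the many-terms case raised in the comments above, one possible approach (a sketch, not a benchmarked solution) is to tokenize each sentence once and intersect the tokens with a set of terms, so that each lookup is a hash check rather than a scan. The function name, the `\w+` tokenization, and the lowercasing are assumptions, and this only handles single-word terms.

```python
import re

def words_in_sentences(sentences, terms):
    """Map each sentence index to the set of search terms it contains as whole words."""
    term_set = {t.lower() for t in terms}
    result = {}
    for i, sentence in enumerate(sentences):
        # \w+ tokenization is an assumption; adapt it to your data.
        tokens = set(re.findall(r"\w+", sentence.lower()))
        hits = tokens & term_set
        if hits:
            result[i] = hits
    return result

sentences = ["Those who seek shall find", "The swordsmith rests", "Seek and find"]
terms = ["seek", "word", "find"]
matches = words_in_sentences(sentences, terms)
print(matches)
```

The second sentence yields no entry because "word" only appears inside "swordsmith", not as a token of its own.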

6

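If matching a sequence of characters is not sufficient and you need to match whole words, here is a simple function that gets the job done. It basically adds spaces where necessary and searches for the result in the string: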

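def smart_find(haystack, needle):
    # the whole string is the word
    if haystack == needle:
        return True
    # word at the start, followed by a space
    if haystack.startswith(needle + " "):
        return True
    # word at the end, preceded by a space
    if haystack.endswith(" " + needle):
        return True
    # word in the middle, surrounded by spaces
    if haystack.find(" " + needle + " ") != -1:
        return True
    return False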

This assumes that commas and other punctuation have already been stripped out.
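One way to do that stripping beforehand, as a rough sketch: replace ASCII punctuation with spaces using str.translate, then collapse the whitespace. The helper name strip_punctuation is made up for illustration, and note that this also splits contractions such as "don't" into "don t".

```python
import string

def strip_punctuation(text):
    """Replace ASCII punctuation with spaces and collapse runs of whitespace."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return " ".join(text.translate(table).split())

cleaned = strip_punctuation("Seek, and you shall find!")
print(cleaned)

# After cleaning, a space-padded whole-word check behaves as expected.
has_find = " find " in " " + cleaned + " "
print(has_find)
```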

Source

2012-06-15 07:23:00 daSong

+0

This solution works best for my case, because I am working with tokenized, space-separated strings. – Avijit 2016-01-04 05:05:47

9

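This small function compares all the search words against the given text. If all search words are found in the text, it returns the number of search words; otherwise it returns False.

It also supports unicode string search.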

def find_words(text, search):
    """Find exact words"""
    dText = text.split()
    dSearch = search.split()

    found_word = 0

    # count every (text word, search word) pair that matches exactly
    for text_word in dText:
        for search_word in dSearch:
            if search_word == text_word:
                found_word += 1

    if found_word == len(dSearch):
        return len(dSearch)
    else:
        return False

Usage:

find_words('çelik güray ankara', 'güray ankara')
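As a variation, the counting loop can be thrown off when a word repeats in the text (each repeat is counted again). A set-based check sidesteps that. This is only a sketch; the name contains_all_words is hypothetical, and it returns a plain bool rather than a count.

```python
def contains_all_words(text, search):
    """Return True if every whitespace-separated word in `search` occurs in `text`."""
    return set(search.split()) <= set(text.split())

print(contains_all_words('çelik güray ankara', 'güray ankara'))  # True
print(contains_all_words('ankara ankara', 'güray ankara'))       # False
```

Note that an empty search string is treated as trivially contained.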

Source

2012-06-22 22:51:30

1

You could just add a space before and after "word".

x = input("Type your word: ")
if " word " in x:
    print("Yes")
else:
    print("Nope")

This way it looks for the spaces before and after "word".

>>> Type your word: Swordsmith
>>> Nope
>>> Type your word: a word here
>>> Yes

Source

2015-02-26 14:23:44 PyGuy

+0

But what if the word is at the beginning or end of the sentence (with no space)? – MikeL 2016-12-13 11:25:14

12

If you want to find out whether a whole word is in a space-separated list of words, simply use:

def contains_word(s, w): 
    return (' ' + w + ' ') in (' ' + s + ' ') 

contains_word('the quick brown fox', 'brown') # True 
contains_word('the quick brown fox', 'row') # False

This elegant method is also the fastest. Compared with Hugh Bothwell's and daSong's approaches:

>python -m timeit -s "def contains_word(s, w): return (' ' + w + ' ') in (' ' + s + ' ')" "contains_word('the quick brown fox', 'brown')" 
1000000 loops, best of 3: 0.351 usec per loop 

>python -m timeit -s "import re" -s "def contains_word(s, w): return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search(s)" "contains_word('the quick brown fox', 'brown')" 
100000 loops, best of 3: 2.38 usec per loop 

>python -m timeit -s "def contains_word(s, w): return s.startswith(w + ' ') or s.endswith(' ' + w) or s.find(' ' + w + ' ') != -1" "contains_word('the quick brown fox', 'brown')" 
1000000 loops, best of 3: 1.13 usec per loop
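If you prefer to rerun the comparison from a script rather than the command line, a sketch using the timeit module could look like the following. The function names padded, regex, and edges are just labels for the three approaches, and the absolute numbers will differ by machine and Python version.

```python
import re
import timeit

def padded(s, w):
    # pad both strings with spaces so words at either end still match
    return (' ' + w + ' ') in (' ' + s + ' ')

def regex(s, w):
    # word-boundary regex, case-insensitive
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search(s) is not None

def edges(s, w):
    # explicit start / end / middle checks
    return s.startswith(w + ' ') or s.endswith(' ' + w) or s.find(' ' + w + ' ') != -1

s, w = 'the quick brown fox', 'brown'
loops = 100000
for fn in (padded, regex, edges):
    seconds = timeit.timeit(lambda: fn(s, w), number=loops)
    print('{0}: {1:.3f} usec per loop'.format(fn.__name__, seconds / loops * 1e6))
```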

Source

2016-04-11 20:32:33 user200783

+1

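This is my favorite answer :) – IanS 2016-08-11 13:16:21

+0

I agree, but the fastest solution does not ignore case the way re.compile(... does. – 2016-09-19 20:31:27

+0

This has several problems: (1) words at the end (2) words at the beginning (3) words in the middle, like contains_word("says", "Simon says: Don't use this answer") – 2017-08-09 09:53:21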

2

A more advanced way to check for an exact word in a very long string:

import re
text = "This text was of edited by Rock"
# try this string also
# text = "This text was officially edited by Rock"
# re.search returns None when there is no whole-word match
if re.search(r"\bof\b", text):
    print("Present")
else:
    print("Absent")
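re.finditer is most useful when you want to know where each whole-word match occurs, not just whether one exists. A small sketch (the sample text here is made up for illustration):

```python
import re

text = "This text was of edited by Rock, of course"

# Collect the start offset of every whole-word occurrence of "of".
positions = [m.start() for m in re.finditer(r"\bof\b", text)]
print(positions)  # [14, 33]

# "officially" contains "of" but not as a whole word, so nothing is found.
none_found = [m.start() for m in re.finditer(r"\bof\b", "This text was officially edited")]
print(none_found)  # []
```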

Source

2016-11-02 08:39:22 Rameez

5

You can split the string into words and check the resulting list.

if word in string.split():
    print('success')

Source

2016-12-01 18:26:47 Corvax

+2

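Please use the [edit] link to explain how this code works instead of giving only the code, since an explanation is more likely to help future readers. – 2016-12-01 19:55:28

+0

This should be the actual answer for matching whole words. – 2017-06-16 19:52:22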

1

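Since you are asking for a word and not for a string, I would like to present a solution that takes prefixes/suffixes into account and ignores case: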

#!/usr/bin/env python 

import re 


def is_word_in_text(word, text): 
    """ 
    Check if a word is in a text. 

    Parameters 
    ---------- 
    word : str 
    text : str 

    Returns 
    ------- 
    bool : True if word is in text, otherwise False. 

    Examples 
    -------- 
    >>> is_word_in_text("Python", "python is awesome.") 
    True 

    >>> is_word_in_text("Python", "camelCase is pythonic.") 
    False 

    >>> is_word_in_text("Python", "At the end is Python") 
    True 
    """ 
    pattern = r'(^|[^\w]){}([^\w]|$)'.format(word) 
    pattern = re.compile(pattern, re.IGNORECASE) 
    matches = re.search(pattern, text) 
    return bool(matches) 


if __name__ == '__main__': 
    import doctest 
    doctest.testmod()

If your words may contain regex special characters (such as +), then you need re.escape(word).
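A minimal sketch of that escaped variant, assuming the same pattern as above. The function name is_word_in_text_escaped is hypothetical; the only change is that the word goes through re.escape before being placed in the pattern, so characters like + are matched literally.

```python
import re

def is_word_in_text_escaped(word, text):
    """Check for `word` in `text`, treating `word` as literal text."""
    pattern = r'(^|[^\w]){}([^\w]|$)'.format(re.escape(word))
    return bool(re.search(pattern, text, re.IGNORECASE))

print(is_word_in_text_escaped("C++", "I like C++ a lot."))  # True
print(is_word_in_text_escaped("C++", "I like C a lot."))    # False
```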

Source

2017-08-09 10:11:57

0

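Using a regular expression is the general solution, but it is overly complicated for this case.

You can simply split the text into a list of words using the split(separator, num) method. It returns a list of all the words in the string, using separator as the delimiter. If separator is not specified, it splits on all whitespace (optionally, you can limit the number of splits to num).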

list_of_words = mystring.split()
if word in list_of_words:
    print('success')

This will not work for strings containing commas and similar punctuation. For example:

mystring = "One,two and three" 
# will split into ["One,two", "and", "three"]

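If you also want to split on commas and other punctuation, note that split treats its separator argument as one literal string, not as a set of characters, so use re.split with a character class instead:

import re

# whitespace chars: space, tab, newline, return, formfeed; then common punctuation
list_of_words = re.split(r"[ \t\n\r\f,.;!?'\"()]+", mystring)
if word in list_of_words:
    print('success')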
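An alternative is to extract runs of word characters instead of listing every separator. This is only a sketch: `\w+` is an assumption about what counts as a word, and it splits contractions such as "don't" into two tokens.

```python
import re

mystring = "One,two and three"

# Pull out runs of letters, digits, and underscores; everything else separates words.
list_of_words = re.findall(r"\w+", mystring)
print(list_of_words)           # ['One', 'two', 'and', 'three']
print('two' in list_of_words)  # True
```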

Source

2017-12-18 11:44:51 tstempko
