Regex to match part of a string contained in another string in Python

Is there a way to check whether any part of one string matches another string in Python?

For example, my URLs look like this:

url = pd.DataFrame({'urls' : ['www.amazon.com/ANASTASIA-Beverly...Brow/dp/B00GI21NZA', 'www.ulta.com/beautyservices/benefitbrowbar/']})

and I have a string that looks like this:

string_list = ['Benefit Cosmetics', 'Anastasia Beverly Hills'] 
string = '|'.join(string_list)

I want to match string against the URLs, so that:

Anastasia Beverly Hills matches www.amazon.com/ANASTASIA-Beverly...Brow/dp/B00GI21NZA, and

Benefit Cosmetics matches www.ulta.com/beautyservices/benefitbrowbar/.

I have been trying url['urls'].str.contains('('+string+')', case = False), but it does not match anything.
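A minimal sketch of why that pattern finds nothing, reproduced with the standard `re` module instead of pandas (the variable name `phrase_hits` is only illustrative). Each alternative in the pattern is a whole phrase, spaces included, and neither URL contains such a phrase:

```python
import re

urls = [
    'www.amazon.com/ANASTASIA-Beverly...Brow/dp/B00GI21NZA',
    'www.ulta.com/beautyservices/benefitbrowbar/',
]
string_list = ['Benefit Cosmetics', 'Anastasia Beverly Hills']

# Same pattern as in the attempt: '(Benefit Cosmetics|Anastasia Beverly Hills)'
pattern = '(' + '|'.join(string_list) + ')'

# A URL matches only if it contains one of the full phrases verbatim
# (ignoring case). The URLs contain single words, never the full phrase.
phrase_hits = [bool(re.search(pattern, u, flags=re.IGNORECASE)) for u in urls]
print(phrase_hits)
```

Matching on the individual words of each phrase, rather than on the phrase as a whole, is what the expected result calls for.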

What is the correct way to do this?

Source

2016-12-14 vagabond

Check out: http://www.pythontutor.com/visualize.html#mode=edit

I couldn't do this as a one-line regex, but here is my attempt using itertools and any:

import pandas as pd
from itertools import product

url = pd.DataFrame({'urls' : ['www.amazon.com/ANASTASIA-Beverly...Brow/dp/B00GI21NZA', 'www.ulta.com/beautyservices/benefitbrowbar/']})
string_list = ['Benefit Cosmetics', 'Anastasia Beverly Hills']

# Walk the Cartesian product (every combination) of string_list and the URLs.
for phrase, link in product(string_list, url['urls']):
    # Check whether any word of the phrase occurs in the URL, ignoring case.
    if any(word.lower() in link.lower() for word in phrase.split()):
        # Show the match.
        print("Match String: %s URL: %s" % (phrase, link))

Output:

Match String: Benefit Cosmetics URL: www.ulta.com/beautyservices/benefitbrowbar/ 
Match String: Anastasia Beverly Hills URL: www.amazon.com/ANASTASIA-Beverly...Brow/dp/B00GI21NZA
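If the matches need to be collected rather than printed, the same word-by-word check can be wrapped in a small function. This is only a sketch using the standard library; `match_brands` and the shape of its return value are hypothetical choices, not part of the original answer:

```python
from itertools import product


def match_brands(brands, urls):
    """Map each brand to the URLs that contain at least one of its words."""
    matches = {brand: [] for brand in brands}
    for brand, link in product(brands, urls):
        # Case-insensitive substring test, one word of the brand at a time.
        if any(word.lower() in link.lower() for word in brand.split()):
            matches[brand].append(link)
    return matches


urls = [
    'www.amazon.com/ANASTASIA-Beverly...Brow/dp/B00GI21NZA',
    'www.ulta.com/beautyservices/benefitbrowbar/',
]
brands = ['Benefit Cosmetics', 'Anastasia Beverly Hills']

result = match_brands(brands, urls)
for brand, links in result.items():
    print(brand, '->', links)
```

Note that this is plain substring matching, so a generic word in a brand name (for example "Hills") would also match any unrelated URL that happens to contain it.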

Update:

Depending on how you look at it, you could alternatively use:

import pandas as pd
import warnings

pd.set_option('display.width', 100)
# Suppress the warning pandas gives because the pattern has a match group.
warnings.filterwarnings("ignore", 'This pattern has match groups')
string_list = ['Benefit Cosmetics', 'Anastasia Beverly Hills']
# Create a pandas DataFrame.
url = pd.DataFrame({'urls' : ['www.amazon.com/ANASTASIA-Beverly...Brow/dp/B00GI21NZA', 'www.ulta.com/beautyservices/benefitbrowbar/']})
# Use one string at a time.
for string in string_list:
    # Join the individual words of the string with a pipe
    # to create a regex pattern.
    s = "|".join(string.split())
    # Add a column that is True where the regex matches the URL
    # and False where it does not.
    url[string] = url['urls'].str.contains('('+s+')', case=False)
# Show the result.
print(url)

This will output:

                                                urls  Benefit Cosmetics  Anastasia Beverly Hills
0  www.amazon.com/ANASTASIA-Beverly...Brow/dp/B00...              False                     True
1        www.ulta.com/beautyservices/benefitbrowbar/               True                    False

I guess this might be better if you want the result in a DataFrame, but I prefer the first way.
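A possible refinement of the regex approach, sketched here with the standard `re` module only: escape each word with `re.escape` so punctuation in a name is treated literally, and group the alternatives with a non-capturing `(?:...)` group. The helper name `words_pattern` and the `table` dictionary are hypothetical, introduced just for this illustration:

```python
import re


def words_pattern(phrase):
    """Compile a case-insensitive regex matching any single word of `phrase`."""
    words = [re.escape(word) for word in phrase.split()]
    # (?:...) groups the alternatives without creating a capture group.
    return re.compile('(?:' + '|'.join(words) + ')', flags=re.IGNORECASE)


urls = [
    'www.amazon.com/ANASTASIA-Beverly...Brow/dp/B00GI21NZA',
    'www.ulta.com/beautyservices/benefitbrowbar/',
]
string_list = ['Benefit Cosmetics', 'Anastasia Beverly Hills']

# One list of booleans per phrase, in the same order as `urls`.
table = {
    phrase: [bool(words_pattern(phrase).search(u)) for u in urls]
    for phrase in string_list
}
print(table)
```

The pattern text (`words_pattern(phrase).pattern`) should also work as the argument to `url['urls'].str.contains(..., case=False)`. Because it contains no capture group, the match-groups warning should not be raised, so the `warnings` filter would no longer be needed.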

Source

2016-12-14 23:39:08
