Check whether Python list items contain a string from another list


I have a list of countries, like this:

country = ["england","france","germany"]

I want to use this list to check the values in another list of strings, such as:

urllist = ["http://uk.soccerway.com/matches/2017/02/22/germany/oberliga/tus-mechtersheim-1914/hertha-wiesbach/2300594/head2head/","http://uk.soccerway.com/matches/2017/02/22/india/u18-league/delhi-united-sc-u18/sudeva-u18/2397728/head2head/","http://uk.soccerway.com/matches/2017/02/22/england/championship/bristol-city-fc/fulham-football-club/2247116/head2head/"]

The second value in urllist should be removed, because it contains the value india, which is not in the country list, giving the final result:

urllist = ["http://uk.soccerway.com/matches/2017/02/22/germany/oberliga/tus-mechtersheim-1914/hertha-wiesbach/2300594/head2head/","http://uk.soccerway.com/matches/2017/02/22/england/championship/bristol-city-fc/fulham-football-club/2247116/head2head/"]


2017-02-22 Jarratt

So that is what you want to do. What exactly is the problem you run into when you try to do it? – csmckelvey

You should use the split function here, and then check whether the country found at that position in the URL is in the allowed list.

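s = 'http://a/date/france/other'
country = s.split('/')[4]  # 'france'; adapt the index to your case
countries = ["england", "france", "germany"]

# In the question's URLs the country is the 8th piece, so the index is 7 rather than 4.
interesting_urls = [url for url in urllist if url.split('/')[7] in countries]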

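This avoids validating the country with a plain in test, since a link for India could still mention something like "england-team" elsewhere in the URL.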
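
As a minimal sketch of the same idea, the path can be taken apart with the standard library's urllib.parse.urlsplit instead of splitting the whole URL on '/'. The helper name country_segment and the default position 4 are assumptions for illustration, chosen to fit the URLs in the question; adjust the position for other layouts.

```python
from urllib.parse import urlsplit

countries = ["england", "france", "germany"]
urllist = [
    "http://uk.soccerway.com/matches/2017/02/22/germany/oberliga/tus-mechtersheim-1914/hertha-wiesbach/2300594/head2head/",
    "http://uk.soccerway.com/matches/2017/02/22/india/u18-league/delhi-united-sc-u18/sudeva-u18/2397728/head2head/",
    "http://uk.soccerway.com/matches/2017/02/22/england/championship/bristol-city-fc/fulham-football-club/2247116/head2head/",
]

def country_segment(url, position=4):
    """Return the path segment at `position`, or None if the path is too short.

    Hypothetical helper: position 4 fits /matches/YYYY/MM/DD/<country>/...
    """
    segments = [part for part in urlsplit(url).path.split("/") if part]
    return segments[position] if len(segments) > position else None

# Keep only the URLs whose country segment is in the allowed list.
interesting_urls = [url for url in urllist if country_segment(url) in countries]
```

With the three URLs from the question, this keeps the germany and england links and drops the india one. A URL whose path is too short is skipped rather than raising IndexError.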


2017-02-22 18:16:20 cgte

This will not work if the country appears in a different part of the URL... – Nemo

Of course, this assumes that all the URLs follow the same pattern when split this way. – cgte
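
If the country's position can vary, one option is to compare every path segment against the allowed names instead of reading one fixed index. This is only a sketch and not part of the answer above: has_allowed_country is a hypothetical helper, the example.com URLs are invented, and it still requires the country to be a whole path segment.

```python
from urllib.parse import urlsplit

countries = {"england", "france", "germany"}

def has_allowed_country(url, allowed=countries):
    """Return True if any whole path segment of `url` is an allowed country."""
    segments = urlsplit(url).path.split("/")
    return any(segment in allowed for segment in segments)

# Invented URLs: the country sits at different positions in the first two.
urls = [
    "http://example.com/2017/germany/oberliga/",
    "http://example.com/germany/2017/oberliga/",
    "http://example.com/2017/india/england-team/",
]
matching = [url for url in urls if has_allowed_country(url)]
```

The third URL is rejected because "england-team" is not equal to any allowed name, which is the case a plain substring test would get wrong.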

You can do this quite easily with a list comprehension:

urllist_new = set([url for url in urllist for cnty in country if cnty in url])

This is equivalent to

urllist_new = []
for cnty in country:
    for url in urllist:
        if cnty in url:
            urllist_new.append(url)
urllist_new = set(urllist_new)


2017-02-22 18:02:16 Nemo

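The devil is always in the details: this runs into trouble if a URL matches two countries (either because the list has overlapping names such as us and belarus, or because the URL looks like www.americanews.com/germany), and it also does not handle letter case. – Foon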

I'm not sure I follow. That would only introduce duplicates into the list. Still, I have added a fix for the duplicates. – Nemo
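
To make the duplicate issue from this exchange concrete, here is a small sketch. The URL that mentions two countries is invented for illustration. The nested comprehension emits it once per matching country; set removes the repeats but does not promise to keep the original order, while dict.fromkeys (Python 3.7+) keeps first-seen order.

```python
country = ["england", "france", "germany"]

# Hypothetical URLs: the first one mentions two of the countries.
urllist = [
    "http://example.com/england/vs/germany/",
    "http://example.com/india/u18-league/",
]

# One entry per (url, country) pair that matches, so the first URL is listed twice.
with_duplicates = [url for url in urllist for cnty in country if cnty in url]

unordered = set(with_duplicates)                 # unique, order not guaranteed
ordered = list(dict.fromkeys(with_duplicates))   # unique, first-seen order kept
```

Testing each URL once with any(...) avoids creating the duplicates in the first place.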

A simple list comprehension will do it:

output = [i for k in country for i in urllist if k in i]


2017-02-22 18:03:35

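You can use the membership operator in to check whether a string contains a substring. So loop through country and check whether each element occurs in each URL of urllist.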

[url for c in country for url in urllist if c in url]
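
A short sketch of what the in check does and does not catch. The sample URL is made up; the point is that in on strings is a plain, case-sensitive substring test, so it also matches a country name embedded in a longer word.

```python
# Invented URL with "England" as a path segment and "england" inside a longer word.
url = "http://example.com/matches/England/new-england-fc/"

inside_longer_word = "england" in url        # True: found inside "new-england-fc"
exact_case = "England" in url                # True: matches the capitalised segment
wrong_case = "ENGLAND" in url                # False: the test is case-sensitive
ignoring_case = "england" in url.lower()     # True: lowercase the URL first
```

Lowercasing both sides before the test is one simple way to ignore case; it does nothing about names embedded in longer words.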


2017-02-22 18:07:39 Trelzevir

Here is another variation, this time using any, which makes the intent clearer:

>>> [url for url in urllist if any(c in url for c in country)] 
['http://uk.soccerway.com/matches/2017/02/22/germany/oberliga/tus-mechtersheim-1914/hertha-wiesbach/2300594/head2head/', 'http://uk.soccerway.com/matches/2017/02/22/england/championship/bristol-city-fc/fulham-football-club/2247116/head2head/'] 
>>>

You can also build a regular expression to use with the re module, if you want to specify where in the URL the country has to match:

>>> import re 
>>> exp=r"([^/]+/+){6}"+ "({})".format("|".join(country)) 
>>> exp 
'([^/]+/+){6}(england|france|germany)' 
>>> [ url for url in urllist if re.match(exp, url) ] 
['http://uk.soccerway.com/matches/2017/02/22/germany/oberliga/tus-mechtersheim-1914/hertha-wiesbach/2300594/head2head/', 'http://uk.soccerway.com/matches/2017/02/22/england/championship/bristol-city-fc/fulham-football-club/2247116/head2head/'] 
>>>

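To explain the expression:

[^/]+ means anything that is not a /, one or more times
/+ is one or more /
([^/]+/+){6} here I ask for exactly 6 such groups, like */*/*/*/*/*/ or, in this case, *//*/*/*/*/*/
the rest should be self-explanatory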
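
The pieces described above can be checked one at a time. This sketch rebuilds the same pattern with two changes of my own: re.escape is applied to each name (harmless for these names, safer if a name ever contains a regex metacharacter), and the repeated group is non-capturing so that group 1 is the country. The variable names are assumptions for illustration.

```python
import re

country = ["england", "france", "germany"]
url = "http://uk.soccerway.com/matches/2017/02/22/germany/oberliga/tus-mechtersheim-1914/hertha-wiesbach/2300594/head2head/"

# Six "run of non-slashes followed by slashes" groups, then one of the country names.
alternatives = "|".join(re.escape(name) for name in country)
pattern = re.compile(r"(?:[^/]+/+){6}(" + alternatives + r")")

match = pattern.match(url)
prefix = url[:match.start(1)]   # everything consumed by the six groups
captured = match.group(1)       # the country name that matched
```

Here prefix is "http://uk.soccerway.com/matches/2017/02/22/" and captured is "germany". Because re.match only anchors the start of the string, a segment that merely begins with a country name, such as germany-u21, would also match; requiring a / right after the group is one way to insist on a whole segment.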


2017-02-22 18:15:39 Copperfield
