2017-09-23 161 views
0

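Python script - grouping words into an if-not statement

I'm trying to figure out how to write an if statement that groups three or four words, so that lines containing them are omitted from a CSV file. At the bottom of the code you'll see where I'm stuck: if ('reddit', 'passwords') not in x:

Any help would be great.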

# import libraries 
import bs4 
from urllib2 import urlopen as uReq 
from bs4 import BeautifulSoup as soup 

my_url = 'https://www.reddit.com/r/NHLStreams/comments/71uhwi/game_thread_sabres_at_maple_leafs_730_pm_et/' 

# opening up connection, grabbing the page 
uClient = uReq(my_url) 
page_html = uClient.read() 
uClient.close() 

# html parsing 
page_soup = soup(page_html, "html.parser") 


filename = "sportstreams.csv" 
f = open(filename, "w") 
headers = "Sport Links " + "\n" 
f.write(headers) 

links = page_soup.select("form a[href]") 
for link in links: 
    href = link["href"] 
    print(href) 

    f.write(href + "\n") 



with open('sportstreams.csv') as f,open('sstream.csv', "w") as f2: 
    for x in f: 
     if ('reddit', 'passwords') not in x: # trying to find multi words to omit 
      f2.write(x.strip()+'\n') 
+0

It's unclear what you want 'if (...) not in x' to do. Must all of the elements be missing from 'x', or is any one of them enough to trigger the 'if'?

+0

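I guess my code is weak. I'm trying to trim my results by omitting any line that contains "reddit", "/r/", or "/password". That would shorten my list of links, which would count as success for me. :)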

+1

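Please edit your question so that the explanation is complete. It would help if you showed concrete examples of lines you want to omit alongside lines you want to keep.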

Answers

1

Use the built-in function all:

if all(t not in x for t in ('reddit', 'passwords')): 

Or any:

if not any(t in x for t in ('reddit', 'passwords')): 
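As a minimal sketch of why the two forms above give the same result, the snippet below runs both on a few made-up sample lines (the URLs and the helper names `keep_all`, `keep_any`, and `tuple_test_fails` are hypothetical, not from the question). It also shows that the original test, a tuple on the left of `not in` with a string on the right, does not check each word separately: Python raises a `TypeError` instead.

```python
# Hypothetical sample lines, not taken from the real CSV.
BLOCKED = ('reddit', 'passwords')
samples = [
    'https://www.reddit.com/r/NHLStreams/',
    'https://example.com/passwords/reset',
    'https://example.com/stream/1',
]

def keep_all(line):
    # True only when every blocked word is absent from the line.
    return all(t not in line for t in BLOCKED)

def keep_any(line):
    # Same result, phrased as "no blocked word is present".
    return not any(t in line for t in BLOCKED)

def tuple_test_fails(line):
    # A tuple on the left of `in` with a str on the right raises TypeError.
    try:
        BLOCKED not in line
    except TypeError:
        return True
    return False

for s in samples:
    # Both helpers agree on every sample line.
    assert keep_all(s) == keep_any(s)
    print(s, keep_all(s))
```

Either form works; `not any(...)` tends to read more naturally when the intent is "skip the line if any blocked word appears".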

Here is how it looks inside your context manager:

with open('sportstreams.csv') as f, open('sstream.csv', "w") as f2:
    for line in f:
        if any(t in line for t in ('reddit', 'passwords')):
            # The line contains one of the strings.
            continue
        else:
            # The line contains none of the strings.
            f2.write(line.strip() + '\n')
+0

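That's not working for me, my friend. What might I be doing wrong? I replaced 'if ('reddit', 'passwords') not in x:' with what you wrote. It doesn't omit the lines containing reddit or passwords. :(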

+1

@JamesDean That is partly your own fault; the specification in your question is very unclear.

+0

I'm sorry. What should I do to omit any line that contains these elements (reddit, /r/, /password)?
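One possible way to handle the three substrings named in these comments is sketched below. The names `BLOCKED`, `keep_line`, `filter_lines`, and `filter_file` are hypothetical, and the match is a plain case-sensitive substring test, which is an assumption about what "contains" should mean here.

```python
# Substrings named in the comments; a line containing any of them is dropped.
BLOCKED = ('reddit', '/r/', '/password')

def keep_line(line, blocked=BLOCKED):
    # Keep the line only if none of the blocked substrings occurs in it.
    return not any(t in line for t in blocked)

def filter_lines(lines, blocked=BLOCKED):
    # Strip surrounding whitespace from each line that survives the filter.
    return [ln.strip() for ln in lines if keep_line(ln, blocked)]

def filter_file(src='sportstreams.csv', dst='sstream.csv'):
    # Read src and write the surviving lines to dst, one per line.
    with open(src) as f, open(dst, 'w') as f2:
        for ln in filter_lines(f):
            f2.write(ln + '\n')
```

It may also be worth making sure the scraping step has closed `sportstreams.csv` before `filter_file()` reads it back, for instance by writing the links inside a `with` block, so that every link is on disk at that point.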