2017-05-27 58 views
1

Ebay Webscraper

I want to select only the sale price of each listed item, but this is the closest I have managed to get:

import requests 
from bs4 import BeautifulSoup 

url = 'http://www.ebay.co.uk/sch/i.html?_from=R40&_sacat=0&_nkw=graphics%20card&LH_Complete=1&LH_Sold=1&rt=nc&_trksid=p2045573.m1684' 
r = requests.get(url) 
soup = BeautifulSoup(r.content, 'html.parser') 
Sale_Price = [tag['class'] for tag in soup.find_all("span", class_="bold bidsold")] 
print(Sale_Price) 

This gives me the same pair of class names repeated once for every matched item: [['bold', 'bidsold'], ['bold', 'bidsold'], ['bold', 'bidsold'], ...]

+1

What is confusing about this? You selected the class you searched for... Try accessing something other than tag['class'].

Answer

2

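You are storing the names from the class attribute. The price is in the tag's string. Use get_text() to get that string. The strings contain a lot of whitespace and newlines, so use strip() to remove them.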

import requests 
from bs4 import BeautifulSoup 

url = 'http://www.ebay.co.uk/sch/i.html?_from=R40&_sacat=0&_nkw=graphics%20card&LH_Complete=1&LH_Sold=1&rt=nc&_trksid=p2045573.m1684' 
r = requests.get(url) 
soup = BeautifulSoup(r.content, 'html.parser') 
Sale_Price = [tag.get_text().strip() for tag in soup.find_all("span", class_="bold bidsold")]
print(Sale_Price) 

This gives the output:

['£159.99', '£240.00', '£8.00', '£100.00', '£54.99', '£324.99', '£10.00', '£130.00', '£21.00', '£68.00', '£25.00', '£90.00', '£210.00', '£269.49', '£90.56', '£5.90', '£56.00', '£89.99', '£142.00', '£104.00', '£35.00', '£8.80', '£27.00', '£45.00', '£45.00', '£115.11', '£293.19', '£172.00', '£42.00', '£14.39', '£120.00', '£24.99', '£11.73', '£10.50', '£88.00', '£340.00', '£136.82', '£5.00', '£21.32', '£66.46', '£49.99', '£25.00', '£30.00', '£385.00', '£258.00', '£64.30', '£87.00', '£29.99', '£77.99', '£36.88', '£71.00'] 
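To see the difference without hitting the live site, here is a minimal sketch that runs the two list comprehensions side by side on a small inline snippet. The HTML below is made up for illustration and only assumes that each price sits inside a span whose class attribute is exactly "bold bidsold", as the search above implies; the real eBay page is much larger and its markup may have changed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example (not copied from eBay).
html = """
<ul>
  <li><span class="bold bidsold">
      £159.99</span></li>
  <li><span class="bold bidsold">
      £240.00</span></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")
tags = soup.find_all("span", class_="bold bidsold")

# tag["class"] reads the attribute, so it yields the class names, not the price.
class_names = [tag["class"] for tag in tags]

# get_text() reads the text inside the tag; strip() trims the newlines and spaces.
prices = [tag.get_text().strip() for tag in tags]

print(class_names)
print(prices)
```

The first list reproduces the output from the question, and the second holds the price strings.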

Edit
If you want to drop the £ sign, take each string without its first character.

Sale_Price = [tag.get_text().strip()[1:] for tag in soup.find_all("span", class_="bold bidsold")]
print(Sale_Price) 

This stores only the prices, without the £ sign.
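Slicing with [1:] relies on every string starting with exactly one currency character. If you also want numbers you can add up or sort, something along these lines may help. It is only a sketch: parse_price is a hypothetical helper name, and it assumes that commas are thousands separators (as in '£1,240.00') and that £ is the only symbol present. It uses str.replace, so no regex is needed.

```python
from decimal import Decimal


def parse_price(text, symbol="£"):
    """Hypothetical helper: turn a string such as '£1,240.00' into a Decimal."""
    # Trim surrounding whitespace, then drop the currency symbol wherever it is.
    cleaned = text.strip().replace(symbol, "")
    # Assumption: commas are thousands separators, not decimal marks.
    cleaned = cleaned.replace(",", "")
    return Decimal(cleaned)


# Sample strings; the last one is made up to show the thousands separator.
sale_prices = ["£159.99", "£8.00", "£1,240.00"]
amounts = [parse_price(p) for p in sale_prices]

print(amounts)
print(sum(amounts))
```

Decimal is used instead of float so that the amounts keep their exact two decimal places when summed.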

+0

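Thank you very much. Any idea how to remove the pound sign from each string? I am a beginner and not familiar with regular expressions, but from my searching I gather that re.sub('[£]','',line) is involved somehow...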

+1

@JoshDiamond There is no need for regex. I have updated the code, please take a look.

+0

@JoshDiamond Do you have any other questions?