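Exporting my web scraping results from Python
This is probably very simple, but I am very new to Python and have no idea where to start.
I wrote some code that successfully scrapes the data I want from a web page. My problem now is that I don't know how to export it to CSV. Here is what my code looks like: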
import requests
import csv
from bs4 import BeautifulSoup

# Fetch result pages 1 and 2 of the listing search
for numb in range(1, 3):
    urls = "http://www.blocket.se/bostad/uthyres?cg_multi=3020&cg_multi=3100&cg_multi=3120&cg_multi=3060&cg_multi=3070&sort=&ss=&se=&ros=&roe=&bs=&be=&mre=&q=&q=&q=&save_search=1&l=0&md=th&o=" + str(numb) + "&f=p&f=c&f=b&ca=11&w=3"
    r = requests.get(urls)
    soup = BeautifulSoup(r.text, 'html.parser')
    # Each listing is a <div> marked up as a schema.org Offer
    data = soup.find_all("div", {"itemtype": "http://schema.org/Offer"})
    for item in data:
        # Print each field if present; a missing field is silently skipped
        try:
            print(item.contents[3].find_all("span", {"class": "subject-param category"})[0].text)
        except:
            pass
        try:
            print(item.contents[3].find_all("span", {"class": "subject-param address separator"})[0].text)
        except:
            pass
        try:
            print(item.contents[3].find_all("span", {"class": "li_detail_params first rooms"})[0].text)
        except:
            pass
        try:
            print(item.contents[3].find_all("span", {"class": "li_detail_params monthly_rent"})[0].text)
        except:
            pass
        try:
            print(item.contents[3].find_all("span", {"class": "li_detail_params size"})[0].text)
        except:
            pass
        try:
            print(item.contents[3].find_all("span", {"class": "li_detail_params first weekly_rent_offseason"})[0].text)
        except:
            pass
And it prints this:
lägenhet
Stockholms stad - Bromma
1 rum
4 000 kr/mån
villa
Linköping
100 m²
lägenhet
Stockholms stad - Maria, Gamla Stan, Högalid
1 rum
8 000 kr/mån
36 m²
lägenhet
Stockholms stad - Hägersten, Liljeholmen
1 rum
7 500 kr/mån
26 m²
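Of course this is not the nicest output, but I don't really care about that. Can someone show me how to export this to CSV? As I said, I don't even know where to start.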
Have you even tried Google? – PascalVKooten 2015-04-02 21:31:16
Don't catch every exception; 'except: pass' is never a good idea – 2015-04-02 21:37:06
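One possible way to approach the export, offered as a minimal sketch rather than a definitive answer: collect each listing into a dict and write the dicts out with the standard library's csv.DictWriter. It assumes Python 3 and that item.contents[3] is a tag, as in the question's code. The names FIELDS, first_text, listing_to_row and write_rows are hypothetical helpers introduced here for illustration; the CSS class strings are copied from the question. Checking whether find_all returned anything also removes the need for the bare 'except: pass' that the comment above objects to.

```python
import csv

# Column name -> CSS class string, copied from the scraping code in the question.
FIELDS = {
    "category": "subject-param category",
    "address": "subject-param address separator",
    "rooms": "li_detail_params first rooms",
    "monthly_rent": "li_detail_params monthly_rent",
    "size": "li_detail_params size",
    "weekly_rent_offseason": "li_detail_params first weekly_rent_offseason",
}


def first_text(node, css_class):
    """Return the stripped text of the first matching <span>, or "" if none."""
    matches = node.find_all("span", {"class": css_class})
    return matches[0].text.strip() if matches else ""


def listing_to_row(item):
    """Build one CSV row (a dict) from one listing element (hypothetical helper)."""
    # Assumption: the details live in the fourth child, as in the question.
    if len(item.contents) <= 3:
        return {name: "" for name in FIELDS}
    details = item.contents[3]
    return {name: first_text(details, css) for name, css in FIELDS.items()}


def write_rows(rows, path):
    """Write a list of row dicts to a CSV file, header line first."""
    # newline="" is what the csv module documentation recommends on Python 3;
    # utf-8 keeps characters such as "ä" and "²" intact.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(FIELDS))
        writer.writeheader()
        writer.writerows(rows)
```

In the question's loop, the print calls could be replaced by rows.append(listing_to_row(item)), with rows = [] created before the outer loop and a single write_rows(rows, "listings.csv") after it; the file name is only a placeholder. Missing fields become empty cells, so every listing keeps the same columns.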