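I have a listing page, https://cars.mail.ru/reviews/renault/?year=2010-2016, that links to a number of reviews,
and I need to open those reviews from Python. The review URLs look like this: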
https://cars.mail.ru/reviews/renault/sandero_stepway/2015/143355/
https://cars.mail.ru/reviews/renault/sandero/2015/147850/
https://cars.mail.ru/reviews/renault/sandero/2012/147529/
https://cars.mail.ru/reviews/renault/duster/2014/147433/
https://cars.mail.ru/reviews/renault/logan/2011/146991/
https://cars.mail.ru/reviews/renault/duster/2015/146645/
I need to open every link on this page, then go to the next page and open every link there as well. How can I do this quickly? If I use
import urllib2
from bs4 import BeautifulSoup

models = ['11', '12', '14', '15', '16', '17', '18', '19', '20', '21', '25', '30', '4', '5', '6', '9',
          'avantime', 'clio', 'clio_rs', 'duster', 'espace', 'estafette', 'express', 'fluence',
          'fuego', 'grand_espace', 'grand_scenic', 'kangoo', 'kaptur', 'koleos', 'laguna', 'latitude',
          'logan', 'mascott', 'master', 'megane', 'megane_rs', 'modus', 'safrane', 'sandero', 'sandero_stepway',
          'scenic', 'symbol', 'trafic', 'twingo', 'vel_satis']
years = ['2010', '2011', '2012', '2013', '2014', '2015', '2016']
pattern = 'https://cars.mail.ru/reviews/renault/'
for model in models:
    for year in years:
        for i in range(143350, 143360):
            # Guess a review URL of the form <pattern><model>/<year>/<id>
            res = pattern + model + '/' + year + '/' + str(i)
            try:
                page = urllib2.urlopen(res).read()
                print page
                soup = BeautifulSoup(page, 'html.parser')
            except urllib2.URLError:
                # Most guessed IDs do not exist (HTTPError is a URLError); skip them
                continue
it takes far too long.
That is just a test example. In the real data I have 'for i in range(0, 150000):'
That is even worse: 48,300,000 requests. No change to the Python code can fix that; you will have to reduce the number of requests. – 2016-11-28 13:13:19
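
To make the comment above concrete, here is a minimal Python 3 sketch (standard library only) of one way to cut the request count: read the review links out of the listing page's HTML instead of guessing IDs. It assumes the listing page carries ordinary <a href> links whose path looks like /reviews/renault/<model>/<year>/<id>/, as in the URLs quoted in the question; that has not been checked against the live site. The names ReviewLinkParser, extract_review_links and brute_force_request_count are made up for illustration, and how the site paginates is not known from this thread, so following the next page is left out.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Listing page from the question; used to resolve relative links.
BASE_URL = 'https://cars.mail.ru/reviews/renault/?year=2010-2016'

# Assumed shape of a review path: /reviews/renault/<model>/<year>/<id>/
REVIEW_PATH = re.compile(r'^/reviews/renault/[\w-]+/\d{4}/\d+/?$')


class ReviewLinkParser(HTMLParser):
    """Collects hrefs of <a> tags whose path looks like a review page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        if not href:
            return
        absolute = urljoin(BASE_URL, href)
        # Keep only review-shaped paths, and drop duplicates.
        if REVIEW_PATH.match(urlsplit(absolute).path) and absolute not in self.links:
            self.links.append(absolute)


def extract_review_links(html):
    """Return the review URLs found in one listing page, in document order."""
    parser = ReviewLinkParser()
    parser.feed(html)
    return parser.links


def brute_force_request_count(n_models, n_years, n_ids):
    """Number of URLs the nested-loop approach has to try."""
    return n_models * n_years * n_ids


if __name__ == '__main__':
    # 46 models, 7 years and 150000 candidate IDs, as in the question.
    print(brute_force_request_count(46, 7, 150000))
    # Hypothetical fragment standing in for the downloaded listing page.
    sample = ('<a href="/reviews/renault/duster/2014/147433/">Duster</a>'
              '<a href="/reviews/renault/?year=2010-2016">All reviews</a>')
    print(extract_review_links(sample))
```

With this approach the number of requests is one per listing page plus one per review that actually exists, rather than one per guessed combination of model, year and ID.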