Python3, Selenium, BeautifulSoup4 stack does not load more information from the website

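I want to get some information from a German website. Because this site loads more content when you click the down arrow at the bottom of the page, I assumed I should use Selenium to handle the loading. After that, the script should extract the required information with BeautifulSoup and write it to a CSV file.

Unfortunately, my script does not seem to click the required button, so I only receive the first part of the information.

My code is as follows: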

import csv
from bs4 import BeautifulSoup
from selenium import webdriver


with open('shoop.csv', 'w', encoding='utf-8', newline='') as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=';')
    csv_writer.writerow(['Headline', 'Cashback'])
    driver = webdriver.Firefox()
    driver.get('https://www.shoop.de/stoebern/haus_technik/3/popular/')
    # Click the down arrow that should load more merchants
    driver.find_element_by_class_name('icon-down_open_big').click()
    r = driver.page_source

    driver.quit()
    # Name the parser explicitly so bs4 does not have to guess one
    soup = BeautifulSoup(r, 'html.parser')
    for advertiser in soup.find_all('div', {'class': 'merchant_item'}):
        headline = advertiser.find('h3', {'class': 'merchant_name'}).text
        cashback = advertiser.find('span', {'class': 'rates_number'}).text
        liste = [headline, cashback]
        print(liste)
        csv_writer.writerow(liste)

Source

2017-02-19 Franz Luxemburger

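Thanks Leva, you saved my day. With a little extra sleep time added to my code, the export to CSV works as expected.

There seems to be a lot of JavaScript on this site. Perhaps the arrow only appears once the user has scrolled down far enough. When I added scrolling to your code, the arrow click succeeded.

Scrolling a web page in Selenium is done by executing a script: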

# Whenever you want to press the arrow, scroll down with this line 
driver.execute_script('window.scrollTo(0, document.body.scrollHeight);') 
driver.find_element_by_class_name('icon-down_open_big').click()
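
Combining the scroll with the short sleep mentioned in the comment, a repeated "scroll, wait, click" loop could be sketched roughly as below. The helper name `load_more` and its parameters `max_clicks` and `pause` are made up for illustration and are not part of Selenium. The sketch also assumes that the arrow is no longer displayed once everything has been loaded, which has not been verified for this site, so `max_clicks` acts as a safety limit. It uses `find_elements(by, value)` instead of `find_element_by_class_name`, since the latter was removed in newer Selenium releases.

```python
import time

SCROLL_TO_BOTTOM = 'window.scrollTo(0, document.body.scrollHeight);'


def load_more(driver, locator, max_clicks=20, pause=2.0):
    """Scroll down and click the "load more" element until it is gone.

    driver     -- a Selenium WebDriver instance
    locator    -- a (by, value) tuple, e.g. (By.CLASS_NAME, 'icon-down_open_big')
    max_clicks -- hypothetical safety limit so the loop always terminates
    pause      -- seconds to wait after scrolling so the page's JavaScript can run
    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:
        # Scroll to the bottom so the arrow is rendered and reachable
        driver.execute_script(SCROLL_TO_BOTTOM)
        time.sleep(pause)
        # find_elements returns an empty list instead of raising when nothing matches
        visible = [el for el in driver.find_elements(*locator) if el.is_displayed()]
        if not visible:
            break
        visible[0].click()
        clicks += 1
    return clicks


# Usage (assumes Selenium 4 and Firefox with geckodriver are installed):
#   from selenium import webdriver
#   from selenium.webdriver.common.by import By
#   driver = webdriver.Firefox()
#   driver.get('https://www.shoop.de/stoebern/haus_technik/3/popular/')
#   load_more(driver, (By.CLASS_NAME, 'icon-down_open_big'))
#   html = driver.page_source   # hand this to BeautifulSoup as before
#   driver.quit()
```

A fixed `time.sleep` is the simplest option and matches what worked for the asker. An explicit wait such as Selenium's `WebDriverWait` would likely be more robust on slow connections.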

Source

2017-02-19 06:03:43 Leva7
