2016-08-21 121 views
1

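I have been trying to scrape www.ratemyprofessors.com, and I need to click the "Load More" button to scrape all the data I need. However, the dryscrape code I am using to click the "Load More" button does not work: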

loadButton = session.at_xpath(path)
loadButton.click()

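The path is definitely correct, because loadButton.text() equals "Load More", but it gives me an error that basically says "cannot click because of an overlapping element".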

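Does anyone know how to fix this, or know a workaround? From what I have been reading, we could also emulate the request that the JavaScript makes, as seen in the network tab. But I am having a hard time finding it, because the onclick handler does not call a function directly, but instead has: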

onclick="javascript:mtvn.btg.Controller.sendLinkEvent({ linkName:\'PROFMIDPANE:LoadMore\', linkType:\'o\' }); 

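By the way, I am using Python, and the "Load More" button is located on the left, below the list of professors, after you perform a search for a school.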

I have been reading some related posts but have not come across anything helpful.

Any help would be appreciated!

my network/params tab

Answer

0

You can do all of this using requests and bs4. This is the request that is made when you click the Load More button:

(screenshot of the request in the browser's developer tools)

So once you have a page, you can get all the teachers and ratings in JSON format. The following uses the URL http://www.ratemyprofessors.com/ShowRatings.jsp?tid=881718:

from pprint import pprint as pp

import requests
from bs4 import BeautifulSoup

params = {
    "solrformat": "true",
    "rows": "1000",  # set this to a high number to always get all rows
    "q": "",  # filled in below once we know the school id
    "defType": "edismax",
    "qf": "teacherfullname_t^1000 autosuggest",
    "bf": "pow(total_number_of_ratings_i,2.1)",
    "sort": "total_number_of_ratings_i desc",
    "siteName": "rmp",
    "fl": "pk_id teacherfirstname_t teacherlastname_t total_number_of_ratings_i averageratingscore_rf schoolid_s",
}

url = "http://search.mtvnservices.com/typeahead/suggest/"
query = '*:* AND schoolid_s:{id} AND teacherdepartment_s:"{subject}"'

with requests.Session() as s:
    s.headers.update({"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"})
    page = s.get("http://www.ratemyprofessors.com/ShowRatings.jsp?tid=881718")
    # Name the parser explicitly so bs4 does not have to guess one.
    soup = BeautifulSoup(page.content, "html.parser")
    # Pass the school id, which we can parse from the page.
    school_id = soup.select_one("[data-schoolid]")["data-schoolid"]
    params["q"] = query.format(id=school_id, subject="History")
    res = s.get(url, params=params)
    json_data = res.json()

pp(json_data["response"]["docs"])

Which gives us:

[{u'averageratingscore_rf': 4.6, 
    u'pk_id': 1347824, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'JP', 
    u'teacherlastname_t': u'Godwin', 
    u'total_number_of_ratings_i': 88}, 
{u'averageratingscore_rf': 3.38, 
    u'pk_id': 692471, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'James', 
    u'teacherlastname_t': u'Page', 
    u'total_number_of_ratings_i': 49}, 
{u'averageratingscore_rf': 3.5, 
    u'pk_id': 555487, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Kevin', 
    u'teacherlastname_t': u'Davis', 
    u'total_number_of_ratings_i': 44}, 
{u'averageratingscore_rf': 4.4, 
    u'pk_id': 1289399, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Jane', 
    u'teacherlastname_t': u'England', 
    u'total_number_of_ratings_i': 33}, 
{u'averageratingscore_rf': 3.46, 
    u'pk_id': 1230841, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Simone', 
    u'teacherlastname_t': u'De Santiago Ramos', 
    u'total_number_of_ratings_i': 24}, 
{u'averageratingscore_rf': 3.15, 
    u'pk_id': 701257, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Jack', 
    u'teacherlastname_t': u'Pyle', 
    u'total_number_of_ratings_i': 23}, 
{u'averageratingscore_rf': 4.13, 
    u'pk_id': 1466455, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Chris', 
    u'teacherlastname_t': u'Politz', 
    u'total_number_of_ratings_i': 20}, 
{u'averageratingscore_rf': 4.67, 
    u'pk_id': 1218949, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'James', 
    u'teacherlastname_t': u'Hathcock', 
    u'total_number_of_ratings_i': 18}, 
{u'averageratingscore_rf': 3.93, 
    u'pk_id': 1648329, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Joshua', 
    u'teacherlastname_t': u'Montandon', 
    u'total_number_of_ratings_i': 15}, 
{u'averageratingscore_rf': 2.79, 
    u'pk_id': 1543864, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'M', 
    u'teacherlastname_t': u'Antle', 
    u'total_number_of_ratings_i': 14}, 
{u'averageratingscore_rf': 3.83, 
    u'pk_id': 1096585, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Scotty', 
    u'teacherlastname_t': u'Edler', 
    u'total_number_of_ratings_i': 12}, 
{u'averageratingscore_rf': 3.92, 
    u'pk_id': 1260089, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'James', 
    u'teacherlastname_t': u'Reynolds', 
    u'total_number_of_ratings_i': 12}, 
{u'averageratingscore_rf': 4.42, 
    u'pk_id': 1418409, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Steve', 
    u'teacherlastname_t': u'Wolfrum', 
    u'total_number_of_ratings_i': 12}, 
{u'averageratingscore_rf': 4.45, 
    u'pk_id': 899881, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Karen', 
    u'teacherlastname_t': u'Stewart', 
    u'total_number_of_ratings_i': 11}, 
{u'averageratingscore_rf': 3.2, 
    u'pk_id': 592508, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Crystal', 
    u'teacherlastname_t': u'Wright', 
    u'total_number_of_ratings_i': 10}, 
{u'averageratingscore_rf': 4.5, 
    u'pk_id': 891457, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Lisa', 
    u'teacherlastname_t': u'Morales', 
    u'total_number_of_ratings_i': 10}, 
{u'averageratingscore_rf': 2.9, 
    u'pk_id': 1329058, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Mark', 
    u'teacherlastname_t': u'Thompson', 
    u'total_number_of_ratings_i': 10}, 
{u'averageratingscore_rf': 4.0, 
    u'pk_id': 1339373, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Charles', 
    u'teacherlastname_t': u'Williams', 
    u'total_number_of_ratings_i': 10}, 
{u'averageratingscore_rf': 4.5, 
    u'pk_id': 1587880, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Noelle', 
    u'teacherlastname_t': u'Depperschmidt', 
    u'total_number_of_ratings_i': 10}, 
{u'averageratingscore_rf': 4.39, 
    u'pk_id': 1426470, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Adrien', 
    u'teacherlastname_t': u'Ivan', 
    u'total_number_of_ratings_i': 9}, 
{u'averageratingscore_rf': 5.0, 
    u'pk_id': 1871677, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Kevin', 
    u'teacherlastname_t': u'Eades', 
    u'total_number_of_ratings_i': 9}, 
{u'averageratingscore_rf': 4.81, 
    u'pk_id': 393151, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Sharon', 
    u'teacherlastname_t': u'Romero', 
    u'total_number_of_ratings_i': 8}, 
{u'averageratingscore_rf': 3.69, 
    u'pk_id': 1377603, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Joseph', 
    u'teacherlastname_t': u'Ialenti', 
    u'total_number_of_ratings_i': 8}, 
{u'averageratingscore_rf': 3.43, 
    u'pk_id': 1752608, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'James', 
    u'teacherlastname_t': u'Jones', 
    u'total_number_of_ratings_i': 7}, 
{u'averageratingscore_rf': 3.43, 
    u'pk_id': 1782369, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Sara', 
    u'teacherlastname_t': u'Ruppel', 
    u'total_number_of_ratings_i': 7}, 
{u'averageratingscore_rf': 3.33, 
    u'pk_id': 1096000, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Scott', 
    u'teacherlastname_t': u'Harp', 
    u'total_number_of_ratings_i': 6}, 
{u'averageratingscore_rf': 2.17, 
    u'pk_id': 2061535, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'David', 
    u'teacherlastname_t': u'Powell', 
    u'total_number_of_ratings_i': 6}, 
{u'averageratingscore_rf': 4.1, 
    u'pk_id': 556560, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'', 
    u'teacherlastname_t': u'English', 
    u'total_number_of_ratings_i': 5}, 
{u'averageratingscore_rf': 3.9, 
    u'pk_id': 2032232, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Robin', 
    u'teacherlastname_t': u'Jett', 
    u'total_number_of_ratings_i': 5}, 
{u'averageratingscore_rf': 3.3, 
    u'pk_id': 1242893, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Dennis', 
    u'teacherlastname_t': u'Spillman', 
    u'total_number_of_ratings_i': 5}, 
{u'averageratingscore_rf': 5.0, 
    u'pk_id': 1209837, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Jared', 
    u'teacherlastname_t': u'Sutton', 
    u'total_number_of_ratings_i': 4}, 
{u'averageratingscore_rf': 3.38, 
    u'pk_id': 1587886, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Arianna', 
    u'teacherlastname_t': u'Warren', 
    u'total_number_of_ratings_i': 4}, 
{u'averageratingscore_rf': 4.4, 
    u'pk_id': 1643053, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Kimberly', 
    u'teacherlastname_t': u'Lacoco', 
    u'total_number_of_ratings_i': 4}, 
{u'averageratingscore_rf': 2.5, 
    u'pk_id': 1857299, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Kevin', 
    u'teacherlastname_t': u'Pyle', 
    u'total_number_of_ratings_i': 4}, 
{u'averageratingscore_rf': 2.33, 
    u'pk_id': 892723, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Keith', 
    u'teacherlastname_t': u'Mitchener', 
    u'total_number_of_ratings_i': 3}, 
{u'averageratingscore_rf': 3.5, 
    u'pk_id': 1448008, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Sally', 
    u'teacherlastname_t': u'Stratso', 
    u'total_number_of_ratings_i': 3}, 
{u'averageratingscore_rf': 3.25, 
    u'pk_id': 680381, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Todd', 
    u'teacherlastname_t': u'Venable', 
    u'total_number_of_ratings_i': 2}, 
{u'averageratingscore_rf': 5.0, 
    u'pk_id': 1256069, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Amanda', 
    u'teacherlastname_t': u'Campbell-Wyatt', 
    u'total_number_of_ratings_i': 2}, 
{u'averageratingscore_rf': 5.0, 
    u'pk_id': 2142326, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Jeremy', 
    u'teacherlastname_t': u'Godwin', 
    u'total_number_of_ratings_i': 2}, 
{u'averageratingscore_rf': 1.5, 
    u'pk_id': 697421, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Woody', 
    u'teacherlastname_t': u'Paige', 
    u'total_number_of_ratings_i': 1}, 
{u'averageratingscore_rf': 1.0, 
    u'pk_id': 881718, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'M', 
    u'teacherlastname_t': u'Sullivan', 
    u'total_number_of_ratings_i': 1}, 
{u'averageratingscore_rf': 1.5, 
    u'pk_id': 1607181, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Nancy', 
    u'teacherlastname_t': u'Coffelt', 
    u'total_number_of_ratings_i': 1}, 
{u'averageratingscore_rf': 5.0, 
    u'pk_id': 1710114, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Jason', 
    u'teacherlastname_t': u'Scheller', 
    u'total_number_of_ratings_i': 1}, 
{u'averageratingscore_rf': 4.0, 
    u'pk_id': 2164391, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'James', 
    u'teacherlastname_t': u'Paige', 
    u'total_number_of_ratings_i': 1}, 
{u'pk_id': 2083511, 
    u'schoolid_s': u'4873', 
    u'teacherfirstname_t': u'Stephen ', 
    u'teacherlastname_t': u'Wolfrum', 
    u'total_number_of_ratings_i': 0}] 
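
Note that the last entry above has no averageratingscore_rf key (it has zero ratings), and some first names are empty or carry trailing whitespace, so code that reads these docs should probably not assume every field is present and clean. A minimal Python 3 sketch of one way to flatten the docs into rows follows; the summarize helper is my own illustration, not part of the site or of requests, and the sample docs are copied from the output above:

```python
def summarize(docs):
    """Return (full name, number of ratings, average rating or None) per doc."""
    rows = []
    for doc in docs:
        first = doc.get("teacherfirstname_t", "").strip()
        last = doc.get("teacherlastname_t", "").strip()
        # Skip empty name parts so we do not produce leading spaces.
        name = " ".join(part for part in (first, last) if part)
        rows.append((
            name,
            doc.get("total_number_of_ratings_i", 0),
            doc.get("averageratingscore_rf"),  # None when the key is missing
        ))
    return rows


# Three docs taken from the output above.
sample_docs = [
    {"averageratingscore_rf": 4.6, "pk_id": 1347824, "schoolid_s": "4873",
     "teacherfirstname_t": "JP", "teacherlastname_t": "Godwin",
     "total_number_of_ratings_i": 88},
    {"averageratingscore_rf": 4.1, "pk_id": 556560, "schoolid_s": "4873",
     "teacherfirstname_t": "", "teacherlastname_t": "English",
     "total_number_of_ratings_i": 5},
    {"pk_id": 2083511, "schoolid_s": "4873",
     "teacherfirstname_t": "Stephen ", "teacherlastname_t": "Wolfrum",
     "total_number_of_ratings_i": 0},
]

rows = summarize(sample_docs)
for name, count, average in rows:
    print(name, count, "n/a" if average is None else average)
```

With the real response you would call summarize(json_data["response"]["docs"]) instead of using the sample list.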

All you need to do is pass the school id and the subject in the query string, and you can get whatever you like.
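
As a rough sketch of that idea, the query construction can be wrapped in a small helper so the same base parameters are reused for different schools and subjects. The names build_params, BASE_PARAMS and QUERY are hypothetical ones I picked for illustration; the parameter values are the ones from the code above, and the sketch only builds the dictionary without sending any request:

```python
# Base parameters copied from the request above; "q" and "rows" are added per call.
BASE_PARAMS = {
    "solrformat": "true",
    "defType": "edismax",
    "qf": "teacherfullname_t^1000 autosuggest",
    "bf": "pow(total_number_of_ratings_i,2.1)",
    "sort": "total_number_of_ratings_i desc",
    "siteName": "rmp",
    "fl": "pk_id teacherfirstname_t teacherlastname_t "
          "total_number_of_ratings_i averageratingscore_rf schoolid_s",
}

QUERY = '*:* AND schoolid_s:{id} AND teacherdepartment_s:"{subject}"'


def build_params(school_id, subject, rows=1000):
    """Return a fresh params dict for one school and one subject."""
    params = dict(BASE_PARAMS)  # copy, so BASE_PARAMS is never mutated
    params["rows"] = str(rows)
    params["q"] = QUERY.format(id=school_id, subject=subject)
    return params


music = build_params("4873", "Music")
history = build_params("4873", "History")
print(music["q"])
print(history["q"])
```

Each resulting dict could then be passed as s.get(url, params=...) in the session shown earlier.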

+0

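Thanks for the detailed answer! Could you explain in more detail how you clicked the "Load More" button to get the request image you posted? I am guessing 'res = s.get(url, params=params)' is responsible for emulating the button click, so could you explain how you got the params based on the request image you posted?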

+0

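@b.g, yes, go to the developer tools, look under the JS tab, then click Load More; each time you do, you will see the request shown above being fired. If you pass 'subject="Music"' etc., you can see other results.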

+0

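I was able to find the button's request in my network tab, but I do not know how to use it. I have different parameters than the ones you used in your code (see my edit for my network/params tab). So I am just wondering how you got your parameters and used them to load more professors.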
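
One possible way to go from a request seen in the network tab to a params dict is to copy the full request URL from the developer tools and split it with the standard library. The sketch below is Python 3 and only illustrates the idea: the params_from_url helper is hypothetical, and the copied URL is a made-up example (including rows=20), not the actual request from either screenshot.

```python
from urllib.parse import parse_qsl, urlsplit


def params_from_url(request_url):
    """Split a copied request URL into (endpoint, params dict)."""
    parts = urlsplit(request_url)
    endpoint = "{}://{}{}".format(parts.scheme, parts.netloc, parts.path)
    # keep_blank_values keeps parameters that were sent with an empty value.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    return endpoint, params


# Hypothetical URL, shaped like the request in the answer above.
copied = (
    "http://search.mtvnservices.com/typeahead/suggest/"
    "?solrformat=true&rows=20&q=*%3A*+AND+schoolid_s%3A4873&siteName=rmp"
)

endpoint, params = params_from_url(copied)
print(endpoint)
for key, value in sorted(params.items()):
    print(key, "=", value)
```

The decoded dict can then be edited (for example, raising rows) and passed to requests as params, which re-encodes the values when the request is sent.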