How do I get next-page results only for pages that have a next page?

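This code gets me all the match results: team vs. team and the score of each match, for a team page such as http://www.gosugamers.net/counterstrike/teams/7395-mousesports-cs/matches. However, it only gets the results from the first page, and I am trying to get all the results from every available page. The problem is that some teams have no next-page button, so when I tried to implement that, the program crashed. How can I write the code so that it fetches the next page and keeps collecting results, and simply carries on when a team's matches link has no next page?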

def all_match_outcomes():
    for match_outcomes in match_history_url():
        rest_server(True)
        page = requests.get(match_outcomes).content
        soup = BeautifulSoup(page, 'html.parser')

        team_name_element = soup.select_one('div.teamNameHolder')
        team_name = team_name_element.find('h1').text.replace('- Team Overview', '')

        for match_outcome in soup.select('table.simple.gamelist.profilelist tr'):
            opp1 = match_outcome.find('span', {'class': 'opp1'}).text
            opp2 = match_outcome.find('span', {'class': 'opp2'}).text

            opp1_score = match_outcome.find('span', {'class': 'hscore'}).text
            opp2_score = match_outcome.find('span', {'class': 'ascore'}).text

            if match_outcome(True):  # If teams have past matches
                print(team_name, '%s %s:%s %s' % (opp1, opp1_score, opp2_score, opp2))

Source

2016-06-21 DJRodrigue

What is an example of a page without a next button? Are you talking about the next button at the end of the page, or something else?

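At the bottom of the linked page there is a list of page numbers, followed by next-page and last-page links. Some teams don't have this at all, because they haven't played many matches or for whatever other reason. So if I add next-page handling to my code, it crashes and says the page does not contain the tag (or whatever I use to look up the next page). – DJRodrigue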

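After the for loop has pulled the game scores out of the table, you can grab the pagination links.

With this code you can get the next page by finding the currently selected page. If there is nothing after the currently selected page, or the page has no paginator at all, it prints "no page found".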

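paginate = soup.find('div', {'class': 'paginator'})

# teams with a single page of matches have no paginator, so guard against None
page = paginate.find('a', {'class': 'selected'}) if paginate is not None else None

next_page = page.find_next_sibling() if page is not None else None
if next_page:
    print(next_page.get('href'))
else:
    print("no page found")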
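To make the lookup easier to try out on its own, here is a minimal sketch of the same idea wrapped in a function. The name `find_next_href` and the sample markup are made up for illustration; it assumes the paginator is a `div.paginator` whose current page is an `a.selected` element, as in the snippet above, and it returns None whenever any piece is missing instead of raising.

```python
from bs4 import BeautifulSoup


def find_next_href(html):
    """Return the href of the link after the selected page, or None.

    Hypothetical helper: assumes a <div class="paginator"> that holds
    <a> elements, with the current page marked class="selected".
    """
    soup = BeautifulSoup(html, 'html.parser')

    paginator = soup.find('div', {'class': 'paginator'})
    if paginator is None:
        # Team with only one page of matches: no paginator at all.
        return None

    current = paginator.find('a', {'class': 'selected'})
    if current is None:
        return None

    # Next <a> element at the same level, skipping text nodes.
    next_link = current.find_next_sibling('a')
    if next_link is None:
        # The selected page is the last one.
        return None
    return next_link.get('href')


# Made-up sample markup, only for trying the function out.
SAMPLE_WITH_NEXT = (
    '<div class="paginator">'
    '<a class="selected" href="?page=1">1</a>'
    '<a href="?page=2">2</a>'
    '</div>'
)
SAMPLE_LAST_PAGE = (
    '<div class="paginator">'
    '<a href="?page=1">1</a>'
    '<a class="selected" href="?page=2">2</a>'
    '</div>'
)
SAMPLE_NO_PAGINATOR = '<table class="simple gamelist profilelist"></table>'
```

Checking for None at each step is what keeps the scraper from crashing on teams that have no pagination block.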

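Edit

In response to the comment: this is how I was thinking the code could be used. The next URL then gets recorded, and you can continue the loop.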

import requests
from bs4 import BeautifulSoup

def all_match_outcomes():
    for match_outcomes in match_history_url():
        rest_server(True)
        page = requests.get(match_outcomes).content
        soup = BeautifulSoup(page, 'html.parser')

        team_name_element = soup.select_one('div.teamNameHolder')
        team_name = team_name_element.find('h1').text.replace('- Team Overview', '')

        for match_outcome in soup.select('table.simple.gamelist.profilelist tr'):
            opp1 = match_outcome.find('span', {'class': 'opp1'}).text
            opp2 = match_outcome.find('span', {'class': 'opp2'}).text

            opp1_score = match_outcome.find('span', {'class': 'hscore'}).text
            opp2_score = match_outcome.find('span', {'class': 'ascore'}).text

            if match_outcome(True):  # If teams have past matches
                print(team_name, '%s %s:%s %s' % (opp1, opp1_score, opp2_score, opp2))

        # get the next page if there is one here
        paginate = soup.find('div', {'class': 'paginator'})
        # some teams have no paginator at all, so check for None first
        page = paginate.find('a', {'class': 'selected'}) if paginate is not None else None
        if page:
            next_page = page.find_next_sibling()
            if next_page:
                print(next_page.get('href'))
                # just append this to a list or add it to whatever you use to
                # track the next url to crawl
                next_url = next_page.get('href')
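One way to "track the next url to crawl" is a small loop that keeps following the next link until there is none. This is only a sketch under assumptions: `iter_pages`, `fetch`, and `next_href_of` are hypothetical names, the two callables are supplied by the caller, and the `max_pages` cap is an arbitrary safety limit rather than anything the site requires.

```python
from urllib.parse import urljoin


def iter_pages(start_url, fetch, next_href_of, max_pages=50):
    """Yield (url, html) for start_url and each page reached via next links.

    fetch(url) -> html and next_href_of(html) -> href or None are
    caller-supplied (hypothetical) callables. The loop ends when a page
    has no next link, when a URL repeats, or after max_pages pages.
    """
    seen = set()
    url = start_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = fetch(url)
        yield url, html

        href = next_href_of(html)
        # hrefs are often relative, so resolve them against the current URL
        url = urljoin(url, href) if href else None


# Tiny in-memory "site" (made up) so the loop can be tried without a network.
FAKE_SITE = {
    'http://example.com/matches': 'next:/matches?page=2',
    'http://example.com/matches?page=2': 'next:/matches?page=3',
    'http://example.com/matches?page=3': 'no more pages',
    'http://example.com/single': 'no more pages',
    'http://example.com/loop': 'next:/loop',
}


def fake_fetch(url):
    return FAKE_SITE[url]


def fake_next_href(html):
    # Stand-in for the paginator lookup: "next:<href>" marks a next link.
    return html[len('next:'):] if html.startswith('next:') else None


visited = [u for u, _ in iter_pages('http://example.com/matches',
                                    fake_fetch, fake_next_href)]
```

In the real scraper, `fetch` could be something like `lambda u: requests.get(u).content`, and `next_href_of` would wrap the paginator lookup shown above; the per-page parsing of the match table then goes inside the loop body that consumes `iter_pages`. A team with a single page simply yields once and stops.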

Source

2016-06-21 22:38:20 bmcculley

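OK, so do I put my other code inside the if statement? If a page has no next page, how do I run the rest of my code so that it doesn't crash and still gets the information I need? – DJRodrigue

I was thinking you could add it to the code you posted here, but I'm not sure what your 'match_history_url' function looks like? – bmcculley

It just loops over the links for each team; the links it contains look like the one I posted as an example. It holds the match-page URLs for all the teams. – DJRodrigue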
