2016-06-21 40 views

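So this code gets me all the match results: the team-vs-team pairing and the match score, for a team page like http://www.gosugamers.net/counterstrike/teams/7395-mousesports-cs/matches. However, the code only gets the results from the first page, and I'm trying to get all the results from every available page. The problem is that some teams don't have a next-page button, so when I tried to implement that, the program crashed. How can I write the code to fetch the next page and keep collecting results, and simply carry on if a team's matches link has no next page? How do I get next-page results only for the pages that have one?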

import requests
from bs4 import BeautifulSoup


def all_match_outcomes():
    # match_history_url() and rest_server() are helpers defined elsewhere (not shown)
    for match_outcomes in match_history_url():
        rest_server(True)
        page = requests.get(match_outcomes).content
        soup = BeautifulSoup(page, 'html.parser')

        team_name_element = soup.select_one('div.teamNameHolder')
        team_name = team_name_element.find('h1').text.replace('- Team Overview', '')

        for match_outcome in soup.select('table.simple.gamelist.profilelist tr'):
            opp1 = match_outcome.find('span', {'class': 'opp1'}).text
            opp2 = match_outcome.find('span', {'class': 'opp2'}).text

            opp1_score = match_outcome.find('span', {'class': 'hscore'}).text
            opp2_score = match_outcome.find('span', {'class': 'ascore'}).text

            if match_outcome(True):  # If teams have past matches
                print(team_name, '%s %s:%s %s' % (opp1, opp1_score, opp2_score, opp2))
What's an example of one without a next button? Are you talking about the next button at the end of the page, or something else?

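So at the bottom of the linked page it shows the page numbers, followed by Next and Last. Some teams don't have this at all, because they haven't played many matches or for whatever other reason. So if I add code for the next page, it crashes, saying the page doesn't contain the tag or whatever I'm using to find the next page. – DJRodrigue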

Answer


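After the for loop has pulled the match scores out of the table, you can grab the pagination link.

With this code you get the next page by finding the currently selected page in the paginator and taking the link that follows it. If there is nothing after the currently selected page, or the page has no paginator at all, it prints "no page found".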

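paginate = soup.find('div', {'class': 'paginator'})

# Teams with few matches have no paginator at all, so guard against None
page = paginate.find('a', {'class': 'selected'}) if paginate else None

next_page = page.find_next_sibling() if page else None
if next_page:
    print(next_page.get('href'))
else:
    print("no page found")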
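One caveat when following that link: the `href` may be site-relative rather than a full URL (an assumption here; check the page source), in which case it cannot be passed to `requests.get` directly. A minimal sketch using the standard library's `urljoin`, where `resolve_next_url` is a hypothetical helper name and the `?page=2` query string is made up for illustration:

```python
from urllib.parse import urljoin


def resolve_next_url(current_url, href):
    """Return an absolute URL for the next page, or None if there is no link."""
    if not href:
        return None
    # urljoin leaves absolute hrefs untouched and resolves relative ones
    # against the page they were found on.
    return urljoin(current_url, href)


current = "http://www.gosugamers.net/counterstrike/teams/7395-mousesports-cs/matches"
# Hypothetical relative href, as a paginator link might contain
print(resolve_next_url(current, "/counterstrike/teams/7395-mousesports-cs/matches?page=2"))
# No next link: the helper passes None through instead of raising
print(resolve_next_url(current, None))
```

Returning `None` for a missing link lets the caller use a plain `if next_url:` check to decide whether to keep going.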

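Edit

In response to the comment: this is how I was thinking the code would be used. The next URL then gets added to whatever you loop over, and you can keep looping.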

def all_match_outcomes():
    for match_outcomes in match_history_url():
        rest_server(True)
        page = requests.get(match_outcomes).content
        soup = BeautifulSoup(page, 'html.parser')

        team_name_element = soup.select_one('div.teamNameHolder')
        team_name = team_name_element.find('h1').text.replace('- Team Overview', '')

        for match_outcome in soup.select('table.simple.gamelist.profilelist tr'):
            opp1 = match_outcome.find('span', {'class': 'opp1'}).text
            opp2 = match_outcome.find('span', {'class': 'opp2'}).text

            opp1_score = match_outcome.find('span', {'class': 'hscore'}).text
            opp2_score = match_outcome.find('span', {'class': 'ascore'}).text

            if match_outcome(True):  # If teams have past matches
                print(team_name, '%s %s:%s %s' % (opp1, opp1_score, opp2_score, opp2))

        # get the next page if there is one here
        paginate = soup.find('div', {'class': 'paginator'})
        # paginate is None for teams without a paginator, so check it first
        selected = paginate.find('a', {'class': 'selected'}) if paginate else None
        if selected:
            next_page = selected.find_next_sibling()
            if next_page:
                print(next_page.get('href'))
                # just append this to a list or add it to whatever you use to
                # track the next url to crawl
                next_url = next_page.get('href')
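To actually keep going until the pages run out, one option is a small loop that follows the next link until there is none. The sketch below is only an illustration under stated assumptions: `crawl_pages`, `fetch` and `find_next_url` are hypothetical names, and a dictionary stands in for the website so the example runs without network access.

```python
def crawl_pages(start_url, fetch, find_next_url):
    """Yield (url, page) for start_url and every page reachable via next links.

    fetch(url) returns the page content; find_next_url(url, page) returns the
    next page's URL or None. Both are supplied by the caller (hypothetical
    hooks, not part of requests or BeautifulSoup).
    """
    seen = set()
    url = start_url
    while url and url not in seen:
        seen.add(url)  # guards against a paginator that links back to itself
        page = fetch(url)
        yield url, page
        url = find_next_url(url, page)  # None ends the loop: no next page


# Stand-in data so the sketch runs offline; the keys are made-up URLs.
fake_site = {
    "team-a?page=1": {"rows": ["a1", "a2"], "next": "team-a?page=2"},
    "team-a?page=2": {"rows": ["a3"], "next": None},
    "team-b": {"rows": ["b1"], "next": None},  # a team with no paginator
}

for start in ("team-a?page=1", "team-b"):
    for url, page in crawl_pages(start, fake_site.get, lambda u, p: p["next"]):
        print(url, page["rows"])
```

In the real scraper, `fetch` would wrap `requests.get(url).content` and `find_next_url` would parse the page and do the paginator lookup, returning `None` when the paginator or the following link is missing. A team with a single page then simply yields once and the outer loop moves on to the next team.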
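OK, so do I put the rest of my functions inside the if statement? If a page has no next page, how do I run the rest of my code so that it doesn't crash and still gets the information I need? – DJRodrigue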

I was thinking you could add it to the code you posted here, but I'm not sure what your 'match_history_url' function looks like? – bmcculley

It just loops over the links for each team; the links it holds are like the one I posted as an example. It has all of the teams' match URL pages. – DJRodrigue
