2016-07-08

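For the past 3 hours I have been trying to scrape this website and get the rank, name, wins, and losses of each team, but I keep running into: 'bytes' object has no attribute 'find_all'.

When I run this code: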

import requests 
from bs4 import BeautifulSoup 

halo = requests.get("https://www.halowaypoint.com/en-us/esports/standings") 

page = BeautifulSoup(halo.content, "html.parser") 

final = page.encode('utf-8') 

print(final.find_all("div")) 

I keep getting this error.

If anyone could help me out, it would be greatly appreciated!

Thanks!

Answers

1

You are calling the method on the wrong variable. Use the BeautifulSoup object (page), not the byte string final:

print(page.find_all("div")) 
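
To see why the original call fails, here is a minimal offline sketch. The inline HTML string is a made-up stand-in for the downloaded page (an assumption, not the real markup); the point is only that encode() serializes the parsed tree into a plain bytes object, which has none of the search methods.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for the downloaded page.
html = "<div class='a'>one</div><div class='b'>two</div>"

page = BeautifulSoup(html, "html.parser")
final = page.encode("utf-8")  # serializes the tree into a bytes object

print(type(page).__name__)         # BeautifulSoup
print(type(final).__name__)        # bytes
print(hasattr(final, "find_all"))  # False: bytes has no search methods

# Searching has to happen on the parsed object, not on its serialized form.
divs = page.find_all("div")
print([d.text for d in divs])
```

If the encoded bytes are needed at all (for example, to write the page to a file), it is probably simplest to do the searching first and encode last.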

Getting the table data is straightforward: all of it is inside the div with the CSS classes table and table--hcs (selector div.table.table--hcs):

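import requests
from bs4 import BeautifulSoup

halo = requests.get("https://www.halowaypoint.com/en-us/esports/standings")
page = BeautifulSoup(halo.content, "html.parser")

# The standings table is the div carrying both classes "table" and "table--hcs".
table = page.select_one("div.table.table--hcs")
# Column headings come from the td cells inside the table's header.
print(",".join([td.text for td in table.select("header div.td")]))
# Each div.tr is one team row.
for row in table.select("div.tr"):
    rank = row.select_one("span.numeric--medium.hcs-trend-neutral").text
    team = row.select_one("div.td.hcs-title").span.a.text
    # The two div.td.em-7 cells hold wins and losses, in that order.
    wins, losses = [div.span.text for div in row.select("div.td.em-7")]
    print(rank, team, wins, losses)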

If we run the code, you can see the data matches the table:

In [4]: print(",".join([td.text for td in table.select("header div.td")])) 
Rank,Team,Wins,Losses 

In [5]: for row in table.select("div.tr"): 
    ...:   rank,team = row.select_one("span.numeric--medium.hcs-trend-neutral").text,row.select_one("div.td.hcs-title").span.a.text 
    ...:   wins, losses = [div.span.text for div in row.select("div.td.em-7")] 
    ...:   print(rank,team, wins, losses) 
    ...:  
1 Counter Logic Gaming 10 1 
2 Team EnVyUs 8 3 
3 Enigma6 8 3 
4 Renegades 6 5 
5 Team Allegiance 5 6 
6 Evil Geniuses 4 7 
7 OpTic Gaming 2 9 
8 Team Liquid 1 10 
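
For anyone unsure what the loop is doing, here is a rough sketch that runs without network access. The SAMPLE markup and the parse_standings helper are hypothetical: the markup is a simplified guess reconstructed only from the class names in the selectors above, not a copy of the real page, which may well differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a simplified guess at the page structure, built only
# from the class names used in the selectors above.
SAMPLE = """
<div class="table table--hcs">
  <header>
    <div class="td">Rank</div><div class="td">Team</div>
    <div class="td">Wins</div><div class="td">Losses</div>
  </header>
  <div class="tr">
    <div class="td"><span class="numeric--medium hcs-trend-neutral">1</span></div>
    <div class="td hcs-title"><span><a href="#">Counter Logic Gaming</a></span></div>
    <div class="td em-7"><span>10</span></div>
    <div class="td em-7"><span>1</span></div>
  </div>
  <div class="tr">
    <div class="td"><span class="numeric--medium hcs-trend-neutral">2</span></div>
    <div class="td hcs-title"><span><a href="#">Team EnVyUs</a></span></div>
    <div class="td em-7"><span>8</span></div>
    <div class="td em-7"><span>3</span></div>
  </div>
</div>
"""


def parse_standings(html):
    """Return (headers, rows) parsed from standings-like markup."""
    page = BeautifulSoup(html, "html.parser")
    # select_one returns the first element matching the CSS selector.
    table = page.select_one("div.table.table--hcs")
    headers = [td.text for td in table.select("header div.td")]

    rows = []
    # select returns every match, so this visits one div.tr per team.
    for row in table.select("div.tr"):
        # Searches on `row` only look inside that single row.
        rank = row.select_one("span.numeric--medium.hcs-trend-neutral").text
        # .span.a walks to the first <span>, then the first <a> inside it.
        team = row.select_one("div.td.hcs-title").span.a.text
        # Exactly two em-7 cells per row, so the list unpacks into two names.
        wins, losses = [div.span.text for div in row.select("div.td.em-7")]
        rows.append((rank, team, wins, losses))
    return headers, rows


headers, rows = parse_standings(SAMPLE)
print(",".join(headers))
for rank, team, wins, losses in rows:
    print(rank, team, wins, losses)
```

Note that the unpacking into wins and losses raises a ValueError if a row does not contain exactly two div.td.em-7 cells, so a layout change on the site would surface as an error rather than as silently wrong data.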

Let me test this before accepting the answer! Thanks a lot!

No worries, test away ;)

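Wow, dude, thank you so much. I'm not joking when I say I was staring at the screen trying to figure out what was going on! Thanks again! It would be even better if you could explain the 'iteration block'.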
