2016-09-16 86 views
1

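Scraping table information

I want to use Python to extract the odds information from the website http://www.footballlocks.com/nfl_odds.shtml.

I have been trying to do this with BeautifulSoup.

Ideally I would get the odds as a dictionary or a list, because the values will be fed into a mathematical formula.

The HTML for the odds information is: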

<TABLE COLS="6" WIDTH="650" BORDER="0" CELLSPACING="5" CELLPADDING="2"> 

    <TR> 
    <TD WIDTH="19%"><span title="Date and Time of Game."><B>Date & Time</B></span></TD> 
    <TD WIDTH="21%"><span title="Team Spotting Points in a Bet Against the Point Spread."><B>Favorite</B></span></TD> 
    <TD WIDTH="14%"><span title="Short for Point Spread. Number of Points Subtracted from Final Score of Favorite to Determine Winner of a Point Spread Based Wager."><B>Spread</B></span></TD> 
    <TD WIDTH="21%"><span title="Team Receiving Points in a Bet With the Point Spread."><B>Underdog</B></span></TD> 
    <TD WIDTH="6%"><span title="Line for Betting Over or Under the Total number of Points Scored by Both Teams Combined. Synonymous With Over/Under."><B>Total</B></span></TD> 
    <TD WIDTH="19%"><span title="Money odds to Win the Game Outright, Without any Point Spread. 
Minus (-) is Amount Bettors Risk for Each $100 on the Favorite to Win the Game Outright. 
Plus (+) is Amount Bettors Win for Each $100 Risked on the Underdog to Win the Game Outright."><B>Money Odds</B></span></TD> 
    </TR> 






<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At Detroit</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-6</TD> 
    <TD>Tennessee</TD> 
    <TD>47</TD> 
    <TD>-$255 +$215</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At Houston</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-2.5</TD> 
    <TD>Kansas City</TD> 
    <TD>43</TD> 
    <TD>-$140 +$120</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At New England</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-6.5</TD> 
    <TD>Miami</TD> 
    <TD>42</TD> 
    <TD>-$290 +$240</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>Baltimore</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-6.5</TD> 
    <TD>At Cleveland</TD> 
    <TD>42.5</TD> 
    <TD>-$300 +$250</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At Pittsburgh</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-3.5</TD> 
    <TD>Cincinnati</TD> 
    <TD>48.5</TD> 
    <TD>-$180 +$160</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At Washington</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-2.5</TD> 
    <TD>Dallas</TD> 
    <TD>45.5</TD> 
    <TD>-$145 +$125</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At NY Giants</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-4.5</TD> 
    <TD>New Orleans</TD> 
    <TD>53.5</TD> 
    <TD>-$225 +$185</TD> 
    </TR> 

<TR> 
    <TD>9/18 1:00 ET</TD> 
    <TD>At Carolina</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-13.5</TD> 
    <TD>San Francisco</TD> 
    <TD>45</TD> 
    <TD>-$900 +$600</TD> 
    </TR> 

<TR> 
    <TD>9/18 4:05 ET</TD> 
    <TD>At Arizona</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-7</TD> 
    <TD>Tampa Bay</TD> 
    <TD>50</TD> 
    <TD>-$310 +$260</TD> 
    </TR> 

<TR> 
    <TD>9/18 4:05 ET</TD> 
    <TD>Seattle</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-6.5</TD> 
    <TD>At Los Angeles</TD> 
    <TD>38</TD> 
    <TD>-$290 +$240</TD> 
    </TR> 

<TR> 
    <TD>9/18 4:25 ET</TD> 
    <TD>At Denver</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-6.5</TD> 
    <TD>Indianapolis</TD> 
    <TD>46.5</TD> 
    <TD>-$280 +$240</TD> 
    </TR> 

<TR> 
    <TD>9/18 4:25 ET</TD> 
    <TD>At Oakland</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-4.5</TD> 
    <TD>Atlanta</TD> 
    <TD>49</TD> 
    <TD>-$210 +$180</TD> 
    </TR> 

<TR> 
    <TD>9/18 4:25 ET</TD> 
    <TD>At San Diego</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-3</TD> 
    <TD>Jacksonville</TD> 
    <TD>47</TD> 
    <TD>-$165 +$145</TD> 
    </TR> 


<TR> 
    <TD>9/18 8:30 ET</TD> 
    <TD>Green Bay</TD> 
    <TD>&nbsp;&nbsp;&nbsp;-2.5</TD> 
    <TD>At Minnesota</TD> 
    <TD>43.5</TD> 
    <TD>-$140 +$120</TD> 
    </TR> 




</TABLE> 

My Python code so far:

from bs4 import BeautifulSoup
from urllib.request import urlopen  # Python 3 location of urlopen

url = "http://www.footballlocks.com/nfl_odds.shtml"
html = urlopen(url)

soup = BeautifulSoup(html, 'html.parser')

# Print the text of every cell in every table row
for record in soup.find_all('tr'):
    for data in record.find_all('td'):
        print(data.text)

P.S. My background is in economics and my programming experience is limited.

+1

What exactly is your question? – martineau

+0

How can I get the odds information into a dictionary? For example, the row 9/18 1:00 ET, At Carolina, -13.5, San Francisco, -$900 +$600 would become: {'At Carolina': -900, 'San Francisco': +600} –

+1

You may want to edit your question to add the "this is my question" part above. – JasonD

Answer

1

It is not the nicest HTML to parse, because there are no classes we can use, but this will put all the rows into a list of dicts:

from pprint import pprint as pp

from bs4 import BeautifulSoup
import requests


url = "http://www.footballlocks.com/nfl_odds.shtml"

# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# Use the text of one of the headers to find the correct table
table = soup.find("span", text="Date & Time").find_previous("table")

data = []
# start from second tr, which skips the header row
for row in table.select("tr + tr"):
    # collect the text of every td in the row
    tds = [td.text for td in row.find_all("td")]
    # column 1 is the favorite, column 3 the underdog, the last column the money odds
    fav, under, odds = tds[1], tds[3], tds[-1]
    # split money odds into fav/under odds
    f_odds, u_odds = odds.split()

    data.append({fav: f_odds.replace(u"$", ""), under: u_odds.replace(u"$", "")})

pp(data)

Output:

[{u'At Detroit': u'-255', u'Tennessee': u'+215'}, 
{u'At Houston': u'-130', u'Kansas City': u'+110'}, 
{u'At New England': u'-290', u'Miami': u'+240'}, 
{u'At Cleveland': u'+225', u'Baltimore': u'-265'}, 
{u'At Pittsburgh': u'-175', u'Cincinnati': u'+155'}, 
{u'At Washington': u'-150', u'Dallas': u'+130'}, 
{u'At NY Giants': u'-215', u'New Orleans': u'+180'}, 
{u'At Carolina': u'-900', u'San Francisco': u'+600'}, 
{u'At Arizona': u'-330', u'Tampa Bay': u'+270'}, 
{u'At Los Angeles': u'+250', u'Seattle': u'-300'}, 
{u'At Denver': u'-275', u'Indianapolis': u'+235'}, 
{u'At Oakland': u'-210', u'Atlanta': u'+180'}, 
{u'At San Diego': u'-160', u'Jacksonville': u'+140'}, 
{u'At Minnesota': u'+115', u'Green Bay': u'-135'}] 
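
Since the question asks for numbers that can go into a formula, the strings above could be converted to integers. The following is only a minimal sketch, assuming every money-odds cell has the form '-$255 +$215' and that rows follow the six-column layout shown in the question. The helper names parse_money_odds and row_to_dict are made up for illustration and are not part of the code above.

```python
def parse_money_odds(cell):
    """Turn a money-odds cell such as '-$255 +$215' into two integers."""
    # Drop the dollar signs, then split on whitespace: ['-255', '+215']
    fav_odds, dog_odds = cell.replace("$", "").split()
    return int(fav_odds), int(dog_odds)


def row_to_dict(cells):
    """Map the favorite and underdog names to integer money odds.

    Assumes the column order from the question:
    date, favorite, spread, underdog, total, money odds.
    """
    fav_odds, dog_odds = parse_money_odds(cells[-1])
    return {cells[1].strip(): fav_odds, cells[3].strip(): dog_odds}


# One row copied from the table in the question
cells = ["9/18 1:00 ET", "At Carolina", "\xa0\xa0\xa0-13.5",
         "San Francisco", "45", "-$900 +$600"]
print(row_to_dict(cells))
```

In the scraping loop, the same idea would mean calling something like row_to_dict(tds) instead of building the dictionary from strings. A row whose odds cell is empty or has a different shape would raise a ValueError here, so real data may need an extra check.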
+0

Thank you very much! –

+0

No worries. You might actually be better off storing tuples, so that you know the order, or using `{"home": {"team": u'At Detroit', "spread": u'-255'}, "away": {"team": u'Tennessee', "spread": u'+215'}}` –
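
The suggestion to store tuples so the order is known could look roughly like the sketch below. It is one possible reading of that comment, not the commenter's own code. The Game type, its field names and the game_from_cells helper are assumptions chosen for illustration, and the input is assumed to be the list of six cell texts from one table row.

```python
from collections import namedtuple

# Hypothetical record type: favorite first, underdog second
Game = namedtuple("Game", ["favorite", "favorite_odds", "underdog", "underdog_odds"])


def game_from_cells(cells):
    """Build a Game from the six cell texts of one table row.

    Assumes the column order from the question:
    date, favorite, spread, underdog, total, money odds.
    """
    # '-$300 +$250' -> ['-300', '+250']
    fav_odds, dog_odds = cells[-1].replace("$", "").split()
    return Game(
        favorite=cells[1].strip(),
        favorite_odds=int(fav_odds),
        underdog=cells[3].strip(),
        underdog_odds=int(dog_odds),
    )


# One row copied from the table in the question
cells = ["9/18 1:00 ET", "Baltimore", "\xa0\xa0\xa0-6.5",
         "At Cleveland", "42.5", "-$300 +$250"]
game = game_from_cells(cells)
print(game.favorite, game.favorite_odds)
print(game.underdog, game.underdog_odds)
```

A namedtuple is still a plain tuple, so it keeps the favorite/underdog order while allowing access by field name. The nested dictionary from the comment would serve the same purpose if a mutable structure is preferred.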