Python: parsing values from a table with BeautifulSoup
I am parsing a table from an .html file that I saved. The HTML code of the table looks like this:
<table id="detailBody" width="100%" cellspacing="0" cellpadding="0" border="0" class="tab2" style="display: block;"><tbody>
<tr><td><ul><li><span>15:00:19</span><span class="red">11.750</span><span class="red">5392</span><span class="fr red">↑</span></li><li><span>14:56:55</span><span class="red">11.750</span><span class="red">17</span><span class="fr red">↑</span></li><li><span>14:56:52</span><span class="red">11.750</span><span class="red">479</span><span class="fr red">↑</span></li><li><span>14:56:49</span><span class="">11.740</span><span class="green">6</span><span class="fr green">↓</span></li><li><span>14:56:46</span><span class="">11.740</span><span class="green">333</span><span class="fr green">↓</span></li><li><span>14:56:43</span><span class="">11.740</span><span class="green">21</span><span class="fr green">↓</span></li><li><span>14:56:40</span><span class="">11.740</span><span class="green">15</span><span class="fr green">↓</span></li><li><span>14:56:37</span><span class="">11.740</span><span class="green">35</span><span class="fr green">↓</span></li><li><span>14:56:34</span><span class="red">11.750</span><span class="red">11</span><span class="fr red">↑</span></li><li><span>14:56:31</span><span class="">11.740</span><span class="green">3</span><span class="fr green">↓</span></li><li><span>14:56:28</span><span class="">11.740</span><span class="green">24</span><span class="fr green">↓</span></li><li><span>14:56:22</span><span class="red">11.750</span><span class="red">291</span><span class="fr red">↑</span></li><li><span>14:56:19</span><span class="">11.740</span><span class="red">198</span><span class="fr red">↑</span></li><li><span>14:56:16</span><span class="green">11.730</span><span class="green">15</span><span class="fr green">↓</span></li></ul></td></tr>
</tbody></table>
What I have so far is:
list_a = soup.find_all('table')[0].tbody.find_all("tr")
for a in list_a:
    for b in a:
        for c in b:
            for d in c:
                for e in d:
                    print e.renderContents()
It doesn't look very nice, but it produces the following:
15:00:19
11.750
5392
↑
14:56:55
11.750
17
↑
14:56:52
11.750
479
↑
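But the table contains too much content. I only want the first 10 groups of data from the table, and from each group only the third and fourth items, put into 2 lists.
That is,
["5392", "17", "479", …]
and
["↑", "↑", "↑", …] # the "↑" can be changed to something else identical if it's a problem
How can I do this? Thanks.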
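Add the HTML, not an image. – SIslam
I think he means that you should add the actual HTML code, not just a picture, so that we can help you better. – nablahero
@SIslam and nablahero, thank you for your comments. –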
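One possible approach, as a minimal sketch rather than a definitive answer: instead of nesting five loops, ask BeautifulSoup for the `li` elements directly, cap the result with `limit`, and index the `span` children of each one. This assumes Python 3 and the `bs4` package with the built-in `html.parser` backend. The names `first_n_items` and `SAMPLE_HTML` are made up for illustration, and `SAMPLE_HTML` is only a three-row excerpt of the table in the question.

```python
from bs4 import BeautifulSoup

# Three-row excerpt of the table from the question (illustrative only).
SAMPLE_HTML = """
<table id="detailBody" class="tab2"><tbody>
<tr><td><ul>
<li><span>15:00:19</span><span class="red">11.750</span><span class="red">5392</span><span class="fr red">↑</span></li>
<li><span>14:56:55</span><span class="red">11.750</span><span class="red">17</span><span class="fr red">↑</span></li>
<li><span>14:56:52</span><span class="red">11.750</span><span class="red">479</span><span class="fr red">↑</span></li>
</ul></td></tr>
</tbody></table>
"""


def first_n_items(html, n=10):
    """Return (third_items, fourth_items) for the first n <li> groups.

    Hypothetical helper: assumes every group is an <li> holding at
    least four <span> children, as in the question's HTML.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="detailBody")

    third_items = []
    fourth_items = []
    # limit=n stops the search after n matches, so only the first
    # n groups are visited.
    for li in table.find_all("li", limit=n):
        spans = li.find_all("span")
        if len(spans) < 4:
            continue  # skip malformed groups instead of raising IndexError
        third_items.append(spans[2].get_text(strip=True))
        fourth_items.append(spans[3].get_text(strip=True))
    return third_items, fourth_items


third, fourth = first_n_items(SAMPLE_HTML)
print(third)   # ['5392', '17', '479']
print(fourth)  # ['↑', '↑', '↑']
```

With the full table from the question, the default `n=10` would keep only the first ten groups. If the arrow character causes trouble, one option is to map it to another value while appending, for example `"up"` and `"down"`.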