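I have been given a link to an HTML page. How do I open it and get the content of a specific element using its absolute XPath? Extracting the content of an HTML page element with Python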
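from lxml import html
import requests

# Fetch the river list page and parse it into an element tree.
page = requests.get('http://www.professorpaddle.com/rivers/riverlist.asp')
tree = html.fromstring(page.content)
table_data = []
temp_dict = {}
# Select every link on the list page that has class="pathm".
temp = tree.xpath('//a[@class="pathm"]')
for i in temp:
    # Build the full URL of the river's detail page from the href.
    link = i.attrib.get('href')
    link = "http://www.professorpaddle.com/rivers/" + link
    temp_dict['name'] = i.text
    temp_dict['link'] = link
    print(link)
    # Fetch the detail page and evaluate the absolute XPath against it.
    temp_page = requests.get(link)
    temp_tree = html.fromstring(temp_page.content)
    x = temp_tree.xpath('/html/body/element/table/tbody/tr[2]/td/table/tbody/tr/td/table[1]/tbody/tr[2]/td[3]/table/tbody/tr[3]/td[2]/font')
    print(x)
    break  # stop after the first link while testing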
Have you tried anything? – Dekel
Yes, but how do I post my code? – FibonacciCoder
Check this: http://stackoverflow.com/editing-help – Dekel
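One possible reason an absolute XPath copied from a browser returns an empty list: browsers add `tbody` elements to tables when building the DOM, whereas lxml parses the markup as served and does not add them. Whether that applies to this particular page is an assumption, since it depends on the page's raw source. The sketch below tries the XPath as given and, if nothing matches, retries with the `tbody` steps removed. The names `strip_tbody`, `text_at_xpath` and `SAMPLE_HTML` are made up for illustration, and the sample markup is a stand-in, not the real page.

```python
import re

# Hypothetical stand-in markup: a table written without <tbody>,
# as many hand-written pages are.
SAMPLE_HTML = """
<html><body>
<table>
  <tr><td>Name</td><td>Difficulty</td></tr>
  <tr><td>Example Creek</td><td><font>Class III</font></td></tr>
</table>
</body></html>
"""


def strip_tbody(xpath):
    """Remove every /tbody step (with an optional [n] index) from an XPath."""
    return re.sub(r'/tbody(\[\d+\])?(?=/|$)', '', xpath)


def text_at_xpath(page_source, xpath):
    """Return the text of each element matched by an absolute XPath.

    Assumes the XPath selects elements, not attributes or text() nodes.
    """
    # Imported here so strip_tbody stays usable without lxml installed.
    from lxml import html

    tree = html.fromstring(page_source)
    # Try the XPath as written first; fall back to the tbody-free form
    # only when the original matches nothing.
    nodes = tree.xpath(xpath) or tree.xpath(strip_tbody(xpath))
    return [node.text_content().strip() for node in nodes]
```

With the code in the question, this would be called as `text_at_xpath(temp_page.content, your_xpath)`. Trying the original XPath first matters because some pages do contain `tbody` in their source, and removing it there would break a path that already works. If a table has several `tbody` sections, the row indices after stripping may no longer line up, so a shorter relative XPath anchored on an attribute or on nearby text is often more robust than a long absolute one.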