If you want a list of the "code" values, just use a list comprehension:
def parse(self, response):
    sourceHtml = BeautifulSoup(response.body)
    soup = sourceHtml.find("dl", {"id": "resultList"})
    return [link.get('code') for link in soup.find_all('dd')]
You can also improve the way you locate the elements by using a CSS selector:
def parse(self, response):
    soup = BeautifulSoup(response.body)
    return [link.get('code') for link in soup.select('dl#resultList dd')]
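To check that the two lookups agree, here is a minimal standalone sketch that runs both against a small inline document. The `SAMPLE_HTML` markup and the helper names `codes_with_find` and `codes_with_select` are hypothetical stand-ins, not part of the original spider; the real page structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for response.body.
SAMPLE_HTML = """
<dl id="resultList">
  <dd code="A1">First</dd>
  <dd code="B2">Second</dd>
  <dd>No code attribute</dd>
</dl>
<dl id="other"><dd code="Z9">Ignored</dd></dl>
"""


def codes_with_find(html):
    """Locate the <dl> by id first, then collect its <dd> children."""
    soup = BeautifulSoup(html, "html.parser")
    result_list = soup.find("dl", {"id": "resultList"})
    if result_list is None:  # find() returns None when nothing matches
        return []
    return [dd.get("code") for dd in result_list.find_all("dd")]


def codes_with_select(html, only_with_code=False):
    """Same lookup expressed as a single CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    # "dd[code]" keeps only the <dd> tags that actually carry the attribute.
    selector = "dl#resultList dd[code]" if only_with_code else "dl#resultList dd"
    return [dd.get("code") for dd in soup.select(selector)]


find_codes = codes_with_find(SAMPLE_HTML)
select_codes = codes_with_select(SAMPLE_HTML)
filtered_codes = codes_with_select(SAMPLE_HTML, only_with_code=True)
```

Note that `get('code')` returns `None` for a tag without the attribute, so the unfiltered lists can contain `None` entries; the `dd[code]` selector is one way to skip those.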
It is also a good idea to specify the underlying parser explicitly:
soup = BeautifulSoup(response.body, "html.parser")
# or soup = BeautifulSoup(response.body, "html5lib")
# or soup = BeautifulSoup(response.body, "lxml")
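Since `lxml` and `html5lib` are separate packages that may not be installed, one possible approach is to try parsers in order of preference and fall back to the built-in `html.parser`. The sketch below assumes a hypothetical helper named `make_soup` and an illustrative preference order; it relies on BeautifulSoup raising `FeatureNotFound` when a requested parser is unavailable.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred=("lxml", "html5lib", "html.parser")):
    """Build a soup with the first parser from `preferred` that is installed.

    `make_soup` and the default order are illustrative assumptions, not
    something the original answer prescribes.
    """
    for parser in preferred:
        try:
            return BeautifulSoup(markup, parser)
        except FeatureNotFound:
            # This parser library is not installed; try the next one.
            continue
    raise RuntimeError("none of the requested parsers is available")


# Hypothetical markup standing in for response.body.
soup = make_soup('<dl id="resultList"><dd code="A1">First</dd></dl>')
first_code = soup.find("dd").get("code")
```

Different parsers can build slightly different trees from malformed markup, so pinning one parser explicitly keeps results consistent across machines.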