2016-11-28

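BeautifulSoup loop does not return results in order

The results are not returned in order, and I need them returned in order.

I am trying to record the ranking.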

def parse(self, response):
    sourceHtml = BeautifulSoup(response.body)
    soup = sourceHtml.find("dl", {"id": "resultList"})
    for link in soup.find_all('dd'):
        print(link.get('code'))

Answer

If you want the "code" values collected in a list instead of printed, just use a list comprehension:

def parse(self, response): 
    sourceHtml = BeautifulSoup(response.body) 
    soup = sourceHtml.find("dl", {"id": "resultList"}) 
    return [link.get('code') for link in soup.find_all('dd')] 
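Since the goal is to record a ranking, here is a minimal sketch of how the position could be attached to each code. `find_all` returns elements in document order, so `enumerate` gives the rank directly. The sample markup and the `ranked_codes` helper are assumptions made up for illustration; the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page is assumed to look similar.
SAMPLE_HTML = """
<dl id="resultList">
  <dd code="A100">first</dd>
  <dd code="B200">second</dd>
  <dd code="C300">third</dd>
</dl>
"""


def ranked_codes(html):
    """Return (rank, code) pairs in document order, with ranks starting at 1."""
    soup = BeautifulSoup(html, "html.parser")
    result_list = soup.find("dl", {"id": "resultList"})
    if result_list is None:
        # No result list on this page, so there is nothing to rank
        return []
    return [
        (rank, dd.get("code"))
        for rank, dd in enumerate(result_list.find_all("dd"), start=1)
    ]


print(ranked_codes(SAMPLE_HTML))
```

Returning the pairs instead of printing them keeps the order tied to the data itself, which may help if the output of several pages ends up interleaved.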

You can also improve the way you locate the elements by using a CSS selector:

def parse(self, response): 
    soup = BeautifulSoup(response.body) 
    return [link.get('code') for link in soup.select('dl#resultList dd')] 
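As a rough check that the selector version picks up the same elements as the `find` / `find_all` version, the two can be compared on a small document. The markup below is invented for illustration, including a stray `dd` outside the result list to show that the selector stays scoped to `dl#resultList`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one <dd> lives outside the result list on purpose.
SAMPLE_HTML = """
<dl id="other"><dd code="IGNORED">not a result</dd></dl>
<dl id="resultList">
  <dd code="A100">first</dd>
  <dd code="B200">second</dd>
</dl>
"""


def codes_with_find(html):
    """Locate the list with find(), then collect its <dd> codes."""
    soup = BeautifulSoup(html, "html.parser")
    result_list = soup.find("dl", {"id": "resultList"})
    return [dd.get("code") for dd in result_list.find_all("dd")]


def codes_with_select(html):
    """Collect the same codes with a single CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [dd.get("code") for dd in soup.select("dl#resultList dd")]


print(codes_with_find(SAMPLE_HTML))
print(codes_with_select(SAMPLE_HTML))
```

One difference worth noting: `select` returns an empty list when nothing matches, while the `find` version would raise an `AttributeError` if the `dl` were missing, because `find` returns `None` in that case.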

It is also a good idea to provide an underlying parser explicitly:

soup = BeautifulSoup(response.body, "html.parser") 
# or soup = BeautifulSoup(response.body, "html5lib") 
# or soup = BeautifulSoup(response.body, "lxml")