2016-03-05 85 views

I want to extract data from a URL, but I get this error when writing to the file, since text is not empty: AttributeError: 'ResultSet' object has no attribute 'encode'

My code:

def gettextonly(self, url):
    url = url

    html = urllib.urlopen(url).read()
    soup = BeautifulSoup(html)

    # kill all script and style elements
    for script in soup(["script", "style","a","<div id=\"bottom\" >"]):
        script.extract() # rip it out

    text = soup.findAll(text=True)

    #print text
    fo = open('foo.txt', 'w')
    fo.seek(0, 2)
    if text:
        line =fo.writelines(text.encode('utf8'))
    fo.close()

Error:

in gettextonly 
    line =fo.writelines(text.encode('utf8')) 
AttributeError: 'ResultSet' object has no attribute 'encode' 

Answer


soup.findAll(text=True) returns a ResultSet object, which is basically a list and therefore has no encode attribute. You probably meant to use .text instead:

text = soup.text 

Or "join" the text nodes:

text = "".join(soup.findAll(text=True)) 
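
Putting this together, here is a minimal sketch of how the writing step might look in Python 3 with the bs4 package. The helper name write_strings, the html parameter (passed in rather than fetched from a URL), and the choice of "html.parser" are assumptions for illustration and are not part of the original code. Also note that "<div id=\"bottom\" >" in the question's list is not a tag name, so it most likely matches nothing; the sketch removes that div with a separate lookup, assuming that was the intent.

```python
def write_strings(strings, path):
    """Join an iterable of text strings and write the result to path as UTF-8."""
    # A list (or ResultSet) has no .encode(); the joined str can be encoded.
    text = "".join(strings)
    if text:
        # Opening with an explicit encoding lets Python do the UTF-8 encoding.
        with open(path, "w", encoding="utf-8") as fo:
            fo.write(text)
    return text


def gettextonly(html, path="foo.txt"):
    """Strip unwanted elements from html and save the remaining text."""
    # Imported here so write_strings stays usable without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")

    # Remove script, style and link elements.
    for tag in soup(["script", "style", "a"]):
        tag.extract()

    # Assumed intent of the original list entry: drop <div id="bottom">.
    for tag in soup.find_all("div", id="bottom"):
        tag.extract()

    # find_all(string=True) yields every remaining text node.
    return write_strings(soup.find_all(string=True), path)
```

Calling gettextonly(html) with a page's HTML writes the remaining text to foo.txt. The with block closes the file even if the write fails, so the explicit seek and close calls from the question are not needed.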