TypeError: 'NoneType' object is not callable when using split in Python with BeautifulSoup

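I was playing with BeautifulSoup and the Requests API today, so I thought I would write a simple scraper that follows links to a depth of 2 (if that makes sense). All the links on the page I am scraping are relative (e.g. <a href="/free-man-aman-sethi/books/9788184001341.htm" title="A Free Man">), so to make them absolute I thought I would join the page URL with each relative link using urljoin.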

To do that, I first had to extract the href value from the <a> tag, and I thought I would use split for that:

#!/bin/python 
#crawl.py 
import requests 
from bs4 import BeautifulSoup 
from urlparse import urljoin 

html_source=requests.get("http://www.flipkart.com/books") 
soup=BeautifulSoup(html_source.content) 
links=soup.find_all("a") 
temp=links[0].split('"')

This gives the following error:

Traceback (most recent call last): 
    File "test.py", line 10, in <module> 
    temp=links[0].split('"') 
TypeError: 'NoneType' object is not callable

I dived in before reading the documentation properly, and I realize this may not be the best way to achieve my goal, but why is there a TypeError?

2013-03-14 Chaitanya Nettem

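links[0] is not a string; it is a bs4.element.Tag. When you look up split on it, it works its magic and tries to find a child element named split, but there is none, so the lookup returns None. You are calling None.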

In [10]: l = links[0] 

In [11]: type(l) 
Out[11]: bs4.element.Tag 

In [17]: print l.split 
None 

In [18]: None() # :) 

TypeError: 'NoneType' object is not callable

Use indexing to look up HTML attributes:

In [21]: links[0]['href'] 
Out[21]: '/?ref=1591d2c3-5613-4592-a245-ca34cbd29008&_pop=brdcrumb'

Or use get if there is a risk that the attribute does not exist:

In [24]: links[0].get('href') 
Out[24]: '/?ref=1591d2c3-5613-4592-a245-ca34cbd29008&_pop=brdcrumb' 


In [26]: print links[0].get('wharrgarbl') 
None 

In [27]: print links[0]['wharrgarbl'] 

KeyError: 'wharrgarbl'
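
For the original goal of turning relative links into absolute ones, get('href') can be combined with urljoin. The following is a minimal sketch, not the asker's code: it assumes Python 3 (where urljoin lives in urllib.parse rather than urlparse), and the inline HTML string and the names PAGE_URL, HTML and absolute_links are made up so that it runs without a network request.

```python
from urllib.parse import urljoin  # Python 3 location; Python 2 used `urlparse`

from bs4 import BeautifulSoup

# Hypothetical stand-ins for the downloaded page, so the sketch runs offline.
PAGE_URL = "http://www.flipkart.com/books"
HTML = """
<a href="/free-man-aman-sethi/books/9788184001341.htm" title="A Free Man">A Free Man</a>
<a name="top">anchor without an href</a>
"""

soup = BeautifulSoup(HTML, "html.parser")

absolute_links = []
for tag in soup.find_all("a"):
    href = tag.get("href")  # None when the attribute is missing, no KeyError
    if href is None:
        continue  # skip anchors that are not links
    absolute_links.append(urljoin(PAGE_URL, href))

print(absolute_links)
```

Using get here rather than tag['href'] means a single <a> without an href does not abort the whole crawl.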

2013-03-14 12:25:04

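The Tag class uses proxying for attribute access (as Pavel points out, this is used to access child elements where possible), so when no matching child is found it returns the default, None.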

A contrived example:

>>> print soup.find_all('a')[0].bob 
None 
>>> print soup.find_all('a')[0].foobar 
None 
>>> print soup.find_all('a')[0].split 
None

You need to use:

soup.find_all('a')[0].get('href')

where get is a real method on the tag:

>>> print soup.find_all('a')[0].get 
<bound method Tag.get of <a href="test"></a>>
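
The proxying described above is easier to see when a child element does exist. This is a small sketch written for Python 3; the markup and the variable names are invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the <a> tag has a <b> child but no <split> child.
soup = BeautifulSoup('<a href="test"><b>bold</b></a>', "html.parser")
link = soup.find("a")

child = link.b          # proxied to link.find("b"): returns the <b> Tag
missing = link.split    # proxied to link.find("split"): no such child, so None
real_method = link.get  # a real method, resolved normally

print(child)
print(missing)
print(callable(real_method))
```

Any name that is neither a real attribute of Tag nor the name of a child element comes back as None, which is why calling it fails with "'NoneType' object is not callable" rather than an AttributeError.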

2013-03-14 12:25:08

Child elements, not attributes. – 2013-03-14 12:26:32

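I just ran into the same error, so for what it's worth four years later: if you do need to split a soup element, you can also call str() on it before splitting. In your case that would be: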

temp = str(links[0]).split('"')
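
As a rough illustration of the str() approach on a single tag, here is a sketch for Python 3 with made-up markup. It relies on href being the first attribute in the serialized tag, which is an assumption about the input, so get('href') remains the more robust choice.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for one link from the page.
soup = BeautifulSoup('<a href="/books" title="A Free Man">A Free Man</a>', "html.parser")
link = soup.find("a")

markup = str(link)         # serialize the Tag back to an HTML string
parts = markup.split('"')  # str.split exists, unlike Tag.split

href_via_split = parts[1]  # fragile: assumes href is the first attribute
href_via_get = link.get("href")

print(href_via_split, href_via_get)
```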

2017-06-07 08:12:44 Ollie
