Python lxml.html.soupparser.fromstring raises an annoying warning
My code:
foo = fromstring(my_html)
It raises this warning:
UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
To get rid of this warning, change this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "html.parser")
markup_type=markup_type))
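I tried passing it the string 'html.parser', but that doesn't work: it gives me an error saying the string is not callable. So I tried html.parser, and then I looked through the lxml module to see whether I could find another parser, but couldn't. I looked in the Python stdlib and found that 2.7 has one called HTMLParser, so I imported it and passed beautifulsoup=HTMLParser, and that didn't work either.
Where is the callable that I should be passing to fromstring?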
Edit to add the solutions I tried:
from lxml.html.soupparser import fromstring
wiktionary_page = fromstring(wiktionary_page.read(), features="html.parser")
and this:
from lxml.html.soupparser import BeautifulSoup
wiktionary_page = fromstring(wiktionary_page.read(), beautifulsoup=lambda s: BeautifulSoup(s, "html.parser"))
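The "not callable" error suggests that whatever is passed as beautifulsoup gets called with the markup, so it has to be a function or class rather than a parser name. Here is a minimal sketch of that idea using only the standard library. Note that fake_soup and fake_fromstring are hypothetical stand-ins written for this illustration, not the real bs4 or lxml objects, and the assumption that fromstring forwards extra keyword arguments to the parser is mine.

```python
from functools import partial


def fake_soup(markup, features=None):
    """Hypothetical stand-in for a BeautifulSoup-like constructor."""
    return {"markup": markup, "features": features}


def fake_fromstring(data, beautifulsoup=None, **bsargs):
    """Hypothetical stand-in for fromstring.

    Assumption: the real function calls the `beautifulsoup` argument
    with the markup and forwards any extra keyword arguments to it.
    """
    if beautifulsoup is None:
        beautifulsoup = fake_soup
    return beautifulsoup(data, **bsargs)


# Passing a parser *name* fails, because a str cannot be called.
try:
    fake_fromstring("<p>hi</p>", beautifulsoup="html.parser")
    error = None
except TypeError as exc:
    error = str(exc)

# Passing a callable with the parser name already bound works.
bound_parser = partial(fake_soup, features="html.parser")
result = fake_fromstring("<p>hi</p>", beautifulsoup=bound_parser)

# Passing the parser name as an extra keyword argument also works here.
result_kw = fake_fromstring("<p>hi</p>", features="html.parser")
```

If the real libraries behave like these stand-ins, a partial with the parser name bound should be equivalent to the lambda in the second attempt above, but I have not verified that here.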
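Good ideas, but neither of these worked for me – deltaskelta
Both of these work for me; are you using them exactly as posted? –
I added what I tried, which as far as I can tell is the same as yours – deltaskelta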