BeautifulSoup HTMLParseError

I'm new to Python and have what is probably a simple question about a BeautifulSoup HTMLParseError.

I'm trying to use BeautifulSoup to parse a series of pages.

from bs4 import BeautifulSoup 
import urllib.request 

BeautifulSoup(urllib.request.urlopen('http://bit.ly/'))

Traceback...

html.parser.HTMLParseError: expected name token at '<!=KN\x01...

I'm working on Windows 7 with 64-bit Python 3.2.

Do I need mechanize? (That would require Python 2.x.)

Source

2012-03-23 Zack

If that URL is correct, you're asking why an HTML parser throws an error when it parses an MP3 file. I think the answer to that is self-evident...
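To make the point above concrete, one way to avoid feeding non-HTML data to the parser is to look at the Content-Type header before parsing. The following is only a minimal sketch: the helper names `is_html` and `fetch_html` and the set of accepted media types are assumptions for illustration, not part of the original answer, and a server may report a wrong or missing type.

```python
import urllib.request

# Media types treated as HTML in this sketch (an assumption; adjust as needed).
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def is_html(content_type):
    """Return True if a Content-Type header value names an HTML media type."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in HTML_TYPES


def fetch_html(url):
    """Return the body bytes if the server reports HTML, else raise ValueError."""
    with urllib.request.urlopen(url) as response:
        content_type = response.headers.get("Content-Type")
        if not is_html(content_type):
            raise ValueError("not HTML: %r" % content_type)
        return response.read()
```

The bytes returned by `fetch_html` could then be passed to `BeautifulSoup(..., 'html.parser')`, while an MP3 served as `audio/mpeg` would be rejected before it reaches the parser.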

Source

2012-03-23 15:27:12 kindall


:/ Thanks, guys. I'm an idiot. – Zack 2012-03-23 15:31:56

If you want to download the MP3, you can do something like this:

import urllib2  # Python 2 only; on Python 3 use urllib.request instead

BLOCK_SIZE = 16 * 1024

req = urllib2.urlopen("http://bit.ly/xg7enD")
# Make sure to write as a binary file
fp = open("someMP3.mp3", 'wb')
try:
    while True:
        # Read one block at a time so the whole file never sits in memory
        data = req.read(BLOCK_SIZE)
        if not data:
            break
        fp.write(data)
finally:
    fp.close()
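Since the question mentions Python 3.2, where `urllib2` does not exist, here is a rough Python 3 sketch of the same block-wise download. The function name `download` and its parameters are assumptions for illustration; it relies on `shutil.copyfileobj` to do the read/write loop.

```python
import shutil
import urllib.request

BLOCK_SIZE = 16 * 1024


def download(url, dest_path, block_size=BLOCK_SIZE):
    """Stream the resource at url into dest_path, block_size bytes at a time."""
    # Open the destination in binary mode so the bytes are written unchanged.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as fp:
        shutil.copyfileobj(response, fp, block_size)
```

For example, `download("http://bit.ly/xg7enD", "someMP3.mp3")` would mirror the snippet above, with both the response and the file closed automatically by the `with` statement.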

Source

2012-08-30 23:40:28 ChicoBird

You can use this as well:

import urllib  # Python 2; on Python 3 the function is urllib.request.urlretrieve
urllib.urlretrieve("http://bit.ly/xg7enD", "myfile.mp3")

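This saves the file under the name "myfile.mp3" in the current working directory. I have been able to download all kinds of files with it.

Hope it helps!

Source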
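On Python 3 the same helper lives in `urllib.request`. A minimal sketch, assuming a hypothetical wrapper named `save_url` that just returns the path that was written:

```python
from urllib.request import urlretrieve


def save_url(url, filename):
    """Download url to filename and return the local path that was written."""
    # urlretrieve returns a (local_filename, headers) tuple.
    local_path, _headers = urlretrieve(url, filename)
    return local_path
```

Calling `save_url("http://bit.ly/xg7enD", "myfile.mp3")` would correspond to the one-liner above.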

2016-02-07 18:46:10 sumit

Instead of urllib.request, I suggest using the requests library and its get() function:

from requests import get 
from bs4 import BeautifulSoup 

soup = BeautifulSoup(
     get(url="http://www.google.com").content, 
     'html.parser' 
)

Source

2017-02-01 20:00:40
