0
頁
我嘗試以下操作:Beautifulsoup無法讀取
from urllib2 import urlopen
from BeautifulSoup import BeautifulSoup
url = 'http://propaccess.traviscad.org/clientdb/Property.aspx?prop_id=312669'
soup = BeautifulSoup(urlopen(url).read())
print soup
以上print
聲明顯示如下:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-type" content="text/html;charset=utf-8" />
<title>Travis Property Search</title>
<style type="text/css">
body { text-align: center; padding: 150px; }
h1 { font-size: 50px; }
body { font: 20px Helvetica, sans-serif; color: #333; }
#article { display: block; text-align: left; width: 650px; margin: 0 auto; }
a { color: #dc8100; text-decoration: none; }
a:hover { color: #333; text-decoration: none; }
</style>
</head>
<body>
<div id="article">
<h1>Please try again</h1>
<div>
<p>Sorry for the inconvenience but your session has either timed out or the server is busy handling other requests. You may visit us on the the following website for information, otherwise please retry your search again shortly:<br /><br />
<a href="http://www.traviscad.org/">Travis Central Appraisal District Website</a> </p>
<p><b><a href="http://propaccess.traviscad.org/clientdb/?cid=1">Click here to reload the property search to try again</a></b></p>
</div>
</div>
</body>
</html>
我能夠通過同一臺計算機上的瀏覽器但訪問URL所以服務器絕對不會阻止我的IP。我不明白我的代碼有什麼問題?
'從BeautifulSoup進口BeautifulSoup'應該不會是'從BS4進口BeautifulSoup'? –
@MD。 Khairul Basar是的,這就是我通常導入它的方式,但它可以工作。 –
你爲什麼要導入cookies ..我嘗試過的其他例子從來不需要cookies – Zanam