2010-11-04 141 views
3

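Python urllib2.urlopen returns a 302 error even though the page exists

I'm using the Python function urllib2.urlopen to read the website http://www.bad.org.uk/, but I keep getting a 302 error even though the site loads fine when I visit it in a browser. Does anyone have any idea why?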

import socket
import urllib2

headers = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}

socket.setdefaulttimeout(10)  # give up on a connection after 10 seconds

# The snippet uses return, so it is shown here as the body of a function;
# the name url_exists is illustrative.
def url_exists(url):
    try:
        req = urllib2.Request(url, None, headers)
        urllib2.urlopen(req)
        return True   # URL exists
    except ValueError as ex:
        print 'URL: %s not well formatted' % url
        return False  # URL not well formatted
    except urllib2.HTTPError as ex:
        # The server answered, but with an error status
        print 'The server couldn\'t fulfill the request for %s.' % url
        print 'Error code: ', ex.code
        return False
    except urllib2.URLError as ex:
        # No usable answer at all (DNS failure, refused connection, timeout)
        print 'We failed to reach a server for %s.' % url
        print 'Reason: ', ex.reason
        return False  # URL doesn't seem to be alive

The error printed:

The server couldn't fulfill the request for http://www.bad.org.uk//site/1/default.aspx. 
Error code: 302 

Answers

18

The page at http://www.bad.org.uk/ is broken when cookies are disabled.

http://www.bad.org.uk/ returns:

HTTP/1.1 302 Found 
Location: http://www.bad.org.uk/DesktopDefault.aspx 
Set-Cookie: Esperantus_Language_bad=en-GB; path=/ 
Set-Cookie: Esperantus_Language_rainbow=en-GB; path=/ 
Set-Cookie: PortalAlias=rainbow; path=/ 
Set-Cookie: refreshed=true; expires=Thu, 04-Nov-2010 16:21:23 GMT; path=/ 
Set-Cookie: .ASPXAUTH=; expires=Mon, 11-Oct-1999 23:00:00 GMT; path=/; HttpOnly 
Set-Cookie: portalroles=; expires=Mon, 11-Oct-1999 23:00:00 GMT; path=/ 

If I then request http://www.bad.org.uk/DesktopDefault.aspx without those cookies set, it returns another 302 and redirects to itself.

urllib2 ignores the cookies and sends the follow-up request without them, so it ends up in a redirect loop at that URL. To handle this, you need to add a cookie handler:

import urllib2 
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor()) 
response = opener.open('http://www.bad.org.uk') 
print response.read() 
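
The redirect loop and the fix can be reproduced without touching the real site. The following is a minimal sketch, assuming Python 3, where urllib2 was split into urllib.request and urllib.error. The local CookieGate server is a hypothetical stand-in for the site's behaviour described above, and the fetch helper is illustrative, not part of any library.

```python
import threading
import urllib.error
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class CookieGate(BaseHTTPRequestHandler):
    """Hypothetical stand-in: redirects to itself until the cookie comes back."""

    def do_GET(self):
        if "refreshed=true" in self.headers.get("Cookie", ""):
            body = b"welcome"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(302)
            self.send_header("Location", "/DesktopDefault.aspx")
            self.send_header("Set-Cookie", "refreshed=true; path=/")
            self.send_header("Content-Length", "0")
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch(url, with_cookies):
    """Return (status, body); on an HTTP error return (error code, b'')."""
    # An empty ProxyHandler bypasses any system proxy for the local demo server.
    handlers = [urllib.request.ProxyHandler({})]
    if with_cookies:
        handlers.append(urllib.request.HTTPCookieProcessor(CookieJar()))
    opener = urllib.request.build_opener(*handlers)
    try:
        with opener.open(url, timeout=10) as response:
            return response.getcode(), response.read()
    except urllib.error.HTTPError as ex:
        ex.close()
        return ex.code, b""


server = HTTPServer(("127.0.0.1", 0), CookieGate)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

without_cookies = fetch(url, with_cookies=False)  # loops until urllib gives up
with_cookies = fetch(url, with_cookies=True)      # cookie is sent back, loop ends

server.shutdown()
server.server_close()
print(without_cookies, with_cookies)
```

Without the cookie processor, the opener keeps being redirected to the same URL until the redirect handler gives up and raises HTTPError with code 302, which matches the error in the question. With the processor, the cookie set by the first response is sent on the next request and the page loads.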
4

Code 302 is a temporary redirect, so you should take the URI from the Location field of the response and request that.
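
Redirects are normally followed automatically, so reading the Location field by hand is mostly useful for debugging. As a rough sketch, assuming Python 3 (urllib.request in place of urllib2), one way to see where a 302 points is to install a redirect handler that declines to follow it. NoRedirect, read_redirect, and the local AlwaysRedirect server are illustrative names, not part of the library or the site.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urljoin


class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means "do not follow"; urllib then raises HTTPError.
        return None


def read_redirect(url):
    """Return (status, absolute redirect target or None) without following."""
    # An empty ProxyHandler bypasses any system proxy for the local demo server.
    opener = urllib.request.build_opener(NoRedirect, urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=10) as response:
            return response.getcode(), None
    except urllib.error.HTTPError as ex:
        location = ex.headers.get("Location")
        # Location may be relative, so resolve it against the requested URL.
        target = urljoin(url, location) if location else None
        ex.close()
        return ex.code, target


class AlwaysRedirect(BaseHTTPRequestHandler):
    """Hypothetical local server that answers every GET with a 302."""

    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", "/DesktopDefault.aspx")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), AlwaysRedirect)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d/" % server.server_port

code, target = read_redirect(base)

server.shutdown()
server.server_close()
print(code, target)
```

Following the target by hand would still need the cookies from the first response, so for the site in the question a cookie-aware opener remains the simpler route.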

+0

How do I do that? Sorry, I'm very new to Python and haven't used urllib2 before. – John 2010-11-04 16:19:35

+0

@John - that's a separate question! – 2010-11-04 16:27:17

+2

302s are handled automatically by urllib2 by default. – 2010-11-04 16:32:27
