How do I request a URL that is already quoted?

Earlier code gave me this URL: http://en.wikipedia.org/wiki/M%C3%BCnster. Now I want to request it, but I can't find a way to do that:

>>> requests.get('http://en.wikipedia.org/wiki/M%C3%BCnster') 
<Response [400]> 
>>> requests.get(urlparse.unquote('http://en.wikipedia.org/wiki/M%C3%BCnster')) 
<Response [400]> 
>>> requests.get(urlparse.unquote('http://en.wikipedia.org/wiki/M%C3%BCnster').decode('utf-8')) 
<Response [400]>

The problem is that requests tries to be too clever about quoting, so the requests actually sent are:

Request URI: /wiki/M%25C3%25BCnster 
Request URI: /wiki/M%25C3%25BCnster 
Request URI: /wiki/M%25C3%25BCnster
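The `%25` in those request URIs is what you get when an already percent-encoded path is quoted a second time: the `%` sign itself gets escaped. A minimal sketch of that effect using only the standard library (written for Python 3, so `urllib.parse` replaces the `urlparse` module used in this thread; it illustrates the symptom and is not a trace of what requests does internally):

```python
from urllib.parse import quote, unquote

path = "/wiki/M%C3%BCnster"      # already percent-encoded

# Quoting again escapes the '%' characters themselves as '%25'.
double_quoted = quote(path)
print(double_quoted)

# Unquoting first and then quoting once round-trips to the original path.
once_quoted = quote(unquote(path))
print(once_quoted)
```

The first value matches the request URI shown above, and the second matches the path that was meant to be sent.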

Any ideas?

2012-02-13 lRem

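It doesn't work with urllib or urllib2 either, although those give a 403 error instead... – lRem 2012-02-13 21:59:23

It seems the urllib* problem is caused by the Wikipedia servers disliking it, and is unrelated to the requests problem. – lRem 2012-02-14 00:01:02

What is 'requests'? – maciej 2012-02-14 01:08:41

This is a bug in requests. It has already been fixed in the develop branch. See: https://github.com/kennethreitz/requests/pull/387.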
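Until a fixed version is available, one possible client-side workaround is to normalise the path so that quoting is applied exactly once, whether or not the input was already quoted. The sketch below is an assumption about how that could be done with the standard library (Python 3); `requote_path` is a hypothetical helper name, and this is not claimed to be what the pull request above changes.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def requote_path(url):
    """Hypothetical helper: unquote the path, then quote it exactly once."""
    parts = urlsplit(url)
    # unquote() undoes any existing percent-encoding; quote() re-applies it,
    # leaving '/' alone so the path structure is preserved.
    path = quote(unquote(parts.path), safe="/")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


already_quoted = "http://en.wikipedia.org/wiki/M%C3%BCnster"
not_quoted = "http://en.wikipedia.org/wiki/M\u00fcnster"

print(requote_path(already_quoted))
print(requote_path(not_quoted))
```

Both inputs come out as the same singly quoted URL. The approach is lossy for paths that contain an encoded slash (`%2F`), since that is turned back into a literal `/`.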

2012-02-14 10:31:44 lRem

Try adding .decode('utf-8'):

requests.get(urlparse.unquote('http://en.wikipedia.org/wiki/M%C3%BCnster').decode('utf-8'))

2012-02-13 21:12:03 Amber

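No, that doesn't work for me. I've added it to the description above. – lRem 2012-02-13 21:29:32

A plain urlparse.unquote, together with a custom User-Agent header, seems to do the job.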

>>> s = 'http://en.wikipedia.org/wiki/M%C3%BCnster' 
>>> import urllib2, urlparse 
>>> headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.2; rv:9.0.1) Gecko/20100101 Firefox/9.0.1'} 
>>> url = urlparse.unquote(s) 
>>> req = urllib2.Request(url, None, headers) 
>>> resp = urllib2.urlopen(req) 
>>> print resp.code 
200 
>>> data = resp.read() 
>>> print 'The last outstanding palace of the German baroque period is created according to plans by Johann Conrad Schlaun.' in data 
True

The byte string is deliberately not decoded into a unicode object, because doing so causes UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 11: ordinal not in range(128) in urlopen.
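The "position 11" in that message is consistent with the `ü` sitting at index 11 of a request line that begins `GET /wiki/M`. The following Python 3 sketch reproduces the same kind of failure without touching the network; the request-line shape is an assumption made for illustration, not something taken from the urllib2 source.

```python
from urllib.parse import unquote

raw_path = unquote("/wiki/M%C3%BCnster")   # text containing a literal 'ü'
request_line = "GET " + raw_path           # assumed shape of the line being sent

failing_index = None
failing_char = None
try:
    # Encoding text that contains 'ü' as ASCII cannot succeed.
    request_line.encode("ascii")
except UnicodeEncodeError as exc:
    failing_index = exc.start
    failing_char = request_line[exc.start]

print(failing_index, failing_char)
```

Keeping the URL percent-encoded, or as raw UTF-8 bytes in Python 2 as in the answer above, avoids that ASCII encoding step failing.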

2012-02-14 01:01:14 maciej

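Nice. But I was hoping for a solution that uses requests. Unlike urllib, it is really pleasant to work with, but if I can't solve this I'll have to go back to urllib :/ – lRem 2012-02-14 09:07:58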
