2017-03-16 116 views

Unable to use an API with a username and password in Scrapy

This curl request works:

https://user:[email protected]/v1/convert_from.json/?from=1000000&to=SGD&amount=AED,AUD,BDT&inverse=True 

But this Scrapy request does not work:

yield scrapy.Request("https://justanalyticspteltd65986537:[email protected]/v1/convert_from.json/?from=1000000&to=SGD&amount=AED,AUD,BDT&inverse=True") 

It returns this error: 

Traceback (most recent call last): 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\internet\defer.py", line 1297, in _inlineCallbacks 
    result = result.throwExceptionIntoGenerator(g) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\python\failure.py", line 389, in throwExceptionIntoGenerator 
    return g.throw(self.type, self.value, self.tb) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\scrapy\core\downloader\middleware.py", line 43, in process_request 
    defer.returnValue((yield download_func(request=request,spider=spider))) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\scrapy\utils\defer.py", line 45, in mustbe_deferred 
    result = f(*args, **kw) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 65, in download_request 
    return handler.download_request(request, spider) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 61, in download_request 
    return agent.download_request(request) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 286, in download_request 
    method, to_bytes(url, encoding='ascii'), headers, bodyproducer) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\web\client.py", line 1596, in request 
    endpoint = self._getEndpoint(parsedURI) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\web\client.py", line 1580, in _getEndpoint 
    return self._endpointFactory.endpointForURI(uri) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\web\client.py", line 1456, in endpointForURI 
    uri.port) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\scrapy\core\downloader\contextfactory.py", line 59, in creatorForNetloc 
    return ScrapyClientTLSOptions(hostname.decode("ascii"), self.getContext()) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\internet\_sslverify.py", line 1201, in __init__ 
    self._hostnameBytes = _idnaBytes(hostname) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\twisted\internet\_sslverify.py", line 87, in _idnaBytes 
    return idna.encode(text) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\idna\core.py", line 355, in encode 
    result.append(alabel(label)) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\idna\core.py", line 276, in alabel 
    check_label(label) 
    File "d:\kerja\hit\python~1\justan~1\curren~1\lib\site-packages\idna\core.py", line 253, in check_label 
    raise InvalidCodepoint('Codepoint {0} at position {1} of {2} not allowed'.format(_unot(cp_value), pos+1, repr(label))) 
InvalidCodepoint: Codepoint U+003A at position 28 of u'xxxxxxxxxxxxxxxxxxxxxxxxxxxx:[email protected]' not allowed 

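You forgot your credentials in the second URL. Also, a 500 status code means the server hit an error while processing your request, so something is wrong on its side. – Granitosaurus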


I have updated my question. Previously, I had not turned off Crawlera. –

Answers


Scrapy does not support HTTP authentication through credentials embedded in the URL. Use HttpAuthMiddleware instead.

settings.py

DOWNLOADER_MIDDLEWARES = { 
    'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware': 811, 
} 

In the spider:

from scrapy.spiders import CrawlSpider

class SomeIntranetSiteSpider(CrawlSpider):

    # HttpAuthMiddleware reads these two attributes and sends them as HTTP Basic auth.
    http_user = 'someuser'
    http_pass = 'somepass'
    name = 'intranet.example.com'

    # .. rest of the spider code omitted ...
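
The traceback in the question suggests that the `user:password@` prefix stayed in the hostname and then failed IDNA encoding because of the colon (U+003A). As an alternative to the spider attributes above, the credentials can be removed from the URL and sent as an explicit HTTP Basic `Authorization` header. The following is only a minimal sketch using the standard library; the helper name `split_credentials` and the `api.example.com` URL are made up for illustration and are not part of Scrapy.

```python
import base64
from urllib.parse import urlsplit, urlunsplit, unquote


def split_credentials(url):
    """Return (url_without_credentials, headers) for a URL that may embed user:password.

    Hypothetical helper for illustration; not a Scrapy API.
    """
    parts = urlsplit(url)
    if parts.username is None:
        # No credentials in the URL: nothing to strip, no header to add.
        return url, {}

    # Keep only what follows the last "@" in the network location (host and optional port).
    host_only = parts.netloc.rpartition("@")[2]
    clean_url = urlunsplit(parts._replace(netloc=host_only))

    # HTTP Basic auth is base64("user:password") in the Authorization header.
    user = unquote(parts.username)
    password = unquote(parts.password or "")
    raw = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return clean_url, {"Authorization": "Basic " + token}


clean_url, headers = split_credentials(
    "https://someuser:somepass@api.example.com/v1/convert_from.json/?from=SGD&to=AED"
)
```

Inside a spider callback, the result could then be passed on with `yield scrapy.Request(clean_url, headers=headers)`, so the hostname handed to the downloader no longer contains the credentials.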

Note that there is an open pull request with an implementation that reads the credentials from the URL: https://github.com/scrapy/scrapy/pull/1466 –
