2017-02-26 71 views

I ran my spider against a Google link that returns search results for "hello", but I get an "Error downloading" error for any URL in Python.

Code (spider code):

import scrapy
import re


class LinsSpider(scrapy.Spider):
    name = "lins"
    allowed_domains = ["www.google.com"]
    start_urls = ('https://www.google.co.in/?gfe_rd=cr&ei=78uyWPjFH8WL8Qe7kKf4BA#q=hello&*',)

    def parse(self, response):
        pagestr = "[email protected]"
        yield {
            'asin': str(re.search("^[A-Za-z0-9\.\+_-]+@[A-Za-z0-9\._-]+\.[a-zA-Z]*$", pagestr).group(1).strip()),
        }

And the error is:

Simple Scrapy spider
2017-02-26 18:06:11 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023 
2017-02-26 18:06:11 [scrapy] ERROR: Error downloading <GET http://www.google.com/> 
Traceback (most recent call last): 
    File "/usr/lib/python2.7/dist-packages/scrapy/utils/defer.py", line 45, in mustbe_deferred 
    result = f(*args, **kw) 
    File "/usr/lib/python2.7/dist-packages/scrapy/core/downloader/handlers/__init__.py", line 41, in download_request 
    return handler(request, spider) 
    File "/usr/lib/python2.7/dist-packages/scrapy/core/downloader/handlers/http11.py", line 44, in download_request 
    return agent.download_request(request) 
    File "/usr/lib/python2.7/dist-packages/scrapy/core/downloader/handlers/http11.py", line 211, in download_request 
    d = agent.request(method, url, headers, bodyproducer) 
    File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1631, in request 
    parsedURI.originForm) 
    File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1408, in _requestWithEndpoint 
    d = self._pool.getConnection(key, endpoint) 
    File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1294, in getConnection 
    return self._newConnection(key, endpoint) 
    File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1306, in _newConnection 
    return endpoint.connect(factory) 
    File "/usr/local/lib/python2.7/dist-packages/twisted/internet/endpoints.py", line 788, in connect 
    EndpointReceiver, self._hostText, portNumber=self._port 
    File "/usr/local/lib/python2.7/dist-packages/twisted/internet/_resolver.py", line 174, in resolveHostName 
    onAddress = self._simpleResolver.getHostByName(hostName) 
    File "/usr/lib/python2.7/dist-packages/scrapy/resolver.py", line 21, in getHostByName 
    d = super(CachingThreadedResolver, self).getHostByName(name, timeout) 
    File "/usr/local/lib/python2.7/dist-packages/twisted/internet/base.py", line 276, in getHostByName 
    timeoutDelay = sum(timeout) 
TypeError: 'float' object is not iterable 
2017-02-26 18:06:11 [scrapy] INFO: Closing spider (finished) 
2017-02-26 18:06:11 [scrapy] INFO: Dumping Scrapy stats: 

Please help me solve this problem. I am on Ubuntu 16.10.


Please post the complete code. We cannot run the code you provided and get the same result.


I created the Scrapy project with 'startproject links' and the spider with 'genspider lins'. The code in the 'lins.py' file is what I posted in my question.

Answers

1

I found the problem: the installed Twisted version is too new. Downgrade it to 16.6.0 and the spider works.
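As a rough way to check whether an environment might be affected, the sketch below compares the installed Twisted version against 16.6.0, the release reported above as working. The helper names `parse_version`, `is_newer_than_working`, and `installed_twisted_version` are hypothetical and are not part of Scrapy or Twisted. It also assumes, without having tested it, that any release later than 16.6.0 can trigger the error.

```python
# Hypothetical helpers for illustration; none of these names come from Scrapy or Twisted.
WORKING_TWISTED = "16.6.0"  # the version the answer reports as working


def parse_version(text):
    """Turn '16.6.0' into (16, 6, 0); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_newer_than_working(installed, working=WORKING_TWISTED):
    """Return True when `installed` is a later release than `working`."""
    return parse_version(installed) > parse_version(working)


def installed_twisted_version():
    """Return the installed Twisted version string, or None if Twisted is absent."""
    try:
        import twisted
    except ImportError:
        return None
    return twisted.__version__


if __name__ == "__main__":
    version = installed_twisted_version()
    if version is None:
        print("Twisted is not installed")
    elif is_newer_than_working(version):
        print("Twisted %s is newer than %s" % (version, WORKING_TWISTED))
    else:
        print("Twisted %s is not newer than %s" % (version, WORKING_TWISTED))
```

If the check reports a newer release, pinning the package with `pip install Twisted==16.6.0` is one way to apply the downgrade, assuming Twisted was installed through pip.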