Scrapy: Referer is None


Hello, I am trying to scrape the Canadian Yellow Pages directory with Scrapy. This is my spider code:

import scrapy


class YellSpider(scrapy.Spider):
    name = 'yellspider'

    start_urls = ['http://www.yellowpages.ca/search/si/40/dentist/Toronto+ON']

    def start_requests(self):
        # Build the URLs for result pages 1 to 50.
        # Because start_requests is defined, start_urls above is not used.
        # range replaces xrange, which exists only in Python 2.
        urls = ['http://www.yellowpages.ca/search/si/{0}/dentist/Toronto+ON'.format(x) for x in range(1, 51)]
        for u in urls:
            yield scrapy.Request(url=u, callback=self.parse, dont_filter=True)

    def parse(self, response):
        # Yield one item per listing block on the results page.
        for job in response.css(".listing.listing--bottomcta.placement"):
            yield {
                'name': job.css(".listing__name--link::text").extract_first(),
                'street': job.css(".jsMapBubbleAddress:nth-child(1)::text").extract_first(),
                'locality': job.css(".jsMapBubbleAddress:nth-child(2)::text").extract_first(),
                'postalCode': job.css(".jsMapBubbleAddress:nth-child(4)::text").extract_first(),
                # .re() returns a list of every match of the query-string pattern.
                'website': job.css(".mlr__item.mlr__item--website a::attr(href)").re(r'\?(.*)'),
                'phone': job.css(".mlr__submenu__item h4::text").extract_first(default='no phone number')
            }

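As far as I know, the search results only go up to 50 pages, so I built the list of URLs with a list comprehension. Then I used CSS selectors to find the content I want to scrape.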

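Now to the problem: everything works fine until I reach pages 28 to 50. This is what the output looks like for those pages:

click here to see the output image

I changed the USER_AGENT, added DOWNLOAD_DELAY = 3, and also tried adding a Referer header, but none of that worked.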
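
For anyone trying to reproduce this, here is a minimal sketch of what those three changes could look like, written as plain Python so it runs without Scrapy installed. Only the URL pattern and the setting names USER_AGENT and DOWNLOAD_DELAY come from the post; the names BASE_URL, CUSTOM_SETTINGS and build_request_kwargs, the placeholder user agent string, and the choice of the previous results page as the Referer are assumptions made for illustration. It is not a confirmed fix for the problem.

```python
# Hypothetical names throughout; only the URL pattern and the two
# setting names (USER_AGENT, DOWNLOAD_DELAY) are taken from the post.
BASE_URL = 'http://www.yellowpages.ca/search/si/{0}/dentist/Toronto+ON'

# Could be assigned to the spider's custom_settings class attribute.
CUSTOM_SETTINGS = {
    'USER_AGENT': 'Mozilla/5.0 (placeholder user agent string)',
    'DOWNLOAD_DELAY': 3,
}


def build_request_kwargs(first_page=1, last_page=50):
    """Return one dict of scrapy.Request keyword arguments per page."""
    kwargs_list = []
    for page in range(first_page, last_page + 1):
        headers = {}
        if page > first_page:
            # Assumption: send the previous results page as the Referer.
            headers['Referer'] = BASE_URL.format(page - 1)
        kwargs_list.append({
            'url': BASE_URL.format(page),
            'headers': headers,
            'dont_filter': True,
        })
    return kwargs_list
```

Inside start_requests, each dict could then be passed along as `yield scrapy.Request(callback=self.parse, **kwargs)`, since url, headers and dont_filter are all keyword arguments that scrapy.Request accepts.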

One more thing worth mentioning: in the Scrapy shell, the selectors work fine on pages 28 to 50.

Source

2017-02-28 S.rommaissa