CrawlSpider does not go to the next page

I am scraping the details of all products on http://www.ulta.com/makeup-eyes-eyebrows?N=26yi. My rule is copied below. I only get data from the first page; the spider never goes on to the next page.

rules = (
    Rule(
        LinkExtractor(
            restrict_xpaths='//*[@id="canada"]/div[4]/div[2]/div[3]/div[3]/div[2]/ul/li[3]/a',
        ),
        callback='parse',
        follow=True,
    ),
)

Can anyone help me?

Source

2017-07-03 Zhuoyang Li

Use CrawlSpider, as described in this question: https://stackoverflow.com/questions/32624033/scrapy-crawl-with-next-page

I believe my code follows the CrawlSpider in the link above exactly, but it does not work.

Use CrawlSpider, which follows links to the other pages automatically. With a plain Spider you have to pass the links along yourself.

class Scrapy1Spider(CrawlSpider):

instead of

class Scrapy1Spider(scrapy.Spider):

See: Scrapy crawl with next page

Source

2017-07-03 12:01:20

I am using CrawlSpider, not Spider, and restrict_xpaths is the XPath of the Next button. But it still only scrapes the first page.

Check whether the other links fall within the allowed_domains variable. Also, why not add an allow pattern to the LinkExtractor?

Problem solved. There was an error with a product while scraping the first page.
