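Scrapy: image pipeline, downloading images
Following Scrapy's tutorial, I made a simple image crawler (it scrapes images of Bugattis). It is shown below under Example.
However, following the guide left me with a crawler that does not work! It finds all of the URLs, but it does not download the images.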
I found a duct-tape solution: replace ITEM_PIPELINES and IMAGES_STORE as follows:
ITEM_PIPELINES['scrapy.pipelines.images.FilesPipeline'] = 1
and
IMAGES_STORE -> FILES_STORE
But I don't know why this works. I would like to use the ImagesPipeline as documented by Scrapy.
Example
settings.py
BOT_NAME = 'imagespider'
SPIDER_MODULES = ['imagespider.spiders']
NEWSPIDER_MODULE = 'imagespider.spiders'
ITEM_PIPELINES = {
    'scrapy.pipelines.images.ImagesPipeline': 1,
}
IMAGES_STORE = "/home/user/Desktop/imagespider/output"
items.py
import scrapy

class ImageItem(scrapy.Item):
    file_urls = scrapy.Field()
    files = scrapy.Field()
imagespider.py
from imagespider.items import ImageItem
import scrapy

class ImageSpider(scrapy.Spider):
    name = "imagespider"
    start_urls = (
        "https://www.find.com/search=bugatti+veyron",
    )

    def parse(self, response):
        for elem in response.xpath("//img"):
            img_url = elem.xpath("@src").extract_first()
            yield ImageItem(file_urls=[img_url])
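A possible explanation, offered as a sketch rather than a confirmed diagnosis: by default ImagesPipeline reads URLs from an item field named image_urls and writes its results to images, while FilesPipeline uses file_urls and files, which are the fields ImageItem defines. If that is the cause here, one option is to leave the item unchanged and point ImagesPipeline at the existing fields through the IMAGES_URLS_FIELD and IMAGES_RESULT_FIELD settings. Note that ImagesPipeline also needs Pillow installed.

```python
# settings.py (sketch): keep ImagesPipeline, but tell it to use the
# field names that ImageItem already defines.
ITEM_PIPELINES = {
    "scrapy.pipelines.images.ImagesPipeline": 1,
}
IMAGES_STORE = "/home/user/Desktop/imagespider/output"

# The defaults are "image_urls" and "images"; override them to match the item.
IMAGES_URLS_FIELD = "file_urls"
IMAGES_RESULT_FIELD = "files"
```

With these two overrides the spider and items.py would stay as they are.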
Thanks! You could also change the field in 'ImageItem' to 'image_urls' and use 'yield ImageItem(image_urls=[img_url])'.