Can't connect Scrapy to my database

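I have to run a crawler and store the data in a database. I have already collected my data, but I am having trouble inserting it into the database.

My files are:

topcrawlerspider.py (my crawler, which works):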

from scrapy import Spider, Item, Field, Request
from ..items import TopcrawlerItem
from ..pipelines import TopcrawlerPipeline
import time


class TopSpider(Spider):

    name = 'topcrawler'
    start_urls = ['...']

    def __init__(self, page=0, *args, **kwargs):
        super(TopSpider, self).__init__(*args, **kwargs)
        self.search_result_url_tpl = 'http://.../%s'
        ...

settings.py:

BOT_NAME = 'topcrawler' 

SPIDER_MODULES = ['topcrawler.spiders'] 
NEWSPIDER_MODULE = 'topcrawler.spiders' 


# Crawl responsibly by identifying yourself (and your website) on the user-agent
#USER_AGENT = 'topcrawler (+http://www.yourdomain.com)' 

# Obey robots.txt rules 
ROBOTSTXT_OBEY = True 

ITEM_PIPELINES = {
    'topcrawler.pipelines.TopcrawlerPipeline': 300,
    # 'topcrawler.pipelines.JsonWriterPipeline': 800,
}

MONGODB_URI = 'mongodb://root:[email protected]:8889/mtdbdd' 
MONGO_DATABASE = 'mtdbdd'

pipelines.py:

import pymongo
from settings import *


class TopcrawlerPipeline(object):

    collection_name = 'land'

    def __init__(self, mongo_uri, mongo_db):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db

    @classmethod
    def from_crawler(cls, crawler):
        return cls(
            mongo_uri=crawler.settings.get('MONGO_URI'),
            mongo_db=crawler.settings.get('MONGO_DATABASE', 'items')
        )

    def open_spider(self, spider):
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.db = self.client[self.mongo_db]

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        self.db[self.collection_name].insert(dict(item))
        return item

I get this error:

ServerSelectionTimeoutError: localhost:27017: [Errno 8] nodename nor servname provided, or not known

It seems that it is not connecting to port 8889 as I want, but I don't understand why...

Thanks for your help!

Source

2017-09-14 bastien le quéré

In the open_spider method of your TopcrawlerPipeline class (in pipelines.py), you create the client twice:

self.client = pymongo.MongoClient(connect=False)
self.client = pymongo.MongoClient('mongodb://root:[email protected]:8889/mtdbdd')

I bet the error comes from the first one (which I assume was unintentional). Remove the first call and keep only the second.

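As a side note, here is where the error might come from. If you do not pass a connection string to MongoClient, it tries to connect to localhost on the default port, 27017. Check your /etc/hosts file to see how localhost is defined (I assume you are on Linux). On some systems, only an IPv6 address is assigned to localhost, and by default MongoDB does not listen on IPv6 addresses.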
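To check how localhost resolves on a given machine, a small standard-library script along these lines may help. It is only a sketch: the function name resolve_host is made up for illustration, and 27017 is simply the MongoDB default port mentioned above. An empty result means the name did not resolve at all, which is typically what "nodename nor servname provided, or not known" indicates.

```python
import socket


def resolve_host(host, port):
    """Return the sorted (family, address) pairs that `host` resolves to.

    An empty list means name resolution failed.
    """
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    families = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address for both IPv4 and IPv6.
    return sorted({(families.get(info[0], str(info[0])), info[4][0]) for info in infos})


if __name__ == "__main__":
    for family, address in resolve_host("localhost", 27017):
        print(family, address)
```

If only an IPv6 address is listed, or nothing is listed, the /etc/hosts entry for localhost is worth a closer look.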

Source

2017-09-14 12:35:51

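Hi! Thanks for your answer. I edited my pipelines.py file (as shown in my edited post), but I still get the same error :/

In your 'pipelines.py' you use 'MONGO_URI', but in 'settings.py' you define 'MONGODB_URI'. This may be the source of the error. Please check it.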
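The effect of that name mismatch can be reproduced without Scrapy or MongoDB. The sketch below uses a plain dict as a stand-in for crawler.settings, and both the helper read_mongo_uri and the placeholder URI are hypothetical. It shows that looking up 'MONGO_URI' yields None when only 'MONGODB_URI' is defined. A None URI is what makes MongoClient fall back to localhost:27017.

```python
# Stand-in for the values Scrapy would load from settings.py.
# The credentials and host in this URI are placeholders, not the real ones.
settings = {
    "MONGODB_URI": "mongodb://user:secret@127.0.0.1:8889/mtdbdd",
    "MONGO_DATABASE": "mtdbdd",
}


def read_mongo_uri(settings, key="MONGO_URI"):
    """Return the URI stored under `key`, failing loudly if it is missing."""
    uri = settings.get(key)
    if not uri:
        raise KeyError(
            f"{key} is not defined in settings; found keys: {sorted(settings)}"
        )
    return uri


if __name__ == "__main__":
    # The name the pipeline asks for: not present in the settings above.
    print(settings.get("MONGO_URI"))
    # The name settings.py actually defines.
    print(read_mongo_uri(settings, key="MONGODB_URI"))
```

Either rename the setting to MONGO_URI in settings.py or read 'MONGODB_URI' in from_crawler; the two names only need to agree.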
