Python - elasticsearch.exceptions.RequestError
I want to extract data from Elasticsearch, and my function looks like this:
## Use a regex to get the image name.
# It is inefficient to fetch the names one by one using
# doc['hits']['hits'][n]['_source']['docker_image_short_name'],
# because thousands of documents are stored per image.
regex = "docker_image_short_name': u'(.+?)'"
pattern = re.compile(regex)
query = {
    "query": {
        "bool": {"must": [{"range": {"@timestamp": {"gt": vulTime}}}]}
    }
}
page = es.search(index='crawledframe-*', body=query, scroll='1m', size=1000)
sid = page['_scroll_id']
num_page = page['hits']['total']
imglist = []
while num_page > 0:
    print num_page
    print vulTime
    imgs = re.findall(pattern, str(page))
    imglist += imgs
    page = es.scroll(scroll_id=sid, scroll='1m')
    num_page = len(page['hits']['hits'])
imglist = list(set(imglist))  # remove duplicates
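For comparison, here is a minimal sketch of the same loop that reads the field from each hit instead of running a regex over str(page). It is only an illustration under assumptions: `collect_image_names` and its `field` parameter are hypothetical names, it assumes a client whose `search` and `scroll` accept the same keyword arguments used above, and it is not claimed to remove the error described in this question. It asks for only the one field through `_source`, and it reuses the scroll id returned by the latest page rather than the first one.

```python
def collect_image_names(es, index, since, field="docker_image_short_name"):
    """Return the distinct values of `field` for documents newer than `since`.

    `es` is assumed to be an Elasticsearch client (or any object exposing
    compatible `search` and `scroll` methods).
    """
    query = {
        "_source": [field],  # ask for this field only, not the whole document
        "query": {
            "bool": {"must": [{"range": {"@timestamp": {"gt": since}}}]}
        },
    }
    page = es.search(index=index, body=query, scroll="1m", size=1000)
    names = set()  # a set removes duplicates as we go
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            name = hit.get("_source", {}).get(field)
            if name is not None:
                names.add(name)
        # The scroll id can change between pages, so always use the latest one.
        page = es.scroll(scroll_id=page["_scroll_id"], scroll="1m")
    return sorted(names)
```

With a real client this would be called as `collect_image_names(es, 'crawledframe-*', vulTime)`. The loop stops on the first empty page, so it does not depend on the format of `page['hits']['total']`.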
I only want to extract "docker_image_short_name".
However, I get the following error (printed output):
num_page: 2327261
vulTime : 0001-01-01
Traceback (most recent call last):
  File "test.py", line 68, in <module>
    worker_main()
  File "test.py", line 63, in worker_main
    imgnames = recent_crawl_index(es, vulTime)
  File "test.py", line 45, in recent_crawl_index
    page = es.scroll(scroll_id = sid, scroll = '1m')
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 73, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py", line 1024, in scroll
    params=params, body=body)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py", line 312, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_urllib3.py", line 128, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/base.py", line 125, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: <exception str() failed>
I don't know why this error occurs, because I use the same logic in other code,
and es.search() did not raise any error...
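Since the traceback ends in `<exception str() failed>`, the server's actual message is hidden. A minimal sketch for surfacing it follows, assuming the installed client exposes `status_code`, `error` and `info` on its transport exceptions (this may differ between client versions, so the attributes are read defensively). `describe_error` and `scroll_with_details` are hypothetical helper names, not part of the library.

```python
def describe_error(exc):
    """Summarise a client exception without calling str() on it."""
    return {
        "type": type(exc).__name__,
        # These attributes are assumed to exist on the client's transport
        # errors; getattr falls back to None when they are missing.
        "status_code": getattr(exc, "status_code", None),
        "error": getattr(exc, "error", None),
        "info": getattr(exc, "info", None),
    }


def scroll_with_details(es, scroll_id, scroll="1m"):
    """Call es.scroll and print the error details before re-raising."""
    try:
        return es.scroll(scroll_id=scroll_id, scroll=scroll)
    except Exception as exc:
        # repr() avoids the string conversion that failed in the traceback.
        print(repr(describe_error(exc)))
        raise
```

Replacing the failing `es.scroll(...)` call with `scroll_with_details(es, sid)` should print the HTTP status and the response body the server sent back, which is usually enough to tell what the request was rejected for.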