redis.exceptions.ConnectionError after approximately one day of celery running
This is my full traceback:
Traceback (most recent call last):
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/app/trace.py", line 283, in trace_task
    uuid, retval, SUCCESS, request=task_request,
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/base.py", line 256, in store_result
    request=request, **kwargs)
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/base.py", line 490, in _store_result
    self.set(self.get_key_for_task(task_id), self.encode(meta))
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/redis.py", line 160, in set
    return self.ensure(self._set, (key, value), **retry_policy)
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/redis.py", line 149, in ensure
    **retry_policy
  File "/home/server/backend/venv/lib/python3.4/site-packages/kombu/utils/__init__.py", line 243, in retry_over_time
    return fun(*args, **kwargs)
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/redis.py", line 169, in _set
    pipe.execute()
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/client.py", line 2593, in execute
    return execute(conn, stack, raise_on_error)
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/client.py", line 2447, in _execute_transaction
    connection.send_packed_command(all_cmds)
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/connection.py", line 532, in send_packed_command
    self.connect()
  File "/home/pserver/backend/venv/lib/python3.4/site-packages/redis/connection.py", line 436, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 0 connecting to localhost:6379. Error.
[2016-09-21 10:47:18,814: WARNING/Worker-747] Data collector is not contactable. This can be because of a network issue or because of the data collector being restarted. In the event that contact cannot be made after a period of time then please report this problem to New Relic support for further investigation. The error raised was ConnectionError(ProtocolError('Connection aborted.', BlockingIOError(11, 'Resource temporarily unavailable')),).
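I searched thoroughly for this ConnectionError but found no reported issue that matches mine.
My platform is Ubuntu 14.04. This is part of my redis configuration. (I can share the whole redis.conf file if you need it. By the way, all parameters in the LIMITS section are commented out.)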
# By default Redis listens for connections from all the network interfaces
# available on the server. It is possible to listen to just one or multiple
# interfaces using the "bind" configuration directive, followed by one or
# more IP addresses.
#
# Examples:
#
# bind 192.168.1.100 10.0.0.1
bind 127.0.0.1
# Specify the path for the unix socket that will be used to listen for
# incoming connections. There is no default, so Redis will not listen
# on a unix socket when not specified.
#
# unixsocket /var/run/redis/redis.sock
# unixsocketperm 755
# Close the connection after a client is idle for N seconds (0 to disable)
timeout 0
# TCP keepalive.
#
# If non-zero, use SO_KEEPALIVE to send TCP ACKs to clients in absence
# of communication. This is useful for two reasons:
#
# 1) Detect dead peers.
# 2) Take the connection alive from the point of view of network
# equipment in the middle.
#
# On Linux, the specified value (in seconds) is the period used to send ACKs.
# Note that to close the connection the double of the time is needed.
# On other kernels the period depends on the kernel configuration.
#
# A reasonable value for this option is 60 seconds.
tcp-keepalive 60
This is my small Redis wrapper:
import redis
from django.conf import settings

REDIS_POOL = redis.ConnectionPool(host=settings.REDIS_HOST, port=settings.REDIS_PORT)

def get_redis_server():
    return redis.Redis(connection_pool=REDIS_POOL)
This is how I use it:
from celery import shared_task
from redis_wrapper import get_redis_server

# The view and the task run in different, independent processes

def sample_view(request):
    rs = get_redis_server()
    # some get-set stuff with redis

@shared_task
def sample_celery_task():
    rs = get_redis_server()
    # some get-set stuff with redis
Package versions:
celery==3.1.18
django-celery==3.1.16
kombu==3.0.26
redis==2.10.3
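So the problem is this: the connection error appears some time after the celery workers are started. After the first occurrence, every task ends with the same error until I restart all of my celery workers. (Interestingly, celery flower also fails during that period.)
I suspect my redis connection pool usage, the redis configuration, or, less likely, a network issue. Any ideas about the cause? What am I doing wrong?
(PS: I will add the output of redis-cli info when I see this error today.)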
UPDATE:
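I temporarily solved this problem by adding the --maxtasksperchild parameter to my worker start command. I set it to 200. Of course this is not the proper way to solve the problem; it only treats the symptom. It basically refreshes the worker instances periodically (it closes an old process and creates a new one once the old one has run 200 tasks), which also refreshes my global Redis pool and connections. So I think I should focus on how the global redis connection pool is used, and I am still waiting for new ideas and comments.
Sorry for my bad English, and thanks in advance.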
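One way to test the suspicion about the global pool is to make sure every worker process builds its own pool instead of inheriting one created at import time. The following is a minimal sketch of that idea, not a confirmed fix for the error above: `PerProcessResource` is a hypothetical helper name, and the wiring shown in the trailing comment assumes the `redis_wrapper` module from the question. It is also not thread-safe; two threads racing on the first call could each build a resource.

```python
import os


class PerProcessResource:
    """Build a resource lazily, once per OS process.

    A child created by fork() has a different PID, so it builds its own
    instance instead of reusing sockets opened by the parent.
    """

    def __init__(self, factory):
        self._factory = factory  # zero-argument callable, e.g. a pool constructor
        self._pid = None         # PID that owns the current resource
        self._resource = None

    def get(self):
        pid = os.getpid()
        if self._pid != pid:
            # First use in this process (or first use after a fork).
            self._resource = self._factory()
            self._pid = pid
        return self._resource


# Hypothetical wiring into the wrapper from the question
# (needs redis-py and Django settings, so it is left as a comment):
#
#   _POOL = PerProcessResource(
#       lambda: redis.ConnectionPool(host=settings.REDIS_HOST,
#                                    port=settings.REDIS_PORT))
#
#   def get_redis_server():
#       return redis.Redis(connection_pool=_POOL.get())
```

If the error still appears with a pool that is created per process, shared sockets across forked workers can probably be ruled out and attention can move to the redis configuration or to resource limits on the machine.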
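Issuing the
config set stop-writes-on-bgsave-error no
command stops the background save process. I just checked the dump file, and it is 589MB. But that is the file redis uses for persistence, right? I mean, if I disable it, I will lose my queue after a machine restart, won't I? I also check the disk usage several times a day. Finally, would restarting the workers reduce the size of the dump file? What I mean is that I don't think the symptoms match.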
The provided command is not meant to stop background saving; it keeps redis from stopping when an error occurs in the background save.
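I see. But we have to take into account that during the error period I could still access the redis data through redis-cli. So Redis was not completely stopped; only my celery clients were blocked.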
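Since redis-cli keeps working while the workers fail, it may help to probe from inside a failing worker process rather than from a shell. The sketch below uses only the standard library and assumes a Linux host (as in the question) for the file descriptor count. The function names are hypothetical, and the idea that the worker is running out of descriptors or sockets is only a guess prompted by the "Resource temporarily unavailable" text in the New Relic warning.

```python
import os
import socket


def can_connect(host="localhost", port=6379, timeout=2.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts, and descriptor exhaustion.
        return False
    sock.close()
    return True


def open_fd_count(fd_dir="/proc/self/fd"):
    """Count this process's open file descriptors (Linux only).

    Returns None where the /proc directory is unavailable.
    """
    try:
        return len(os.listdir(fd_dir))
    except OSError:
        return None
```

Logging `can_connect()` and `open_fd_count()` from a task at regular intervals would show whether the descriptor count grows steadily before the first ConnectionError, and whether a raw TCP connection from the worker fails at the same moment redis-cli succeeds.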