2016-11-14 149 views
1

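Hi, I am getting an etcd cluster ID mismatch for some reason. I first had it on one node; it disappeared after I cleared the data directory a few times and changed the cluster token and node names, but then it appeared on another node.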

Here is the script I use:

IP0=10.150.0.1 
IP1=10.150.0.2 
IP2=10.150.0.3 
IP3=10.150.0.4 
NODENAME0=node0 
NODENAME1=node1 
NODENAME2=node2 
NODENAME3=node3 

# changing these on each box 
THISIP=$IP2 
THISNODENAME=$NODENAME2 

etcd --name $THISNODENAME --initial-advertise-peer-urls http://$THISIP:2380 \
  --data-dir /root/etcd-data \
  --listen-peer-urls http://$THISIP:2380 \
  --listen-client-urls http://$THISIP:2379,http://127.0.0.1:2379 \
  --advertise-client-urls http://$THISIP:2379 \
  --initial-cluster-token etcd-cluster-2 \
  --initial-cluster $NODENAME0=http://$IP0:2380,$NODENAME1=http://$IP1:2380,$NODENAME2=http://$IP2:2380,$NODENAME3=http://$IP3:2380 \
  --initial-cluster-state new

This is the output I get:

2016-11-11 22:13:12.090515 I | etcdmain: etcd Version: 2.3.7 
2016-11-11 22:13:12.090643 N | etcdmain: the server is already initialized as member before, starting as etcd member... 
2016-11-11 22:13:12.090713 I | etcdmain: listening for peers on http://10.150.0.3:2380 
2016-11-11 22:13:12.090745 I | etcdmain: listening for client requests on http://10.150.0.3:2379 
2016-11-11 22:13:12.090771 I | etcdmain: listening for client requests on http://127.0.0.1:2379 
2016-11-11 22:13:12.090960 I | etcdserver: name = node2 
2016-11-11 22:13:12.090976 I | etcdserver: data dir = /root/etcd-data 
2016-11-11 22:13:12.090983 I | etcdserver: member dir = /root/etcd-data/member 
2016-11-11 22:13:12.090990 I | etcdserver: heartbeat = 100ms 
2016-11-11 22:13:12.090995 I | etcdserver: election = 1000ms 
2016-11-11 22:13:12.091001 I | etcdserver: snapshot count = 10000 
2016-11-11 22:13:12.091011 I | etcdserver: advertise client URLs = http://10.150.0.3:2379 
2016-11-11 22:13:12.091269 I | etcdserver: restarting member 7fbd572038b372f6 in cluster 4e73d7b9b94fe83b at commit index 4 
2016-11-11 22:13:12.091317 I | raft: 7fbd572038b372f6 became follower at term 8 
2016-11-11 22:13:12.091346 I | raft: newRaft 7fbd572038b372f6 [peers: [], term: 8, commit: 4, applied: 0, lastindex: 4, lastterm: 1] 
2016-11-11 22:13:12.091516 I | etcdserver: starting server... [version: 2.3.7, cluster version: to_be_decided] 
2016-11-11 22:13:12.091869 E | etcdmain: failed to notify systemd for readiness: No socket 
2016-11-11 22:13:12.091894 E | etcdmain: forgot to set Type=notify in systemd service file? 
2016-11-11 22:13:12.096380 N | etcdserver: added member 7508b3e625cfed5 [http://10.150.0.4:2380] to cluster 4e73d7b9b94fe83b 
2016-11-11 22:13:12.099800 N | etcdserver: added member 14c76eb5d27acbc5 [http://10.150.0.1:2380] to cluster 4e73d7b9b94fe83b 
2016-11-11 22:13:12.100957 N | etcdserver: added local member 7fbd572038b372f6 [http://10.150.0.2:2380] to cluster 4e73d7b9b94fe83b 
2016-11-11 22:13:12.102711 N | etcdserver: added member d416fca114f17871 [http://10.150.0.3:2380] to cluster 4e73d7b9b94fe83b 
2016-11-11 22:13:12.134330 E | rafthttp: request cluster ID mismatch (got cfd5ef74b3dcf6fe want 4e73d7b9b94fe83b) 

The other members aren't even running. How is this possible?

Thanks

Answers

1

For everyone who lands here from Google:

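The error concerns the peer member ID: a member is trying to join the cluster under the same name as another member (probably an old instance) that already exists in the cluster. The peer name is the same but the ID is different, and that is the problem.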

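You should remove the peer and re-add it, as shown in this helpful post: https://ngineered.co.uk/blog/how-to-replace-a-etcd-node

Use "etcdctl member list" to find the IDs of the current members and identify the one that is trying to join the cluster with the wrong ID. Then remove that peer from the members with "etcdctl member remove" and try to rejoin it. Hope it helps.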
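The lookup described above could be sketched roughly as follows. This assumes the etcd v2 `etcdctl member list` output format of one `<id>: name=... peerURLs=... clientURLs=...` line per member. `member_id_for_peer` is a hypothetical helper name, and the sample text (IDs and peer URLs borrowed from the log in the question, client URLs made up) stands in for real command output.

```shell
# Hypothetical helper: read `etcdctl member list` style text on stdin and
# print the ID of the member whose peerURLs field equals the given URL.
# Assumes a single peer URL per member (exact match on the whole field).
member_id_for_peer() {
    awk -v want="peerURLs=$1" '
        {
            for (i = 2; i <= NF; i++) {
                if ($i == want) {
                    id = $1
                    # Strip the trailing ":" (and an "[unstarted]" marker, if any).
                    sub(/(\[unstarted\])?:$/, "", id)
                    print id
                }
            }
        }'
}

# Illustrative input only; on a real cluster this would be: etcdctl member list
sample='7508b3e625cfed5: name=node3 peerURLs=http://10.150.0.4:2380 clientURLs=http://10.150.0.4:2379
14c76eb5d27acbc5: name=node0 peerURLs=http://10.150.0.1:2380 clientURLs=http://10.150.0.1:2379'

stale_id=$(printf '%s\n' "$sample" | member_id_for_peer "http://10.150.0.4:2380")

# Print the removal command instead of running it.
echo "etcdctl member remove $stale_id"
```

Printing the command rather than executing it leaves room to double-check the ID before removing anything.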

0

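My --data-dir is /var/etcd/data. Deleting and recreating it worked for me. It seems that something from a previous etcd cluster was left in this directory, which can affect the etcd setup.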
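A cautious variant of "delete and recreate" is to move the directory aside so the old state can still be inspected. The sketch below assumes etcd is already stopped; `reset_data_dir` is a hypothetical helper name, and the demonstration runs on a throwaway directory rather than the real /var/etcd/data.

```shell
# Hypothetical helper: move the data dir aside as a timestamped backup,
# recreate it empty, and print the backup path.
reset_data_dir() {
    dir=$1
    backup="${dir}.bak.$(date +%Y%m%d%H%M%S)"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    mv "$dir" "$backup" && mkdir -p "$dir" && echo "$backup"
}

# Demonstration on a temporary directory that mimics <data-dir>/member.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/data/member"
backup_path=$(reset_data_dir "$demo_root/data")
echo "old data kept in: $backup_path"
```

On a real node the call would be `reset_data_dir /var/etcd/data`, run before starting etcd again.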

0

I faced the same problem. Our leader etcd server went down, and after replacing it with a new one we got this error:

rafthttp: request sent was ignored (cluster ID mismatch) 

It was looking for the old cluster ID while generating a random local one; there was a misconfiguration somewhere.

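Follow the steps below to fix the problem.

1. Log in to another working node and remove the unreachable member from the cluster:

   etcdctl cluster-health
   etcdctl member remove member-id

2. Log in to the new server and stop the etcd process if it is running (systemctl stop etcd2). Then delete the data directory (rm -rf /var/etcd2/data). Before deleting it, back this data up to some other folder.

3. Now start etcd with the --initial-cluster-state existing parameter. If you are adding the server to an existing cluster, do not use --initial-cluster-state new.

4. Now go back to one of the running etcd servers and add the new member to the cluster: etcdctl member add node0 http://$IP:2380

I spent a lot of time debugging this problem, and now my cluster is running with all members healthy. Hope this information helps.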
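The steps above can be laid out as a dry-run plan that only prints the commands, in the order this answer gives them. `print_replace_plan` is a hypothetical helper, and the member ID, node name, peer URL and backup path passed to it are illustrative values, not ones taken from this answer.

```shell
# Hypothetical dry-run helper: prints the commands, executes nothing.
# $1 = ID of the unreachable member   $2 = name of the new member
# $3 = peer URL of the new member     $4 = data dir on the new server
print_replace_plan() {
    old_id=$1; name=$2; peer_url=$3; data_dir=$4
    cat <<EOF
# on a healthy member
etcdctl cluster-health
etcdctl member remove $old_id
# on the new server
systemctl stop etcd2
cp -a $data_dir $data_dir.bak
rm -rf $data_dir
etcd --name $name --initial-cluster-state existing  # plus the usual URL flags
# back on a healthy member
etcdctl member add $name $peer_url
EOF
}

plan=$(print_replace_plan 7fbd572038b372f6 node0 http://10.150.0.1:2380 /var/etcd2/data)
printf '%s\n' "$plan"
```

The etcd runtime reconfiguration guide describes registering the member with `etcdctl member add` before starting the new process, so the last two steps may need to be swapped in practice.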

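0

In my case I got the error

rafthttp: request cluster ID mismatch (got 1b3a88599e79f82b want b33939d80a381a57)

because of an incorrect configuration on one node.

Two of my nodes had this in their config:

env ETCD_INITIAL_CLUSTER="etcd-01=http://172.16.50.101:2380,etcd-02=http://172.16.50.102:2380,etcd-03=http://172.16.50.103:2380"

and one node had:

env ETCD_INITIAL_CLUSTER="etcd-01=http://172.16.50.101:2380"

To solve the problem I stopped etcd on all nodes, fixed the incorrect config, deleted the /var/lib/etcd/member folder on all nodes, and restarted etcd on all nodes. Voilà!

P.S. /var/lib/etcd is the folder where etcd stores its data.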
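A mismatch like this can be caught before starting etcd by comparing the ETCD_INITIAL_CLUSTER value collected from each node. The sketch below is only an illustration: `check_initial_cluster` is a hypothetical helper, the values are assumed to have been gathered by hand into tab-separated "node, value" lines, and which node held the short value is an assumption (the answer does not say).

```shell
# Hypothetical helper: read "node<TAB>value" lines on stdin, report every node
# whose value differs from the first node's, and exit non-zero on any mismatch.
check_initial_cluster() {
    awk -F '\t' '
        NR == 1   { ref_node = $1; ref = $2; next }
        $2 != ref { print $1 " differs from " ref_node; bad = 1 }
        END       { exit bad + 0 }'
}

full='etcd-01=http://172.16.50.101:2380,etcd-02=http://172.16.50.102:2380,etcd-03=http://172.16.50.103:2380'
short='etcd-01=http://172.16.50.101:2380'

# Assumption for the demo: etcd-03 is the misconfigured node.
if report=$(printf '%s\t%s\n' etcd-01 "$full" etcd-02 "$full" etcd-03 "$short" | check_initial_cluster); then
    status=0
else
    status=$?
fi
echo "$report (exit status $status)"
```

The check compares the strings literally, so two values listing the same members in a different order would also be reported as different.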