Glusterfs

GlusterFS: when one node goes down, the second does not take over

  • June 20, 2020

I have just installed GlusterFS on two nodes. It appears to work: the status is up and replication works correctly.

But when I stop the volume on node1 with `gluster volume stop glustervol1`, or shut the server down to simulate a failure, the second node does not take over.

I followed these instructions to install GlusterFS.

Edited

I am using VMware with 7 VMs running CentOS 7.

App-Master: LAMP stack, remote DB on Mysql-Master, storage on Gluster 1.

gluster1:/glustervol1       /home/wordpress/  glusterfs   defaults,_netdev  0  0

App-Slave: LAMP stack, remote DB on Mysql-Slave, storage on Gluster 2.

gluster2:/glustervol1       /home/wordpress/  glusterfs   defaults,_netdev  0  0

Mysql-Master: MySQL.

Mysql-Slave: MySQL (replication).

Gluster 1: the GlusterFS server on which the volume was created.

Gluster 2: the GlusterFS replica of Gluster 1.

LoadBalancer: Nginx load balancing between App-Master and App-Slave.

Info

172.16.172.147 gluster1
172.16.172.148 gluster2
172.16.172.146 appslave
172.16.172.143 appmaster

Logs from the App-Master server

After shutting down Gluster1

[2016-11-21 16:36:18.532124] W [socket.c:642:__socket_rwv] 0-glusterfs: readv on 172.16.172.147:24007 failed (Connection timed out)
[2016-11-21 16:36:18.532125] W [socket.c:642:__socket_rwv] 0-glustervol1-client-0: readv on 172.16.172.147:49152 failed (Connection timed out)
[2016-11-21 16:36:18.532323] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-glustervol1-client-0: disconnected from glustervol1-client-0. Client process will keep trying to connect to glusterd until brick's port is available
[2016-11-21 16:36:31.965015] E [socket.c:2332:socket_connect_finish] 0-glusterfs: connection to 172.16.172.147:24007 failed (No route to host)
[2016-11-21 16:36:31.965141] E [socket.c:2332:socket_connect_finish] 0-glustervol1-client-0: connection to 172.16.172.147:24007 failed (No route to host)

After powering Gluster1 back on

[2016-11-21 16:39:02.258175] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2016-11-21 16:39:02.258595] I [rpc-clnt.c:1851:rpc_clnt_reconfig] 0-glustervol1-client-0: changing port to 49152 (from 0)
[2016-11-21 16:39:02.262348] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-glustervol1-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2016-11-21 16:39:02.299637] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-glustervol1-client-0: Connected to glustervol1-client-0, attached to remote volume '/bricks/brick1/brick'.
[2016-11-21 16:39:02.299714] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-glustervol1-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2016-11-21 16:39:02.300513] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-glustervol1-client-0: Server lk version = 1

After shutting down Gluster2

[2016-11-21 16:41:33.394122] C [rpc-clnt-ping.c:161:rpc_clnt_ping_timer_expired] 0-glustervol1-client-1: server 172.16.172.148:49152 has not responded in the last 42 seconds, disconnecting.
[2016-11-21 16:41:33.394943] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7fd4ad63c906] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fd4ad40792e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd4ad407a3e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fd4ad4093fc] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7fd4ad409c08] ))))) 0-glustervol1-client-1: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2016-11-21 16:40:50.706048 (xid=0x59e)
[2016-11-21 16:41:33.394973] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-glustervol1-client-1: remote operation failed. Path: / (00000000-0000-0000-0000-000000000001) [Transport endpoint is not connected]
[2016-11-21 16:41:33.395188] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7fd4ad63c906] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fd4ad40792e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd4ad407a3e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7fd4ad4093fc] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7fd4ad409c08] ))))) 0-glustervol1-client-1: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at 2016-11-21 16:40:50.706053 (xid=0x59f)
[2016-11-21 16:41:33.395201] W [rpc-clnt-ping.c:204:rpc_clnt_ping_cbk] 0-glustervol1-client-1: socket disconnected
[2016-11-21 16:41:33.395211] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-glustervol1-client-1: disconnected from glustervol1-client-1. Client process will keep trying to connect to glusterd until brick's port is available

After powering Gluster2 back on

[2016-11-21 16:41:45.255081] E [socket.c:2332:socket_connect_finish] 0-glustervol1-client-1: connection to 172.16.172.148:24007 failed (No route to host)

Note: even when Gluster1 or Gluster2 is down, failover still works and the application keeps working.

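That is not really how GlusterFS works, because it does not "fail over". No single node acts as a master, since GlusterFS is masterless. The clients are responsible for connecting to all Gluster peers, and the clients must keep network visibility to the servers to maintain sane volume activity. Inter-node communication is typically used only for volume heal operations and for negotiating peer trust.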
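Because each client talks to every peer directly, a quick check from the client side can show whether both servers are actually visible. The following is a minimal sketch, not part of the original setup: the function names `check_endpoint` and `check_peers` are hypothetical, and the ports 24007 (management) and 49152 (brick) are assumptions taken from the client logs in the question; the brick port may differ on other volumes.

```shell
#!/usr/bin/env bash
# Sketch: run on a client (e.g. appmaster) to test reachability of each Gluster peer.
# Ports are assumptions read from the client logs: 24007 (glusterd), 49152 (brick).

# Print whether a TCP connection to host:port succeeds within 3 seconds.
check_endpoint() {
    local host=$1 port=$2
    if timeout 3 bash -c ": </dev/tcp/$host/$port" 2>/dev/null; then
        echo "$host:$port reachable"
    else
        echo "$host:$port unreachable"
        return 1
    fi
}

# Check both ports on every host given; return non-zero if any endpoint fails.
check_peers() {
    local host port failed=0
    for host in "$@"; do
        for port in 24007 49152; do
            check_endpoint "$host" "$port" || failed=1
        done
    done
    return "$failed"
}

# Only run the checks when host names are passed on the command line.
if [ "$#" -gt 0 ]; then
    check_peers "$@"
fi
```

Saved under a name of your choice (for example `check-peers.sh`), it could be run on each application server as `./check-peers.sh gluster1 gluster2`. An "unreachable" line for a server that is powered on would point at a firewall or routing problem between client and server rather than at Gluster itself.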

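If you followed that guide, I assume you have created a replica 2 Gluster volume from two bricks. What does your configuration look like, what does the network connecting these Gluster nodes to each other look like, and what connects them to the clients? Most problems with Gluster come down to client-to-server communication issues.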
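One way to gather the configuration details asked for here is to run the standard `gluster` CLI queries on one of the servers. This is only a sketch: the wrapper function `collect_gluster_info` is a hypothetical name, and the volume name `glustervol1` is taken from the question.

```shell
#!/usr/bin/env bash
# Sketch: run on one of the Gluster servers to collect peer and volume details.

collect_gluster_info() {
    local vol=$1
    if ! command -v gluster >/dev/null 2>&1; then
        echo "gluster CLI not found; run this on a Gluster server" >&2
        return 127
    fi
    gluster peer status              # are the peers connected to each other?
    gluster volume info "$vol"       # volume type (e.g. Replicate) and brick list
    gluster volume status "$vol"     # which brick processes are online, and their ports
    gluster volume heal "$vol" info  # entries still pending self-heal after an outage
}

# Only query when a volume name is passed on the command line.
if [ "$#" -gt 0 ]; then
    collect_gluster_info "$1"
fi
```

Run as, for example, `./gluster-info.sh glustervol1`. The output of `gluster volume info` shows whether the volume really is a two-brick replica, and `gluster volume status` shows the brick ports that clients need to reach.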

Source: https://serverfault.com/questions/815562