Can't mount gluster fs on glusterfs client: Transport endpoint is not connected
Update: upgraded to the latest version 5.2 and refreshed the logs below; the problem stays the same, however.
Update 2: also updated the client to 5.2 - still the same problem.
I have a gluster cluster set up with 3 nodes:
- server1, 192.168.100.1
- server2, 192.168.100.2
- server3, 192.168.100.3
They are connected through the internal network 192.168.100.0/24. However, I want to connect a client from outside that network using the public IP of one of the servers, and that does not work:
sudo mount -t glusterfs x.x.x.x:/datavol /mnt/gluster/
It gives something like this in the logs:
[2018-12-15 17:57:29.666819] I [fuse-bridge.c:4153:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.26
[2018-12-15 18:23:47.892343] I [fuse-bridge.c:4259:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.26
[2018-12-15 18:23:47.892375] I [fuse-bridge.c:4870:fuse_graph_sync] 0-fuse: switched to graph 0
[2018-12-15 18:23:47.892475] I [MSGID: 108006] [afr-common.c:5650:afr_local_init] 0-datavol-replicate-0: no subvolumes up
[2018-12-15 18:23:47.892533] E [fuse-bridge.c:4328:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected)
[2018-12-15 18:23:47.892651] W [fuse-resolve.c:127:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected)
[2018-12-15 18:23:47.892668] W [fuse-bridge.c:3250:fuse_statfs_resume] 0-glusterfs-fuse: 2: STATFS (00000000-0000-0000-0000-000000000001) resolution fail
[2018-12-15 18:23:47.892773] W [fuse-bridge.c:889:fuse_attr_cbk] 0-glusterfs-fuse: 3: LOOKUP() / => -1 (Transport endpoint is not connected)
[2018-12-15 18:23:47.894204] W [fuse-bridge.c:889:fuse_attr_cbk] 0-glusterfs-fuse: 4: LOOKUP() / => -1 (Transport endpoint is not connected)
[2018-12-15 18:23:47.894367] W [fuse-bridge.c:889:fuse_attr_cbk] 0-glusterfs-fuse: 5: LOOKUP() / => -1 (Transport endpoint is not connected)
[2018-12-15 18:23:47.916333] I [fuse-bridge.c:5134:fuse_thread_proc] 0-fuse: initating unmount of /mnt/gluster
The message "I [MSGID: 108006] [afr-common.c:5650:afr_local_init] 0-datavol-replicate-0: no subvolumes up" repeated 4 times between [2018-12-15 18:23:47.892475] and [2018-12-15 18:23:47.894347]
[2018-12-15 18:23:47.916555] W [glusterfsd.c:1481:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7494) [0x7f90f2306494] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xfd) [0x5591a51e87ed] -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x5591a51e8644] ) 0-: received signum (15), shutting down
[2018-12-15 18:23:47.916573] I [fuse-bridge.c:5897:fini] 0-fuse: Unmounting '/mnt/gluster'.
[2018-12-15 18:23:47.916582] I [fuse-bridge.c:5902:fini] 0-fuse: Closing fuse connection to '/mnt/gluster'.
What I can see is
0-datavol-replicate-0: no subvolumes up
and
0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected)
The firewall ports (24007-24008, 49152-49156) are open on the public network interface.
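For reference, the rules look roughly like this (a sketch assuming plain iptables; the exact firewall tooling and the interface name eth0 are not from the original setup):

# allow glusterd (24007-24008) and the brick ports (49152-49156) on the public interface
iptables -A INPUT -i eth0 -p tcp --dport 24007:24008 -j ACCEPT
iptables -A INPUT -i eth0 -p tcp --dport 49152:49156 -j ACCEPT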
gluster volume heal datavol info:
Brick 192.168.100.1:/data/gluster/brick1
Status: Connected
Number of entries: 0

Brick 192.168.100.2:/data/gluster/brick1
Status: Connected
Number of entries: 0

Brick 192.168.100.3:/data/gluster/brick1
Status: Connected
Number of entries: 0
Cluster info (the client volume graph):
1: volume datavol-client-0
2:     type protocol/client
3:     option ping-timeout 42
4:     option remote-host 192.168.100.1
5:     option remote-subvolume /data/gluster/brick1
6:     option transport-type socket
7:     option transport.address-family inet
8:     option send-gids true
9: end-volume
10:
11: volume datavol-client-1
12:     type protocol/client
13:     option ping-timeout 42
14:     option remote-host 192.168.100.2
15:     option remote-subvolume /data/gluster/brick1
16:     option transport-type socket
17:     option transport.address-family inet
18:     option send-gids true
19: end-volume
20:
21: volume datavol-client-2
22:     type protocol/client
23:     option ping-timeout 42
24:     option remote-host 192.168.100.3
25:     option remote-subvolume /data/gluster/brick1
26:     option transport-type socket
27:     option transport.address-family inet
28:     option send-gids true
29: end-volume
30:
31: volume datavol-replicate-0
32:     type cluster/replicate
33:     subvolumes datavol-client-0 datavol-client-1 datavol-client-2
34: end-volume
35:
36: volume datavol-dht
37:     type cluster/distribute
38:     option lock-migration off
39:     subvolumes datavol-replicate-0
40: end-volume
41:
42: volume datavol-write-behind
43:     type performance/write-behind
44:     subvolumes datavol-dht
45: end-volume
46:
47: volume datavol-read-ahead
48:     type performance/read-ahead
49:     subvolumes datavol-write-behind
50: end-volume
51:
52: volume datavol-readdir-ahead
53:     type performance/readdir-ahead
54:     subvolumes datavol-read-ahead
55: end-volume
56:
57: volume datavol-io-cache
58:     type performance/io-cache
59:     subvolumes datavol-readdir-ahead
60: end-volume
61:
62: volume datavol-quick-read
63:     type performance/quick-read
64:     subvolumes datavol-io-cache
65: end-volume
66:
67: volume datavol-open-behind
68:     type performance/open-behind
69:     subvolumes datavol-quick-read
70: end-volume
71:
72: volume datavol-md-cache
73:     type performance/md-cache
74:     subvolumes datavol-open-behind
75: end-volume
76:
77: volume datavol
78:     type debug/io-stats
79:     option log-level INFO
80:     option latency-measurement off
81:     option count-fop-hits off
82:     subvolumes datavol-md-cache
83: end-volume
84:
85: volume meta-autoload
86:     type meta
87:     subvolumes datavol
88: end-volume
gluster peer status:
root@server1 /data # gluster peer status
Number of Peers: 2

Hostname: 192.168.100.2
Uuid: 0cb2383e-906d-4ca6-97ed-291b04b4fd10
State: Peer in Cluster (Connected)

Hostname: 192.168.100.3
Uuid: d2d9e82f-2fb6-4f27-8fd0-08aaa8409fa9
State: Peer in Cluster (Connected)
gluster volume status:
Status of volume: datavol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.100.1:/data/gluster/brick1    49152     0          Y       13519
Brick 192.168.100.2:/data/gluster/brick1    49152     0          Y       30943
Brick 192.168.100.3:/data/gluster/brick1    49152     0          Y       24616
Self-heal Daemon on localhost               N/A       N/A        Y       3282
Self-heal Daemon on 192.168.100.2           N/A       N/A        Y       18987
Self-heal Daemon on 192.168.100.3           N/A       N/A        Y       24638

Task Status of Volume datavol
What am I missing?
I have the same problem, too.
Have you seen https://bugzilla.redhat.com/show_bug.cgi?id=1659824 ?
Using "IPs" in GlusterFS seems to be not "good", because the client relies on the remote-host addresses from the volume info it gets from the server. If the client cannot reach enough of the Gluster nodes, the volume info of the other nodes cannot be used. See https://unix.stackexchange.com/questions/213705/glusterfs-how-to-failover-smartly-if-a-mounted-server-is-failed
So - the problem is: the mount reaches node1 and reads the volume info (see /var/log/glusterfs/<volume>.log). That info lists the other nodes (option remote-host). The client then tries to connect to those nodes on their private IPs - and fails (in my case). I assume your public client cannot reach the private IPs either - that is what is behind "Transport endpoint is not connected".
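A quick way to check that assumption from the external client (my own sketch, not part of the original answer, assuming netcat is installed) is to probe the ports on the addresses the volfile lists as remote-host:

# 24007 = glusterd, 49152 = the brick port from "gluster volume status";
# these are the private addresses from the volfile, so from outside the LAN both should fail
nc -zv -w 3 192.168.100.1 24007
nc -zv -w 3 192.168.100.1 49152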
Solution A - using hostnames instead of IPs inside the Gluster cluster would work, because you can then create aliases for all servers in /etc/hosts (see the sketch below). But that means the Gluster setup has to be rebuilt to use DNS names (i.e. names that resolve to the 192.168.100.x IPs inside the Gluster nodes and to the public IPs on the clients). I did not try switching an IP-based Gluster to a DNS-based one (especially in production?).
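A rough sketch of what solution A could look like - the names gluster1-3 are invented for the example, and the PUBLIC_IP_x values are placeholders for the real public addresses:

# /etc/hosts on every Gluster node - names resolve to the private addresses
192.168.100.1  gluster1
192.168.100.2  gluster2
192.168.100.3  gluster3

# /etc/hosts on the external client - the same names resolve to the public addresses
PUBLIC_IP_1  gluster1
PUBLIC_IP_2  gluster2
PUBLIC_IP_3  gluster3

# rebuild the volume with names instead of IPs, so remote-host in the volfile is resolvable everywhere
gluster peer probe gluster2
gluster peer probe gluster3
gluster volume create datavol replica 3 gluster1:/data/gluster/brick1 gluster2:/data/gluster/brick1 gluster3:/data/gluster/brick1

# the client then mounts via a name it can resolve
sudo mount -t glusterfs gluster1:/datavol /mnt/gluster/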
Solution B from the RH bugzilla is not clear to me. I do not understand what glusterfs -f$local-volfile $mountpoint should contain - in particular what the actual mount options are that make the client ignore remote-host, and what they mean for the vol-file. There is a response in the second article on SE; I think that is the answer, but I have not tested it yet.

So - I think this is not a bug but a documentation gap: the information used when building the volume (the brick hostnames) is what the client uses internally to connect to the other nodes, not the address given in the mount options.
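For what it is worth, my (untested) reading of solution B looks roughly like this - the file path /root/datavol.vol is made up, and PUBLIC_IP_x are placeholders for addresses the client can actually reach:

# 1. Save the client graph (the numbered "volume ... end-volume" block that the mount
#    writes to /var/log/glusterfs/<volume>.log, shown above) into /root/datavol.vol,
#    stripping the line numbers.
# 2. Replace the private remote-host addresses with reachable ones:
sed -i 's/remote-host 192.168.100.1/remote-host PUBLIC_IP_1/' /root/datavol.vol
sed -i 's/remote-host 192.168.100.2/remote-host PUBLIC_IP_2/' /root/datavol.vol
sed -i 's/remote-host 192.168.100.3/remote-host PUBLIC_IP_3/' /root/datavol.vol
# 3. Mount with the local volfile, so the server-supplied one (with the private IPs) is never fetched:
glusterfs -f /root/datavol.vol /mnt/gluster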