Linux

Keepalived split-brain when firewalld is running

  • March 28, 2022

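I am using keepalived to provide availability between two Alma 8 Nginx servers (hosted on VMware, in case that is relevant). Although I have set up a rich rule for VRRP, as soon as I start firewalld both hosts begin responding for the virtual IP: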

root@dca-nfs01:~# arping 172.31.5.233
60 bytes from 00:50:56:84:ac:d0 (172.31.5.233): index=39 time=1.960 usec
60 bytes from 00:50:56:84:ac:d0 (172.31.5.233): index=40 time=20.660 usec
60 bytes from 00:50:56:84:52:ed (172.31.5.233): index=41 time=24.930 usec
60 bytes from 00:50:56:84:ac:d0 (172.31.5.233): index=42 time=534.616 msec
60 bytes from 00:50:56:84:52:ed (172.31.5.233): index=43 time=534.646 msec
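Two different MAC addresses answering for the same virtual IP is the split-brain symptom here. As a rough sketch (the function name `count_vip_macs` is made up for illustration, and it assumes the arping output format shown above), the check can be scripted by counting the distinct MACs in the replies:

```shell
#!/bin/sh
# Count the distinct MAC addresses found in arping replies read from stdin.
# A result greater than 1 means more than one host is answering for the VIP.
count_vip_macs() {
    # Extract anything shaped like a MAC address, normalise case,
    # de-duplicate, and count what is left.
    grep -Eio '([0-9a-f]{2}:){5}[0-9a-f]{2}' \
        | tr 'A-F' 'a-f' \
        | sort -u \
        | wc -l \
        | tr -d ' '
}
```

It would be used along the lines of `arping -c 10 172.31.5.233 | count_vip_macs`; with a healthy pair the result should be 1.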

My keepalived configuration is taken from a standard tutorial template and looks like this:

[root@dca-ngx01-al ~]# cat /etc/keepalived/keepalived.conf
global_defs {
 # Keepalived process identifier
 router_id nginx
}

# Script to check whether Nginx is running or not
vrrp_script check_nginx {
 script "/sbin/pidof nginx"
 interval 2
 weight 50
}

# Virtual interface - the priority determines which node takes over the virtual IP in a failover
vrrp_instance VI_01 {
 state MASTER
 interface ens192
 virtual_router_id 151
 priority 110

 # The virtual IP address shared between the two NGINX web servers, which floats between them
 virtual_ipaddress {
   172.31.5.233
 }
 track_script {
   check_nginx
 }
 authentication {
   auth_type AH
   auth_pass secret
 }
}

Both boxes have a simple single-zone firewall, to which I have added a rich rule to allow VRRP traffic between the two hosts:

[root@dca-ngx01-al ~]# firewall-cmd --list-all
public (active)
 target: default
 icmp-block-inversion: no
 interfaces: ens192
 sources:
 services: dhcpv6-client http https ssh
 ports: 10050/tcp
 protocols:
 forward: no
 masquerade: no
 forward-ports:
 source-ports:
 icmp-blocks:
 rich rules:
       rule protocol value="vrrp" accept

I have also set net.ipv4.ip_forward = 1 in /etc/sysctl.conf.

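When firewalld is stopped on both boxes, keepalived behaves correctly, but when it is enabled the two sides appear to lose contact with each other and just send gratuitous ARP packets repeatedly: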

● keepalived.service - LVS and VRRP High Availability Monitor
  Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
  Active: active (running) since Fri 2022-03-25 12:48:25 GMT; 2h 35min ago
 Process: 7140 ExecReload=/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
 Process: 12966 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 12967 (keepalived)
   Tasks: 2 (limit: 11406)
  Memory: 1.8M
  CGroup: /system.slice/keepalived.service
          ├─12967 /usr/sbin/keepalived -D
          └─12968 /usr/sbin/keepalived -D

Mar 25 15:08:15 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:15 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:15 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:15 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: (VI_01) Sending/queueing gratuitous ARPs on ens192 for 1>
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233

However, using tcpdump I can see that, while firewalld is active, the normal VRRP packets from the other host are at least reaching the network interface:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
15:25:21.532300 IP dca-ngx02-al.REDACTED.local > vrrp.mcast.net: AH(spi=0xac1f05e5,seq=0x3160): VRRPv2, Advertisement, vrid 151, prio 150, authtype ah, intvl 1s, length 20
15:25:22.532419 IP dca-ngx02-al.REDACTED.local > vrrp.mcast.net: AH(spi=0xac1f05e5,seq=0x3161): VRRPv2, Advertisement, vrid 151, prio 150, authtype ah, intvl 1s, length 20
15:25:23.532476 IP dca-ngx02-al.REDACTED.local > vrrp.mcast.net: AH(spi=0xac1f05e5,seq=0x3162): VRRPv2, Advertisement, vrid 151, prio 150, authtype ah, intvl 1s, length 20
15:25:24.532544 IP dca-ngx02-al.REDACTED.local > vrrp.mcast.net: AH(spi=0xac1f05e5,seq=0x3163): VRRPv2, Advertisement, vrid 151, prio 150, authtype ah, intvl 1s, length 20

Does anyone have any ideas on how I can troubleshoot this further?

Thanks in advance.

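I worked out the cause of the problem this morning, in case it helps someone in future. I set LogDenied=all in /etc/firewalld/firewalld.conf and was then able to use the --get-log-denied switch to identify which packets the firewall was still dropping: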

[root@dca-ngx02-al keepalived]# firewall-cmd --get-log-denied
Mar 28 08:40:04 dca-ngx01-al.REDACTED.local kernel: FINAL_REJECT: IN=ens192 OUT= MAC=01:00:5e:00:00:12:00:50:56:84:ac:d0:08:00 SRC=172.31.5.229 DST=224.0.0.18 LEN=64 TOS=0x00 PREC=0xC0 TTL=255 ID=79 PROTO=AH SPI=0xac1f05e5
Mar 28 08:40:05 dca-ngx01-al.REDACTED.local kernel: FINAL_REJECT: IN=ens192 OUT= MAC=01:00:5e:00:00:12:00:50:56:84:ac:d0:08:00 SRC=172.31.5.229 DST=224.0.0.18 LEN=64 TOS=0x00 PREC=0xC0 TTL=255 ID=80 PROTO=AH SPI=0xac1f05e5
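With LogDenied enabled, the rejected packets show up as kernel log lines like the ones above, and the `PROTO=` field is the part that matters. A small sketch for summarising them (the helper name `summarize_denied` is hypothetical, and it assumes the `SRC=`/`DST=`/`PROTO=` layout shown in these log lines):

```shell
#!/bin/sh
# Read kernel log lines on stdin and print "count proto src -> dst"
# for each distinct combination, most frequent first.
summarize_denied() {
    awk '
        /PROTO=/ {
            src = "?"; dst = "?"; proto = "?"
            # Each KEY=value pair is its own whitespace-separated field.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^SRC=/)   src   = substr($i, 5)
                if ($i ~ /^DST=/)   dst   = substr($i, 5)
                if ($i ~ /^PROTO=/) proto = substr($i, 7)
            }
            seen[proto " " src " -> " dst]++
        }
        END { for (k in seen) print seen[k], k }
    ' | sort -rn
}
```

Something like `journalctl -k | grep FINAL_REJECT | summarize_denied` would feed it, assuming the kernel messages end up in the journal on the host in question.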

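Because auth_type AH is set, the advertisements are wrapped in an AH header, so they arrive as IP protocol "ah" rather than "vrrp" and the existing rich rule does not match them. I resolved this by adding a further firewall rule for the AH multicast packets.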

firewall-cmd --add-rich-rule='rule protocol value="ah" accept' --permanent
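Note that `--permanent` only changes the saved configuration, so the running firewall does not pick the rule up until it is reloaded. A minimal sketch of the full sequence follows; the `run` and `allow_ah` helpers and the `DRY_RUN` variable are illustrative names, not part of firewalld:

```shell
#!/bin/sh
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# 1. Save a rich rule accepting AH (IP protocol 51), which carries the
#    VRRP advertisements when keepalived uses auth_type AH.
# 2. Reload so the saved rule is applied to the running firewall.
# 3. List the rich rules to confirm the new rule is present.
allow_ah() {
    run firewall-cmd --permanent --add-rich-rule='rule protocol value="ah" accept' &&
    run firewall-cmd --reload &&
    run firewall-cmd --list-rich-rules
}
```

Running `DRY_RUN=1; allow_ah` prints the three commands without touching the firewall; the same steps need to be applied on both hosts.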

Source: https://serverfault.com/questions/1097034