Linux

執行 keepalived 的兩台伺服器都成為主伺服器並具有相同的虛擬 IP

  • December 3, 2021

兩台伺服器都啟動了keepalived,BACKUP伺服器立即轉換為MASTER STATE。現在兩人都成為了大師。

兩個節點都在發送 VRRP 通告消息。

在主伺服器上:

[root@zhsq1 ~]# tcpdump -c 3 -i em1 host 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
11:01:35.526355 IP zhsq1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 153, authtype simple, intvl 1s, length 20
11:01:36.526497 IP zhsq1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 153, authtype simple, intvl 1s, length 20
11:01:37.527561 IP zhsq1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 153, authtype simple, intvl 1s, length 20

在備份伺服器上:

[root@zhsq2 ~]# tcpdump -c 3 -i em1 host 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
11:11:04.314996 IP zhsq2 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20
11:11:05.315111 IP zhsq2 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20
11:11:06.316175 IP zhsq2 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 102, authtype simple, intvl 1s, length 20

以下是主伺服器日誌:

May 31 11:00:22 zhsq1 Keepalived[31475]: Starting Keepalived v1.2.7 (05/20,2013)
May 31 11:00:22 zhsq1 Keepalived[31476]: Starting Healthcheck child process, pid=31477
May 31 11:00:22 zhsq1 Keepalived[31476]: Starting VRRP child process, pid=31478
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Interface queue is empty
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: No such interface, em2
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Netlink reflector reports IP 10.0.7.60 added
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Netlink reflector reports IP fe80::92b1:1cff:fe4c:bea8 added
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Registering Kernel netlink reflector
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Registering Kernel netlink command channel
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Interface queue is empty
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: No such interface, em2
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Netlink reflector reports IP 10.0.7.60 added
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Netlink reflector reports IP fe80::92b1:1cff:fe4c:bea8 added
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Registering Kernel netlink reflector
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Registering Kernel netlink command channel
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Registering gratuitous ARP shared channel
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Opening file '/etc/keepalived/keepalived.conf'.
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Configuration is using : 4661 Bytes
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Opening file '/etc/keepalived/keepalived.conf'.
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Configuration is using : 63856 Bytes
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: Using LinkWatch kernel netlink reflector...
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: VRRP sockpool: [ifindex(2), proto(112), fd(11,12)]
May 31 11:00:22 zhsq1 Keepalived_healthcheckers[31477]: Using LinkWatch kernel netlink reflector...
May 31 11:00:22 zhsq1 Keepalived_vrrp[31478]: VRRP_Script(chk_http_port) succeeded
May 31 11:00:23 zhsq1 Keepalived_vrrp[31478]: VRRP_Instance(VI_1) Transition to MASTER STATE
May 31 11:00:24 zhsq1 Keepalived_vrrp[31478]: VRRP_Instance(VI_1) Entering MASTER STATE
May 31 11:00:24 zhsq1 Keepalived_vrrp[31478]: VRRP_Instance(VI_1) setting protocol VIPs.
May 31 11:00:24 zhsq1 Keepalived_vrrp[31478]: VRRP_Instance(VI_1) Sending gratuitous ARPs on em1 for 10.0.7.65
May 31 11:00:24 zhsq1 Keepalived_healthcheckers[31477]: Netlink reflector reports IP 10.0.7.65 added
May 31 11:00:29 zhsq1 Keepalived_vrrp[31478]: VRRP_Instance(VI_1) Sending gratuitous ARPs on em1 for 10.0.7.65

以下是備份伺服器日誌:

May 31 11:01:50 zhsq2 Keepalived[31250]: Starting Keepalived v1.2.7 (05/20,2013)
May 31 11:01:50 zhsq2 Keepalived[31251]: Starting Healthcheck child process, pid=31252
May 31 11:01:50 zhsq2 Keepalived[31251]: Starting VRRP child process, pid=31253
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Interface queue is empty
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: No such interface, em2
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Netlink reflector reports IP 10.0.7.61 added
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Netlink reflector reports IP fe80::92b1:1cff:fe4c:b8b7 added
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Registering Kernel netlink reflector
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Registering Kernel netlink command channel
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Interface queue is empty
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: No such interface, em2
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Netlink reflector reports IP 10.0.7.61 added
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Netlink reflector reports IP fe80::92b1:1cff:fe4c:b8b7 added
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Registering Kernel netlink reflector
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Registering Kernel netlink command channel
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Registering gratuitous ARP shared channel
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Opening file '/etc/keepalived/keepalived.conf'.
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Configuration is using : 4661 Bytes
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Opening file '/etc/keepalived/keepalived.conf'.
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Configuration is using : 63856 Bytes
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: Using LinkWatch kernel netlink reflector...
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: VRRP_Instance(VI_1) Entering BACKUP STATE
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: VRRP sockpool: [ifindex(2), proto(112), fd(11,12)]
May 31 11:01:50 zhsq2 Keepalived_healthcheckers[31252]: Using LinkWatch kernel netlink reflector...
May 31 11:01:50 zhsq2 Keepalived_vrrp[31253]: VRRP_Script(chk_http_port) succeeded
May 31 11:01:54 zhsq2 Keepalived_vrrp[31253]: VRRP_Instance(VI_1) Transition to MASTER STATE
May 31 11:01:55 zhsq2 Keepalived_vrrp[31253]: VRRP_Instance(VI_1) Entering MASTER STATE
May 31 11:01:55 zhsq2 Keepalived_vrrp[31253]: VRRP_Instance(VI_1) setting protocol VIPs.
May 31 11:01:55 zhsq2 Keepalived_vrrp[31253]: VRRP_Instance(VI_1) Sending gratuitous ARPs on em1 for 10.0.7.65
May 31 11:01:55 zhsq2 Keepalived_healthcheckers[31252]: Netlink reflector reports IP 10.0.7.65 added
May 31 11:02:00 zhsq2 Keepalived_vrrp[31253]: VRRP_Instance(VI_1) Sending gratuitous ARPs on em1 for 10.0.7.65

主伺服器的keepalived conf如下:

vrrp_script chk_http_port {         
       script "/opt/nginx/nginx_pid.sh"
       interval 2 
       weight 2  
}

vrrp_instance VI_1 {  
       state MASTER
       #nopreempt 
       interface em1
       virtual_router_id 51 
       priority 151
       mcast_src_ip 10.0.7.60 
       track_interface {
               em1
       }
       authentication {  
               auth_type PASS 
               auth_pass 1111 
       }  
       track_script {  
               chk_http_port 
       }  
       virtual_ipaddress {                  
               10.0.7.65 dev em1
       }  
} 

備份伺服器的keepalived conf如下:

vrrp_script chk_http_port {
       script "/opt/nginx/nginx_pid.sh"
       interval 2
       weight 2
}

vrrp_instance VI_1 {
       state BACKUP
       interface em1
       virtual_router_id 51
       priority 100
       mcast_src_ip 10.0.7.61
       track_interface {
               em1
       }
       authentication {
               auth_type PASS
               auth_pass 1111
       }
       track_script {
               chk_http_port
       }
       virtual_ipaddress {
               10.0.7.65 dev em1
       }
}

chk_http_port 文件如下:

NGINX_PROCESS=`ps -C nginx --no-header | wc -l`

if [ $NGINX_PROCESS -eq 0 ]; then

       /usr/local/nginx/sbin/nginx

       sleep 3

       if [ `ps -C nginx --no-header | wc -l` -eq 0 ]; then

               killall keepalived 

       fi  

fi

請幫我。

非常感謝。

數據包不會在 em1 介面上的機器之間傳遞(導致如Mike 所說的腦裂場景)。

  • 檢查您的防火牆以確保沒有擷取數據包
  • 檢查您的網路以確保兩台機器上的 em1 是同一個網路

這是其中一個數據包的外觀範例:

Frame 2: 54 bytes on wire (432 bits), 54 bytes captured (432 bits)
   Arrival Time: Jun  1, 2013 03:39:50.709520000 UTC
   Epoch Time: 1370057990.709520000 seconds
   [Time delta from previous captured frame: 0.000970000 seconds]
   [Time delta from previous displayed frame: 0.000970000 seconds]
   [Time since reference or first frame: 0.000970000 seconds]
   Frame Number: 2
   Frame Length: 54 bytes (432 bits)
   Capture Length: 54 bytes (432 bits)
   [Frame is marked: False]
   [Frame is ignored: False]
   [Protocols in frame: eth:ip:vrrp]
Ethernet II, Src: 00:25:90:83:b0:07 (00:25:90:83:b0:07), Dst: 01:00:5e:00:00:12 (01:00:5e:00:00:12)
   Destination: 01:00:5e:00:00:12 (01:00:5e:00:00:12)
       Address: 01:00:5e:00:00:12 (01:00:5e:00:00:12)
       .... ...1 .... .... .... .... = IG bit: Group address (multicast/broadcast)
       .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
   Source: 00:25:90:83:b0:07 (00:25:90:83:b0:07)
       Address: 00:25:90:83:b0:07 (00:25:90:83:b0:07)
       .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
       .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
   Type: IP (0x0800)
Internet Protocol Version 4, Src: 10.0.10.11 (10.0.10.11), Dst: 224.0.0.18 (224.0.0.18)
   Version: 4
   Header length: 20 bytes
   Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
       0000 00.. = Differentiated Services Codepoint: Default (0x00)
       .... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
   Total Length: 40
   Identification: 0x8711 (34577)
   Flags: 0x00
       0... .... = Reserved bit: Not set
       .0.. .... = Don't fragment: Not set
       ..0. .... = More fragments: Not set
   Fragment offset: 0
   Time to live: 255
   Protocol: VRRP (112)
   Header checksum: 0x4037 [correct]
       [Good: True]
       [Bad: False]
   Source: 10.0.10.11 (10.0.10.11)
   Destination: 224.0.0.18 (224.0.0.18)
Virtual Router Redundancy Protocol
   Version 2, Packet type 1 (Advertisement)
       0010 .... = VRRP protocol version: 2
       .... 0001 = VRRP packet type: Advertisement (1)
   Virtual Rtr ID: 254
   Priority: 151 (Non-default backup priority)
   Addr Count: 1
   Auth Type: No Authentication (0)
   Adver Int: 1
   Checksum: 0x3c01 [correct]
   IP Address: 10.0.0.254 (10.0.0.254)

對於我的情況,我必須允許多播流量通過防火牆224.0.0.18,對於 ufw:

ufw allow from 224.0.0.18
ufw allow to 224.0.0.18

對我有幫助。

引用自:https://serverfault.com/questions/512153