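About a mail server and an IMAP HA active/active cluster
I have set up a mail server for testing. My goal is an HA mail server with IMAPS: clients connect to a virtual IP, which redirects them to one of two real servers, and if one real server crashes, the other real server "takes over" the connection. I have set up a cluster with two keepalived/haproxy load balancers (lb) and two real servers running Postfix and Dovecot. The two lb run Debian, the mail servers run Fedora 31. This is my configuration on both lb: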
Keepalived.conf
    global_defs {
    }

    vrrp_instance VI_1 {
        interface nm-team
        state MASTER
        virtual_router_id 51
        priority 101    # 101 on master, 100 on backup
        advert_int 1
        smtp_alert
        authentication {
            auth_type PASS
            auth_pass mypass
        }
    }

    virtual_ipaddress {
        10.2.0.4/24 brd 10.2.0.255 dev nm-team
    }

    virtual_server 10.2.0.4 25 {
        delay_loop 30
        lb_algo rr
        lb_kind DR
        protocol TCP
        persistence_timeout 360

        real_server 10.2.0.5 25 {
            weight 1
            TCP_CHECK {
                connect_timeout 10
                connect_port 25
                delay_before_retry 3
            }
        }
        real_server 10.2.0.6 25 {
            weight 1
            TCP_CHECK {
                connect_timeout 10
                connect_port 25
                delay_before_retry 3
            }
        }
    }

    virtual_server 10.2.0.4 993 {
        delay_loop 30
        lb_algo rr
        lb_kind DR
        protocol TCP
        persistence_timeout 360

        real_server 10.2.0.5 993 {
            weight 1
            TCP_CHECK {
                connect_timeout 10
                connect_port 993
                nb_get_retry 3
                delay_before_retry 3
            }
        }
        real_server 10.2.0.6 993 {
            weight 1
            TCP_CHECK {
                connect_timeout 10
                connect_port 993
                nb_get_retry 3
                delay_before_retry 3
            }
        }
    }
haproxy.cfg
    global
        log /dev/log local0
        log /dev/log local1 notice
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 30s
        user haproxy
        group haproxy
        daemon

        # Default SSL material locations
        ca-base /etc/ssl/certs
        crt-base /etc/ssl/private

        # Default ciphers to use on SSL-enabled listening sockets.
        # For more information, see ciphers(1SSL). This list is from:
        #  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
        # An alternative list with additional directives can be obtained from
        #  https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
        ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
        ssl-default-bind-options no-sslv3

    defaults
        log global
        mode tcp

    #postfix
    listen smtp
        bind mail.mydomain.priv:25
        balance roundrobin
        timeout client 30s
        timeout connect 10s
        timeout server 1m
        no option http-server-close
        mode tcp
        option smtpchk
        option tcplog
        server mail1 mail1.mydomain.priv:25 send-proxy
        server mail2 mail2.mydomain.priv:25 send-proxy

    #dovecot
    listen imap
        bind mail.mydomain.priv:993
        timeout client 30s
        timeout connect 10s
        timeout server 1m
        no option http-server-close
        balance leastconn
        stick store-request src
        stick-table type ip size 200k expire 30m
        mode tcp
        option tcplog
        server mail1 mail1.mydomain.priv:993 send-proxy
        server mail2 mail2.mydomain.priv:993 send-proxy

As you can see, mail.mydomain.priv is the "virtual" server bound to the virtual IP 10.2.0.4 (created by keepalived), and the real servers are 10.2.0.5 and 10.2.0.6. The virtual IP 10.2.0.4 is an alias on the lo interface; I created it on the lb with this line:
ip addr add 10.2.0.4/32 dev lo label lo:0
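On the real servers:

    echo 1 >/proc/sys/net/ipv4/conf/all/arp_ignore
    echo 2 >/proc/sys/net/ipv4/conf/all/arp_announce
    ip addr add 10.2.0.4/32 dev lo label lo:0

I am not posting the dovecot/postfix configuration because it is too long, but I have tested it and it works fine, both as a single server and through the 10.2.0.4 virtual IP. The real servers share /var/vmail/mydomain using glusterfs (I know it is slow, but this is only for testing). With a client connected, I can receive email through Dovecot over IMAPS and send email through Postfix over SMTP with STARTTLS, without any problem.

So, what is the problem? I tested the cluster by shutting down one real server while a client (Thunderbird) was open. The client "froze" as if the cluster did not exist, and could no longer read email. If I kill or restart the client, it reconnects to the 10.2.0.4 virtual IP (mail.mydomain.priv) without any problem. What is wrong? Is it possible to build an active/active HA cluster with keepalived and haproxy?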
Thanks to help from the unix forum, I found the solution: I removed the virtual IP from lo:0 and created an nm-team:0 alias on the haproxy/keepalived servers only.
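The change described above might look roughly like the commands below. This is a minimal sketch, not a record of the exact commands used: the /24 prefix is an assumption borrowed from the first keepalived.conf, and `run`, `remove_lo_alias`, `add_team_alias` and the `DRY_RUN` switch are hypothetical helpers that print the commands for review instead of executing them. As far as I understand, keepalived also adds the address listed under virtual_ipaddress on whichever node is currently master, so the manual alias may be redundant there.

```shell
# Sketch: move the virtual IP from the loopback alias to the team interface.
# VIP and IFACE come from the post; PREFIX=24 is an assumption.
VIP=10.2.0.4
PREFIX=24
IFACE=nm-team

# run: print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Wherever the lo:0 alias was created: remove it again.
remove_lo_alias() {
    run ip addr del "$VIP/32" dev lo
}

# On the haproxy/keepalived servers only: add the alias on the team interface.
add_team_alias() {
    run ip addr add "$VIP/$PREFIX" dev "$IFACE" label "$IFACE:0"
}
```

Calling `remove_lo_alias` or `add_team_alias` as-is only prints the command; running them with `DRY_RUN=0` as root applies it.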
Then I edited haproxy.cfg:

    global
        log /dev/log local0
        log /dev/log local1 notice
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 30s
        user haproxy
        group haproxy
        daemon

        # Default SSL material locations
        ca-base /etc/ssl/certs
        crt-base /etc/ssl/private

        # Default ciphers to use on SSL-enabled listening sockets.
        # For more information, see ciphers(1SSL). This list is from:
        #  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
        # An alternative list with additional directives can be obtained from
        #  https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
        ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
        ssl-default-bind-options no-sslv3

    defaults
        log global
        mode tcp
        option dontlognull
        option redispatch
        retries 3
        timeout http-request 10s
        timeout queue 1m
        timeout connect 10s
        timeout client 1m
        timeout server 1m
        timeout http-keep-alive 10s
        timeout check 10s
        maxconn 3000

    frontend mail-in
        bind mail.mydomain.priv:25
        mode tcp
        option tcplog
        default_backend mail-in-back

    backend mail-in-back
        balance roundrobin
        server mail1.mydomain.priv mail1.mydomain.priv:25 check
        server mail2.mydomain.priv mail2.mydomain.priv:25 check

    frontend imaps-in
        bind mail.mydomain.priv:993
        mode tcp
        option tcplog
        default_backend imaps-in-back

    backend imaps-in-back
        balance roundrobin
        server mail1.mydomain.priv mail1.mydomain.priv:993 check
        server mail2.mydomain.priv mail2.mydomain.priv:993 check

Then I edited keepalived.conf:
    vrrp_script chk_haproxy {
        script "killall -0 haproxy"   # check the haproxy process
        interval 2                    # every 2 seconds
        weight 2                      # add 2 points if OK
    }

    vrrp_instance VI_1 {
        interface nm-team             # interface to monitor
        state MASTER                  # MASTER on haproxy1, BACKUP on haproxy2
        virtual_router_id 51
        priority 100                  # 100 on haproxy1, 101 on haproxy2
        advert_int 1
        smtp_alert
        authentication {
            auth_type PASS
            auth_pass yourpass
        }
        virtual_ipaddress {
            10.2.0.4                  # virtual ip address
        }
        track_script {
            chk_haproxy
        }
    }
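To see how the `weight 2` on chk_haproxy interacts with the two priorities, the arithmetic can be sketched as below. This is a simplified model under my understanding of keepalived: a positive track_script weight is added to the base priority while the script succeeds, and the node with the higher effective priority holds the virtual IP. `effective_priority` and `elect` are hypothetical helper names, not keepalived commands, and the model ignores preemption options and timing.

```shell
# effective_priority BASE WEIGHT SCRIPT_OK
# Prints BASE+WEIGHT when the tracked script succeeds (SCRIPT_OK=1), else BASE.
# Models only a positive track_script weight, as in the config above.
effective_priority() {
    base=$1
    weight=$2
    ok=$3
    if [ "$ok" = "1" ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

# elect P1 P2: print which node wins on effective priority alone.
# P1 belongs to haproxy1, P2 to haproxy2.
elect() {
    if [ "$1" -gt "$2" ]; then
        echo haproxy1
    elif [ "$2" -gt "$1" ]; then
        echo haproxy2
    else
        echo tie
    fi
}
```

With both haproxy processes running, the effective priorities are 102 (haproxy1) and 103 (haproxy2), so `elect 102 103` prints haproxy2. If haproxy stops on haproxy2, its priority falls back to 101 and `elect 102 101` prints haproxy1, which is the failover the track_script is meant to trigger.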
Then I copied keepalived.conf to haproxy2 and adjusted a few entries (MASTER becomes BACKUP, priority 100 becomes 101). On the haproxy servers I keep this sysctl configuration:
    net.ipv4.tcp_syncookies=1
    net.ipv4.ip_forward=1
    net.ipv4.conf.all.send_redirects = 0
    net.ipv4.conf.default.send_redirects = 0
    net.ipv4.conf.team0.send_redirects = 0
    net.ipv4.conf.nm-team.send_redirects = 0
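One way to confirm that these settings are actually in effect is to compare the file with the running kernel values. The sketch below assumes the lines are saved in a file such as /etc/sysctl.d/99-haproxy.conf (a hypothetical path) and loaded with `sysctl --system`; `normalize_sysctl` and `check_sysctl` are hypothetical helper names.

```shell
# normalize_sysctl: read sysctl.conf-style lines on stdin, drop comments and
# blank lines, and print each setting as key=value without extra spaces.
normalize_sysctl() {
    sed -e 's/#.*//' \
        -e 's/[[:space:]]*=[[:space:]]*/=/' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d'
}

# check_sysctl FILE: compare each wanted value with the running kernel value.
check_sysctl() {
    normalize_sysctl < "$1" | while IFS='=' read -r key want; do
        have=$(sysctl -n "$key" 2>/dev/null) || have="(missing)"
        if [ "$have" = "$want" ]; then status=ok; else status=DIFF; fi
        echo "$status $key want=$want have=$have"
    done
}
```

Running `check_sysctl /etc/sysctl.d/99-haproxy.conf` prints one line per key; a "(missing)" value usually means the interface named in the key does not exist on that host.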
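After restarting keepalived and haproxy, everything works. I tested with a client connected and shut down one mail server: after 5 to 10 seconds of inactivity, the connection became active again without restarting the MUA.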
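To watch what haproxy thinks of the two mail servers during a test like this, the stats socket defined in haproxy.cfg can be queried. The sketch below is not part of the original setup: it assumes socat is installed on the load balancer, and `server_status` is a hypothetical helper that reads the CSV printed by haproxy's `show stat` command, where the status is the 18th field.

```shell
# server_status: read "show stat" CSV on stdin and print "proxy/server status"
# for each backend server line, skipping the header and the FRONTEND/BACKEND
# summary rows.
server_status() {
    awk -F, '!/^#/ && NF > 17 && $2 != "FRONTEND" && $2 != "BACKEND" {
        print $1 "/" $2, $18
    }'
}

# Usage on a load balancer (needs socat and read access to the socket):
#   echo "show stat" | socat stdio /run/haproxy/admin.sock | server_status
```

Repeating the command while one mail server is shut down should show when its `check` fails and haproxy marks it as down.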