Failover Pacemaker cluster with two network interfaces?
So, I have two test servers in one VLAN:
srv1: eth1 10.10.10.11, eth2 10.20.10.11
srv2: eth1 10.10.10.12, eth2 10.20.10.12
Cluster VIP - 10.10.10.100
Corosync configuration with two interfaces:
rrp_mode: passive

interface {
    ringnumber: 0
    bindnetaddr: 10.10.10.0
    mcastaddr: 226.94.1.1
    mcastport: 5405
}
interface {
    ringnumber: 1
    bindnetaddr: 10.20.10.0
    mcastaddr: 226.94.1.1
    mcastport: 5407
}
Pacemaker configuration:
# crm configure show
node srv1
node srv2
primitive cluster-ip ocf:heartbeat:IPaddr2 \
    params ip="10.10.10.100" cidr_netmask="24" \
    op monitor interval="5s"
primitive ha-nginx lsb:nginx \
    op monitor interval="5s"
location prefer-srv-2 ha-nginx 50: srv2
colocation nginx-and-cluster-ip +inf: ha-nginx cluster-ip
property $id="cib-bootstrap-options" \
    dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    no-quorum-policy="ignore" \
    stonith-enabled="false"
Status:
# crm status
============
Last updated: Thu Jan 29 13:40:16 2015
Last change: Thu Jan 29 12:47:25 2015 via crmd on srv1
Stack: openais
Current DC: srv2 - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ srv1 srv2 ]

 cluster-ip    (ocf::heartbeat:IPaddr2):    Started srv2
 ha-nginx      (lsb:nginx):                 Started srv2
Rings:
# corosync-cfgtool -s
Printing ring status.
Local node ID 185207306
RING ID 0
        id      = 10.10.10.11
        status  = ring 0 active with no faults
RING ID 1
        id      = 10.20.10.11
        status  = ring 1 active with no faults
And if I do
srv2# ifconfig eth1 down
then Pacemaker keeps working over eth2, which is fine. But nginx becomes unreachable on 10.10.10.100 (because eth1 is down, obviously), and Pacemaker says everything is OK. What I want instead is for nginx to move to srv1 once eth1 dies on srv2.
So, what can I do?
So, thanks to @Dok, I solved my problem with ocf:pacemaker:ping. The final configuration:
# crm configure show
node srv1
node srv2
primitive P_INTRANET ocf:pacemaker:ping \
    params host_list="10.10.10.11 10.10.10.12" multiplier="100" name="ping_intranet" \
    op monitor interval="5s" timeout="5s"
primitive cluster-ip ocf:heartbeat:IPaddr2 \
    params ip="10.10.10.100" cidr_netmask="24" \
    op monitor interval="5s"
primitive ha-nginx lsb:nginx \
    op monitor interval="5s"
clone CL_INTRANET P_INTRANET \
    meta globally-unique="false"
location L_CLUSTER_IP_PING_INTRANET cluster-ip \
    rule $id="L_CLUSTER_IP_PING_INTRANET-rule" ping_intranet: defined ping_intranet
location L_HA_NGINX_PING_INTRANET ha-nginx \
    rule $id="L_HA_NGINX_PING_INTRANET-rule" ping_intranet: defined ping_intranet
location L_INTRANET_01 CL_INTRANET 100: srv1
location L_INTRANET_02 CL_INTRANET 100: srv2
colocation nginx-and-cluster-ip 1000: ha-nginx cluster-ip
property $id="cib-bootstrap-options" \
    dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    no-quorum-policy="ignore" \
    stonith-enabled="false"
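To confirm that the ping attribute is actually being set on each node, a one-shot crm_mon with node attributes enabled should show the ping_intranet score per node (a quick sanity check; I'm assuming the -A flag is available in this Pacemaker version):

# Show cluster status once, including node attributes such as ping_intranet:
crm_mon -1 -A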
The ocf:pacemaker:pingd resource is designed specifically for failing over nodes when they lose connectivity. You can find a very brief example on the Cluster Labs wiki here: http://clusterlabs.org/wiki/Example_configurations#Set_up_pingd
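For reference, a minimal sketch of that pattern in crm shell syntax (the resource names and the ping target 10.10.10.1 are illustrative assumptions, not taken from the wiki page):

primitive p_ping ocf:pacemaker:ping \
    params host_list="10.10.10.1" multiplier="100" name="pingd" \
    op monitor interval="15s"
clone cl_ping p_ping
# Keep ha-nginx away from any node that cannot reach the ping target:
location l_nginx_connected ha-nginx \
    rule -inf: not_defined pingd or pingd lte 0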
Somewhat unrelated, but I have seen problems in the past with using ifconfig down to test loss of connectivity. I strongly recommend that you use iptables to drop traffic instead when testing connectivity loss.
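For example, something along these lines simulates the outage without detaching the interface (a sketch; adjust the interface and rules to your setup):

# Drop all traffic in and out of eth1 on srv2:
srv2# iptables -A INPUT -i eth1 -j DROP
srv2# iptables -A OUTPUT -o eth1 -j DROP

# Restore connectivity afterwards by removing the same rules:
srv2# iptables -D INPUT -i eth1 -j DROP
srv2# iptables -D OUTPUT -o eth1 -j DROP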