Cluster
pcs status error: httpd_monitor_5000 on server 'not running' (7)
Error message
Failed actions: httpd_monitor_5000 on abc-zabserver-b 'not running' (7): call=65, status=complete, last-rc-change='Wed Jul 15 21:44:43 2015', queued=0ms, exec=8ms
pcs status
[root@abc-zabserver-b ~]# pcs status
Cluster name: abc-zabvip
Last updated: Wed Jul 15 21:50:52 2015
Last change: Wed Jul 15 20:38:07 2015
Stack: cman
Current DC: abc-zabserver-b - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
3 Resources configured
Online: [ abc-zabserver-a abc-zabserver-b ]
Full list of resources:
 Resource Group: zabbix-cluster
     ClusterIP (ocf::heartbeat:IPaddr2): Started abc-zabserver-b
     zabbix-server (lsb:zabbix-server): Started abc-zabserver-b
     httpd (lsb:httpd): Started abc-zabserver-b
Failed actions:
    httpd_monitor_5000 on abc-zabserver-b 'not running' (7): call=65, status=complete, last-rc-change='Wed Jul 15 21:44:43 2015', queued=0ms, exec=8ms
Resource configuration
pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=10.99.122.69 cidr_netmask=24 op monitor interval=5s
pcs property set stonith-enabled=false
pcs resource create zabbix-server lsb:zabbix-server op monitor interval=5s
pcs resource create httpd lsb:httpd op monitor interval=5s
pcs resource group add zabbix-cluster ClusterIP zabbix-server httpd
pcs property set no-quorum-policy=ignore
pcs property set default-resource-stickiness="100"
pcs config show
[root@abc-zabserver-b ~]# pcs config show
Cluster Name: abc-zabvip
Corosync Nodes:
 abc-zabserver-a abc-zabserver-b
Pacemaker Nodes:
 abc-zabserver-a abc-zabserver-b
Resources:
 Group: zabbix-cluster
  Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: ip=10.99.122.69 cidr_netmask=24
   Operations: start interval=0s timeout=20s (ClusterIP-start-timeout-20s)
               stop interval=0s timeout=20s (ClusterIP-stop-timeout-20s)
               monitor interval=5s (ClusterIP-monitor-interval-5s)
  Resource: zabbix-server (class=lsb type=zabbix-server)
   Operations: monitor interval=5s (zabbix-server-monitor-interval-5s)
  Resource: httpd (class=lsb type=httpd)
   Operations: monitor interval=5s (httpd-monitor-interval-5s)
Stonith Devices:
Fencing Levels:
Location Constraints:
Ordering Constraints:
Colocation Constraints:
Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.11-97629de
 default-resource-stickiness: 100
 no-quorum-policy: ignore
 stonith-enabled: false
Cluster configuration file
[root@abc-zabserver-b ~]# cat /etc/cluster/cluster.conf
<cluster config_version="9" name="abc-zabvip">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="abc-zabserver-a" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="abc-zabserver-a"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="abc-zabserver-b" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="abc-zabserver-b"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" port="5405" transport="udpu" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
I fixed it by uncommenting the status URL in httpd.conf and creating the resource as shown here. Before adding the resource, make sure http://localhost/server-status is reachable.
pcs resource create httpd apache configfile="/etc/httpd/conf/httpd.conf" statusurl="http://localhost/server-status" op monitor interval=5s --group zabbix-cluster
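One way to confirm that the status URL answers before creating the resource is a small curl check. This is only a sketch: the function name check_status_url is made up for illustration, and it assumes curl is installed on the node.

```shell
# Hypothetical helper (not part of pcs or Pacemaker): succeeds only if the
# status URL can be fetched without an HTTP error.
check_status_url() {
    # Default matches the statusurl used for the apache resource above.
    url=${1:-http://localhost/server-status}
    # --fail makes curl exit non-zero on HTTP errors such as 403 or 404;
    # --max-time keeps the check from hanging if httpd does not answer.
    if curl --fail --silent --show-error --output /dev/null --max-time 5 "$url"; then
        echo "OK: $url is reachable"
    else
        echo "FAIL: $url is not reachable" >&2
        return 1
    fi
}
```

Run `check_status_url` on each node that may host the group. If it fails, the monitor of the apache resource will most likely fail in the same way.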
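The httpd resource appears to be running (based on the pcs status output you showed). Perhaps something stopped the service while Pacemaker was monitoring it; that would produce the error you see above and trigger a recovery.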
If you grep the logs on the DC ("Current DC: abc-zabserver-b - partition with quorum") for "LogActions", you should see every Start/Stop/Recover/Restart/Leave action that Pacemaker performed on the resources.
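If that is the case, you need to make sure that nothing other than Pacemaker is managing these cluster services; Pacemaker expects to be the only thing that starts and stops them.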
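The LogActions search mentioned above could be wrapped like this. It is a rough sketch: show_log_actions is a made-up name, and the default log path /var/log/messages is an assumption (depending on the logging setup, Pacemaker may write elsewhere, for example under /var/log/cluster/).

```shell
# Hypothetical helper: print the LogActions lines that mention one resource.
# $1 = resource name, $2 = log file (assumed default; adjust to your setup)
show_log_actions() {
    resource=$1
    logfile=${2:-/var/log/messages}
    # First keep only the policy engine's LogActions lines, then only those
    # that mention the resource as a whole word.
    grep 'LogActions' "$logfile" | grep -w -- "$resource"
}
```

For example, `show_log_actions httpd` run on the DC should list what Pacemaker decided to do with httpd around the time of the failure. If httpd was stopped at a time when no such entry exists, something outside the cluster probably did it.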
You can clear the error by running:
# pcs resource cleanup httpd
Return code 7 usually means that the service was not running when Pacemaker checked its status.
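Since httpd was defined as an lsb: resource, it can help to see what the init script itself reports. The sketch below runs a script's status action and labels the exit code using the LSB status table from the second link below; lsb_status is a made-up name. As far as I know, Pacemaker reports an LSB status of 3 as its own 'not running' (7), but treat that mapping as an assumption and check it against your version.

```shell
# Hypothetical helper: run "<script> status" and describe the exit code
# according to the LSB init script status codes.
lsb_status() {
    script=$1
    "$script" status >/dev/null 2>&1
    rc=$?
    case $rc in
        0) desc="running" ;;
        1) desc="dead, but pid file exists" ;;
        2) desc="dead, but lock file exists" ;;
        3) desc="not running" ;;
        4) desc="status unknown" ;;
        *) desc="reserved or non-LSB code" ;;
    esac
    echo "$script status -> $rc ($desc)"
    # Pass the script's own exit code back to the caller.
    return "$rc"
}
```

For example, `lsb_status /etc/init.d/httpd` right after a failed monitor shows whether the script really considers the service stopped.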
http://clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ap-lsb.html
http://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html