Kolla OpenStack deploy fails with "haproxy : Waiting for virtual IP to appear"
I am trying to deploy OpenStack Queens with kolla-ansible (7.0.0) on Ubuntu hosts, following the official guide.
The bootstrap-servers and prechecks commands succeed, but deploy fails:

    RUNNING HANDLER [haproxy : Waiting for virtual IP to appear] **********************************************************
    fatal: [testcloudcontrol01]: FAILED! => {"changed": false, "elapsed": 300, "msg": "Timeout when waiting for 10.52.41.98:3306"}
    fatal: [testcloudcontrol02]: FAILED! => {"changed": false, "elapsed": 300, "msg": "Timeout when waiting for 10.52.41.98:3306"}
The check fails because the kolla_internal_vip_address never comes up. My globals.yml:

    config_strategy: "COPY_ALWAYS"
    kolla_base_distro: "ubuntu"
    kolla_install_type: "binary"
    openstack_release: "queens"
    kolla_internal_vip_address: "10.52.41.98"
    kolla_internal_fqdn: "testcloudapi.example.com"
    kolla_external_vip_address: "{{ kolla_internal_vip_address }}"
    kolla_external_fqdn: "{{ kolla_internal_fqdn }}"
    network_interface: "ens160"
    api_interface: "ens160"
    storage_interface: "ens161"
    keepalived_virtual_router_id: "148"
I am currently pinned to Queens because I want to replicate our production environment for testing.
Output of ip addr on one of the nodes where haproxy should be deployed:

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 00:50:56:a1:6a:2c brd ff:ff:ff:ff:ff:ff
        inet 10.52.41.100/24 brd 10.52.41.255 scope global ens160
           valid_lft forever preferred_lft forever
        inet6 fe80::250:56ff:fea1:6a2c/64 scope link
           valid_lft forever preferred_lft forever
    3: ens161: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 00:50:56:a1:7d:07 brd ff:ff:ff:ff:ff:ff
        inet 10.52.42.100/24 brd 10.52.42.255 scope global ens161
           valid_lft forever preferred_lft forever
        inet6 fe80::250:56ff:fea1:7d07/64 scope link
           valid_lft forever preferred_lft forever
    4: ens224: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 00:50:56:a1:23:6e brd ff:ff:ff:ff:ff:ff
        inet 10.52.40.100/24 brd 10.52.40.255 scope global ens224
           valid_lft forever preferred_lft forever
        inet6 fe80::250:56ff:fea1:236e/64 scope link
           valid_lft forever preferred_lft forever
    5: ens256: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 00:50:56:a1:20:12 brd ff:ff:ff:ff:ff:ff
        inet 10.52.44.100/24 brd 10.52.44.255 scope global ens256
           valid_lft forever preferred_lft forever
        inet6 fe80::250:56ff:fea1:2012/64 scope link
           valid_lft forever preferred_lft forever
    6: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
        link/ether 02:42:b0:8a:93:e7 brd ff:ff:ff:ff:ff:ff
        inet 172.17.0.1/16 scope global docker0
           valid_lft forever preferred_lft forever
The nodes are VMware virtual machines with VMXNet3 NICs.
Output of docker logs keepalived:

    + sudo -E kolla_set_configs
    INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
    INFO:__main__:Validating config file
    INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
    INFO:__main__:Copying service configuration files
    INFO:__main__:Deleting /etc/keepalived/keepalived.conf
    INFO:__main__:Copying /var/lib/kolla/config_files/keepalived.conf to /etc/keepalived/keepalived.conf
    INFO:__main__:Setting permission for /etc/keepalived/keepalived.conf
    INFO:__main__:Writing out command to execute
    ++ cat /run_command
    + CMD='/usr/sbin/keepalived -nld -p /run/keepalived.pid'
    + ARGS=
    + [[ ! -n '' ]]
    + . kolla_extend_start
    ++ modprobe ip_vs
    ++ '[' -f /run/keepalived.pid ']'
    + echo 'Running command: '\''/usr/sbin/keepalived -nld -p /run/keepalived.pid'\'''
    Running command: '/usr/sbin/keepalived -nld -p /run/keepalived.pid'
    + exec /usr/sbin/keepalived -nld -p /run/keepalived.pid
    Thu Dec 13 12:10:26 2018: Starting Keepalived v1.3.9 (10/21,2017)
    Thu Dec 13 12:10:26 2018: Opening file '/etc/keepalived/keepalived.conf'.
    Thu Dec 13 12:10:26 2018: Starting Healthcheck child process, pid=11
    Thu Dec 13 12:10:26 2018: Opening file '/etc/keepalived/keepalived.conf'.
    Thu Dec 13 12:10:26 2018: Starting VRRP child process, pid=12
    Thu Dec 13 12:10:26 2018: ------< Global definitions >------
    Thu Dec 13 12:10:26 2018: Router ID = testcloudcontrol01.example.com
    Thu Dec 13 12:10:26 2018: Default interface = eth0
    Thu Dec 13 12:10:26 2018: LVS flush = false
    Thu Dec 13 12:10:26 2018: VRRP IPv4 mcast group = 224.0.0.18
    Thu Dec 13 12:10:26 2018: VRRP IPv6 mcast group = ff02::12
    Thu Dec 13 12:10:26 2018: Gratuitous ARP delay = 5
    Thu Dec 13 12:10:26 2018: Gratuitous ARP repeat = 5
    Thu Dec 13 12:10:26 2018: Gratuitous ARP refresh timer = 0
    Thu Dec 13 12:10:26 2018: Gratuitous ARP refresh repeat = 1
    Thu Dec 13 12:10:26 2018: Gratuitous ARP lower priority delay = 4294
    Thu Dec 13 12:10:26 2018: Gratuitous ARP lower priority repeat = -1
    Thu Dec 13 12:10:26 2018: Send advert after receive lower priority advert = true
    Thu Dec 13 12:10:26 2018: Send advert after receive higher priority advert = false
    Thu Dec 13 12:10:26 2018: Gratuitous ARP interval = 0
    Thu Dec 13 12:10:26 2018: Gratuitous NA interval = 0
    Thu Dec 13 12:10:26 2018: VRRP default protocol version = 2
    Thu Dec 13 12:10:26 2018: Iptables input chain = INPUT
    Thu Dec 13 12:10:26 2018: Using ipsets = true
    Thu Dec 13 12:10:26 2018: ipset IPv4 address set = keepalived
    Thu Dec 13 12:10:26 2018: ipset IPv6 address set = keepalived6
    Thu Dec 13 12:10:26 2018: ipset IPv6 address,iface set = keepalived_if6
    Thu Dec 13 12:10:26 2018: VRRP check unicast_src = false
    Thu Dec 13 12:10:26 2018: VRRP skip check advert addresses = false
    Thu Dec 13 12:10:26 2018: VRRP strict mode = false
    Thu Dec 13 12:10:26 2018: VRRP process priority = 0
    Thu Dec 13 12:10:26 2018: VRRP don't swap = false
    Thu Dec 13 12:10:26 2018: Checker process priority = 0
    Thu Dec 13 12:10:26 2018: Checker don't swap = false
    Thu Dec 13 12:10:26 2018: SNMP keepalived disabled
    Thu Dec 13 12:10:26 2018: SNMP checker disabled
    Thu Dec 13 12:10:26 2018: SNMP RFCv2 disabled
    Thu Dec 13 12:10:26 2018: SNMP RFCv3 disabled
    Thu Dec 13 12:10:26 2018: SNMP traps disabled
    Thu Dec 13 12:10:26 2018: SNMP socket = default (unix:/var/agentx/master)
    Thu Dec 13 12:10:26 2018: Network namespace = (default)
    Thu Dec 13 12:10:26 2018: DBus disabled
    Thu Dec 13 12:10:26 2018: DBus service name = (null)
    Thu Dec 13 12:10:26 2018: Script security disabled
    Thu Dec 13 12:10:26 2018: Default script uid:gid 0:0
    Thu Dec 13 12:10:26 2018: Registering Kernel netlink reflector
    Thu Dec 13 12:10:26 2018: Registering Kernel netlink command channel
    Thu Dec 13 12:10:26 2018: Registering gratuitous ARP shared channel
    Thu Dec 13 12:10:26 2018: Opening file '/etc/keepalived/keepalived.conf'.
    Thu Dec 13 12:10:26 2018: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
    Thu Dec 13 12:10:26 2018: Truncating auth_pass to 8 characters
    Thu Dec 13 12:10:26 2018: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
    Thu Dec 13 12:10:26 2018: ------< Global definitions >------
    Thu Dec 13 12:10:26 2018: Router ID = testcloudcontrol01.example.com
    Thu Dec 13 12:10:26 2018: Default interface = eth0
    Thu Dec 13 12:10:26 2018: LVS flush = false
    Thu Dec 13 12:10:26 2018: VRRP IPv4 mcast group = 224.0.0.18
    Thu Dec 13 12:10:26 2018: VRRP IPv6 mcast group = ff02::12
    Thu Dec 13 12:10:26 2018: Gratuitous ARP delay = 5
    Thu Dec 13 12:10:26 2018: Gratuitous ARP repeat = 5
    Thu Dec 13 12:10:26 2018: Gratuitous ARP refresh timer = 0
    Thu Dec 13 12:10:26 2018: Gratuitous ARP refresh repeat = 1
    Thu Dec 13 12:10:26 2018: Gratuitous ARP lower priority delay = 5
    Thu Dec 13 12:10:26 2018: Gratuitous ARP lower priority repeat = 5
    Thu Dec 13 12:10:26 2018: Send advert after receive lower priority advert = true
    Thu Dec 13 12:10:26 2018: Send advert after receive higher priority advert = false
    Thu Dec 13 12:10:26 2018: Gratuitous ARP interval = 0
    Thu Dec 13 12:10:26 2018: Gratuitous NA interval = 0
    Thu Dec 13 12:10:26 2018: VRRP default protocol version = 2
    Thu Dec 13 12:10:26 2018: Iptables input chain = INPUT
    Thu Dec 13 12:10:26 2018: Using ipsets = false
    Thu Dec 13 12:10:26 2018: ipset IPv4 address set = keepalived
    Thu Dec 13 12:10:26 2018: ipset IPv6 address set = keepalived6
    Thu Dec 13 12:10:26 2018: ipset IPv6 address,iface set = keepalived_if6
    Thu Dec 13 12:10:26 2018: VRRP check unicast_src = false
    Thu Dec 13 12:10:26 2018: VRRP skip check advert addresses = false
    Thu Dec 13 12:10:26 2018: VRRP strict mode = false
    Thu Dec 13 12:10:26 2018: VRRP process priority = 0
    Thu Dec 13 12:10:26 2018: VRRP don't swap = false
    Thu Dec 13 12:10:26 2018: Checker process priority = 0
    Thu Dec 13 12:10:26 2018: Checker don't swap = false
    Thu Dec 13 12:10:26 2018: SNMP keepalived disabled
    Thu Dec 13 12:10:26 2018: SNMP checker disabled
    Thu Dec 13 12:10:26 2018: SNMP RFCv2 disabled
    Thu Dec 13 12:10:26 2018: SNMP RFCv3 disabled
    Thu Dec 13 12:10:26 2018: SNMP traps disabled
    Thu Dec 13 12:10:26 2018: SNMP socket = default (unix:/var/agentx/master)
    Thu Dec 13 12:10:26 2018: Network namespace = (default)
    Thu Dec 13 12:10:26 2018: DBus disabled
    Thu Dec 13 12:10:26 2018: DBus service name = (null)
    Thu Dec 13 12:10:26 2018: Script security disabled
    Thu Dec 13 12:10:26 2018: Default script uid:gid 0:0
    Thu Dec 13 12:10:26 2018: ------< VRRP Topology >------
    Thu Dec 13 12:10:26 2018: VRRP Instance = kolla_internal_vip_148
    Thu Dec 13 12:10:26 2018: Using VRRPv2
    Thu Dec 13 12:10:26 2018: Want State = BACKUP
    Thu Dec 13 12:10:26 2018: Running on device = ens160
    Thu Dec 13 12:10:26 2018: Skip checking advert IP addresses = no
    Thu Dec 13 12:10:26 2018: Enforcing strict VRRP compliance = no
    Thu Dec 13 12:10:26 2018: Using src_ip = 10.52.41.100
    Thu Dec 13 12:10:26 2018: Gratuitous ARP delay = 5
    Thu Dec 13 12:10:26 2018: Gratuitous ARP repeat = 5
    Thu Dec 13 12:10:26 2018: Gratuitous ARP refresh timer = 0
    Thu Dec 13 12:10:26 2018: Gratuitous ARP refresh repeat = 1
    Thu Dec 13 12:10:26 2018: Gratuitous ARP lower priority delay = 5
    Thu Dec 13 12:10:26 2018: Gratuitous ARP lower priority repeat = 5
    Thu Dec 13 12:10:26 2018: Send advert after receive lower priority advert = true
    Thu Dec 13 12:10:26 2018: Send advert after receive higher priority advert = false
    Thu Dec 13 12:10:26 2018: Virtual Router ID = 148
    Thu Dec 13 12:10:26 2018: Priority = 1
    Thu Dec 13 12:10:26 2018: Advert interval = 1 sec
    Thu Dec 13 12:10:26 2018: Accept enabled
    Thu Dec 13 12:10:26 2018: Preempt disabled
    Thu Dec 13 12:10:26 2018: Promote_secondaries disabled
    Thu Dec 13 12:10:26 2018: Authentication type = SIMPLE_PASSWORD
    Thu Dec 13 12:10:26 2018: Password = 0RXbQYFF
    Thu Dec 13 12:10:26 2018: Tracked scripts = 1
    Thu Dec 13 12:10:26 2018: check_alive weight 0
    Thu Dec 13 12:10:26 2018: Virtual IP = 1
    Thu Dec 13 12:10:26 2018: 10.52.41.98/32 dev ens160 scope global
    Thu Dec 13 12:10:26 2018: ------< VRRP Scripts >------
    Thu Dec 13 12:10:26 2018: VRRP Script = check_alive
    Thu Dec 13 12:10:26 2018: Command = /check_alive.sh
    Thu Dec 13 12:10:26 2018: Interval = 2 sec
    Thu Dec 13 12:10:26 2018: Timeout = 0 sec
    Thu Dec 13 12:10:26 2018: Weight = 0
    Thu Dec 13 12:10:26 2018: Rise = 10
    Thu Dec 13 12:10:26 2018: Fall = 2
    Thu Dec 13 12:10:26 2018: Insecure = no
    Thu Dec 13 12:10:26 2018: Status = INIT
    Thu Dec 13 12:10:26 2018: Script uid:gid = 0:0
    Thu Dec 13 12:10:26 2018: ------< NIC >------
    Thu Dec 13 12:10:26 2018: Name = lo
    Thu Dec 13 12:10:26 2018: index = 1
    Thu Dec 13 12:10:26 2018: IPv4 address = 127.0.0.1
    Thu Dec 13 12:10:26 2018: IPv6 address = ::
    Thu Dec 13 12:10:26 2018: is UP
    Thu Dec 13 12:10:26 2018: is RUNNING
    Thu Dec 13 12:10:26 2018: MTU = 65536
    Thu Dec 13 12:10:26 2018: HW Type = LOOPBACK
    Thu Dec 13 12:10:26 2018: ------< NIC >------
    Thu Dec 13 12:10:26 2018: Name = ens160
    Thu Dec 13 12:10:26 2018: index = 2
    Thu Dec 13 12:10:26 2018: IPv4 address = 10.52.41.100
    Thu Dec 13 12:10:26 2018: IPv6 address = fe80::250:56ff:fea1:6a2c
    Thu Dec 13 12:10:26 2018: MAC = 00:50:56:a1:6a:2c
    Thu Dec 13 12:10:26 2018: is UP
    Thu Dec 13 12:10:26 2018: is RUNNING
    Thu Dec 13 12:10:26 2018: MTU = 1500
    Thu Dec 13 12:10:26 2018: HW Type = ETHERNET
    Thu Dec 13 12:10:26 2018: ------< NIC >------
    Thu Dec 13 12:10:26 2018: Name = ens161
    Thu Dec 13 12:10:26 2018: index = 3
    Thu Dec 13 12:10:26 2018: IPv4 address = 10.52.42.100
    Thu Dec 13 12:10:26 2018: IPv6 address = fe80::250:56ff:fea1:7d07
    Thu Dec 13 12:10:26 2018: MAC = 00:50:56:a1:7d:07
    Thu Dec 13 12:10:26 2018: is UP
    Thu Dec 13 12:10:26 2018: is RUNNING
    Thu Dec 13 12:10:26 2018: MTU = 1500
    Thu Dec 13 12:10:26 2018: HW Type = ETHERNET
    Thu Dec 13 12:10:26 2018: ------< NIC >------
    Thu Dec 13 12:10:26 2018: Name = ens224
    Thu Dec 13 12:10:26 2018: index = 4
    Thu Dec 13 12:10:26 2018: IPv4 address = 10.52.40.100
    Thu Dec 13 12:10:26 2018: IPv6 address = fe80::250:56ff:fea1:236e
    Thu Dec 13 12:10:26 2018: MAC = 00:50:56:a1:23:6e
    Thu Dec 13 12:10:26 2018: is UP
    Thu Dec 13 12:10:26 2018: is RUNNING
    Thu Dec 13 12:10:26 2018: MTU = 1500
    Thu Dec 13 12:10:26 2018: HW Type = ETHERNET
    Thu Dec 13 12:10:26 2018: ------< NIC >------
    Thu Dec 13 12:10:26 2018: Name = ens256
    Thu Dec 13 12:10:26 2018: index = 5
    Thu Dec 13 12:10:26 2018: IPv4 address = 10.52.44.100
    Thu Dec 13 12:10:26 2018: IPv6 address = fe80::250:56ff:fea1:2012
    Thu Dec 13 12:10:26 2018: MAC = 00:50:56:a1:20:12
    Thu Dec 13 12:10:26 2018: is UP
    Thu Dec 13 12:10:26 2018: is RUNNING
    Thu Dec 13 12:10:26 2018: MTU = 1500
    Thu Dec 13 12:10:26 2018: HW Type = ETHERNET
    Thu Dec 13 12:10:26 2018: ------< NIC >------
    Thu Dec 13 12:10:26 2018: Name = docker0
    Thu Dec 13 12:10:26 2018: index = 6
    Thu Dec 13 12:10:26 2018: IPv4 address = 172.17.0.1
    Thu Dec 13 12:10:26 2018: IPv6 address = ::
    Thu Dec 13 12:10:26 2018: MAC = 02:42:b0:8a:93:e7
    Thu Dec 13 12:10:26 2018: is UP
    Thu Dec 13 12:10:26 2018: MTU = 1500
    Thu Dec 13 12:10:26 2018: HW Type = ETHERNET
    Thu Dec 13 12:10:26 2018: Using LinkWatch kernel netlink reflector...
    Thu Dec 13 12:10:26 2018: VRRP_Instance(kolla_internal_vip_148) Entering BACKUP STATE
    Thu Dec 13 12:10:26 2018: /check_alive.sh exited with status 1
    Thu Dec 13 12:10:28 2018: /check_alive.sh exited with status 1
    Thu Dec 13 12:10:30 2018: VRRP_Instance(kolla_internal_vip_148) Now in FAULT state
    Thu Dec 13 12:10:30 2018: /check_alive.sh exited with status 1
    Thu Dec 13 12:10:32 2018: /check_alive.sh exited with status 1
    [message repeats until I stop the container]
So that is it: both keepalived instances are in FAULT state, and the IP address does not come up on either VM.
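With a dump this long, the few lines that matter are easy to miss. A minimal Python sketch for pulling the VRRP state changes and the failed check runs out of such a log might look like this. The function name and both regular expressions are my own and cover only the message formats seen in the log above; other keepalived versions may word these lines differently.

```python
import re

# Matches the two transition messages seen in the log above:
#   VRRP_Instance(name) Entering BACKUP STATE
#   VRRP_Instance(name) Now in FAULT state
STATE_RE = re.compile(
    r"VRRP_Instance\(([^)]+)\) (?:Entering|Now in) (\w+) state",
    re.IGNORECASE,
)
# Matches "/check_alive.sh exited with status N" and captures N.
CHECK_RE = re.compile(r"/check_alive\.sh exited with status (\d+)")


def summarize_keepalived_log(log_text):
    """Return (list of VRRP states in order, number of failed check runs)."""
    states = [m.group(2).upper() for m in STATE_RE.finditer(log_text)]
    failures = sum(1 for m in CHECK_RE.finditer(log_text) if m.group(1) != "0")
    return states, failures
```

You could feed it the text saved from docker logs keepalived. A state list ending in FAULT next to a growing failure count points at the tracked script rather than at VRRP itself.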
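I went through this question and its answers, even though I do not have the error message in my log files:
- keepalived_virtual_router_id has been changed and is unique
- I ran kolla-genpwd again and confirmed that keepalived_password is set in /etc/kolla/passwords.yml
- kolla_internal_vip_address is reachable from network_interface. The primary IP on that interface is in the same network. I can manually set an additional IP address and it works.
- kolla-ansible prechecks passes
- SELinux is not active on Ubuntu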
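The "same network" item in the list above can be double-checked without touching the hosts. This is a small sketch using Python's standard ipaddress module; the function name is hypothetical, and the addresses in the usage note are the ones from globals.yml and the ip addr output.

```python
import ipaddress


def vip_in_interface_network(vip, interface_cidr):
    """True if the VIP falls inside the network of the interface address.

    interface_cidr is an address with prefix length as printed by
    `ip addr`, for example "10.52.41.100/24".
    """
    iface = ipaddress.ip_interface(interface_cidr)
    return ipaddress.ip_address(vip) in iface.network
```

For this setup, vip_in_interface_network("10.52.41.98", "10.52.41.100/24") is true, so the VIP does belong to the ens160 network. It only confirms the addressing, not that the address is free or routable.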
On the hypervisor side, I tried enabling Promiscuous mode on the port group of that interface. That did not make a difference.
So, after hitting the same problem on bare metal, I dug deeper into this issue. It turns out the problem is not keepalived but the haproxy container.
The haproxy container keeps restarting because haproxy is started with the command-line parameter -W, which does not exist in the haproxy version shipped in the container:

    Running command: '/usr/sbin/haproxy -W -db -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid'
    + exec /usr/sbin/haproxy -W -db -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
    HA-Proxy version 1.6.3 2015/12/25
    Copyright 2000-2015 Willy Tarreau <willy@haproxy.org>

    Usage : haproxy [-f <cfgfile>]* [ -vdVD ] [ -n <maxconn> ] [ -N <maxpconn> ]
            [ -p <pidfile> ] [ -m <max megs> ] [ -C <dir> ] [-- <cfgfile>*]
            -v displays version ; -vv shows known build options.
            -d enters debug mode ; -db only disables background mode.
            -dM[<byte>] poisons memory with <byte> (defaults to 0x50)
            -V enters verbose mode (disables quiet mode)
            -D goes daemon ; -C changes to <dir> before loading files.
            -q quiet mode : don't display messages
            -c check mode : only check config files and exit
            -n sets the maximum total # of connections (2000)
            -m limits the usable amount of memory (in MB)
            -N sets the default, per-proxy maximum # of connections (2000)
            -L set local peer name (default to hostname)
            -p writes pids of all children to this file
            -de disables epoll() usage even when available
            -dp disables poll() usage even when available
            -dS disables splice usage (broken on old kernels)
            -dV disables SSL verify on servers side
            -sf/-st [pid ]* finishes/terminates old pids.
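As far as I know, -W (master-worker mode) only arrived with HAProxy 1.8, which would explain why the 1.6.3 binary answers with its usage text. Rather than relying on version numbers, you can search the usage text the binary prints. The following Python sketch does that with a hypothetical helper; it is a rough heuristic that only finds flags listed on their own, not ones bundled as in "[ -vdVD ]".

```python
import re


def usage_mentions_flag(usage_text, flag):
    """Heuristically check whether a usage text lists `flag` as its own token.

    The lookarounds keep "-d" from matching inside "-db" and "-v" from
    matching inside "-vv".
    """
    pattern = r"(?<![\w-])" + re.escape(flag) + r"(?![\w-])"
    return re.search(pattern, usage_text) is not None
```

Run against the usage output above, it finds -db and -p but not -W.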
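As a result, the haproxy container restarts continuously. The keepalived container, on the other hand, is configured with a check script for keepalived, and that script keeps exiting with an error:

    Fri Feb 15 08:17:14 2019: /check_alive.sh exited with status 1
    Keepalived_vrrp[12]: /check_alive.sh exited with status 1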
This check script is very simple. It checks the status of haproxy through a socket file:

    #!/bin/bash

    # This will return 0 when it successfully talks to the haproxy daemon via the socket
    # Failures return 1

    echo "show info" | socat unix-connect:/var/lib/kolla/haproxy/haproxy.sock stdio > /dev/null
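To run the same probe by hand from a host where socat is not installed, a rough Python equivalent could look like the sketch below. It is not what Kolla ships. The function name and the timeout are my own choices, and unlike the socat one-liner it also requires a non-empty reply before reporting success.

```python
import socket


def haproxy_socket_alive(sock_path="/var/lib/kolla/haproxy/haproxy.sock",
                         timeout=2.0):
    """Send "show info" to the haproxy stats socket and report success.

    Returns False if the socket file is missing, nothing is listening,
    the peer does not answer within `timeout` seconds, or the reply is
    empty.
    """
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            s.connect(sock_path)
            s.sendall(b"show info\n")
            return len(s.recv(4096)) > 0
    except OSError:
        # Covers connection refused, missing file, and timeouts.
        return False
```

While the haproxy container is crash-looping, nothing listens on that socket, so this returns False in the same situations where check_alive.sh exits with status 1.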
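So... as long as haproxy is called with an invalid parameter and does not start, keepalived stays in FAULT state and there is no floating IP. Using grep -R "haproxy -W" * I found that the haproxy command line is defined in the file /usr/local/share/kolla-ansible/ansible/roles/haproxy/templates/haproxy.json.j2. I removed the -W parameter from the command line, after which haproxy started properly and keepalived changed to MASTER state, which configures the floating IP. There is already a bug report for this issue on Launchpad. A comment there also describes a slightly different solution (changing the same file).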
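If you need to reapply the edit on several deployment hosts, the removal can be scripted. Below is a minimal Python sketch that works on the template text only. The function name is hypothetical, and I am assuming the template contains the same command string that shows up in the "Running command" log line; check the actual file before relying on it.

```python
import re


def strip_master_worker_flag(template_text):
    """Remove a standalone -W argument that directly follows the haproxy binary.

    Other arguments, such as the "-f /etc/haproxy/haproxy.cfg" pair,
    are left untouched.
    """
    return re.sub(r"(haproxy)\s+-W(?=\s)", r"\1", template_text)
```

Applied to a line such as "command": "/usr/sbin/haproxy -W -db -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid", it yields the same line without -W. Writing the result back to haproxy.json.j2 and taking a backup first is left to the caller.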
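This modification will of course be reverted when the file is updated. If you have the same problem, please log in to Launchpad and mark the bug (reported on 2018-06-08) as affecting you, so that it gets prioritized and fixed.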