Kubernetes Calico networking: calicoctl reports "reset by peer" and "bird: BGP: Unexpected connect from unknown address"
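This is a new cluster built with Kubespray on bare metal. calicoctl reports peers that are not in the Established state, StatefulSet members cannot communicate with each other, and most Ingress requests take about 10 seconds to open the sample Nginx page. All other components, such as etcd and the pods, look fine in the output of sudo kubectl get cs and sudo kubectl cluster-info dump.

The calico-node pods on master-1 (192.168.250.111) and node-1 (192.168.250.112) report no errors in their logs.

The calico-node pods on master-2 (192.168.240.111) and node-2 (192.168.240.112) report this error in their logs: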
bird: BGP: Unexpected connect from unknown address 192.168.240.240 (port 36597)
- This IP is the IP of the VPN router (the gateway for these servers)

The calico-node pods on master-3 (192.168.230.111) and node-3 (192.168.230.112) report this error in their logs:
bird: BGP: Unexpected connect from unknown address 192.168.230.230 (port 35029)
- This IP is the IP of the VPN router (the gateway for these servers)

192.168.250.112 (node-1):
    era@server-node-1:~$ sudo calicoctl node status
    Calico process is running.

    IPv4 BGP status
    +-----------------+-------------------+-------+----------+--------------------------------+
    |  PEER ADDRESS   |     PEER TYPE     | STATE |  SINCE   |              INFO              |
    +-----------------+-------------------+-------+----------+--------------------------------+
    | 192.168.250.111 | node-to-node mesh | up    | 19:54:47 | Established                    |
    | 192.168.240.111 | node-to-node mesh | start | 19:54:35 | Active Socket: Connection      |
    |                 |                   |       |          | reset by peer                  |
    | 192.168.230.111 | node-to-node mesh | up    | 20:42:31 | Established                    |
    | 192.168.240.112 | node-to-node mesh | start | 19:54:35 | Active Socket: Connection      |
    |                 |                   |       |          | reset by peer                  |
    | 192.168.230.112 | node-to-node mesh | up    | 20:42:30 | Established                    |
    +-----------------+-------------------+-------+----------+--------------------------------+

    IPv6 BGP status
    No IPv6 peers found.

    era@server-node-1:~$
192.168.240.112 (node-2):
    era@server-node-2:~$ sudo calicoctl node status
    Calico process is running.

    IPv4 BGP status
    +-----------------+-------------------+-------+----------+--------------------------------+
    |  PEER ADDRESS   |     PEER TYPE     | STATE |  SINCE   |              INFO              |
    +-----------------+-------------------+-------+----------+--------------------------------+
    | 192.168.250.111 | node-to-node mesh | start | 19:52:09 | Passive                        |
    | 192.168.240.111 | node-to-node mesh | up    | 19:54:37 | Established                    |
    | 192.168.230.111 | node-to-node mesh | start | 19:52:09 | Active Socket: Connection      |
    |                 |                   |       |          | reset by peer                  |
    | 192.168.250.112 | node-to-node mesh | start | 19:52:09 | Passive                        |
    | 192.168.230.112 | node-to-node mesh | start | 19:52:09 | Active Socket: Connection      |
    |                 |                   |       |          | reset by peer                  |
    +-----------------+-------------------+-------+----------+--------------------------------+

    IPv6 BGP status
    No IPv6 peers found.

    era@server-node-2:~$
192.168.230.112 (node-3):
    era@server-node-3:~$ sudo calicoctl node status
    Calico process is running.

    IPv4 BGP status
    +-----------------+-------------------+-------+----------+-------------+
    |  PEER ADDRESS   |     PEER TYPE     | STATE |  SINCE   |    INFO     |
    +-----------------+-------------------+-------+----------+-------------+
    | 192.168.250.111 | node-to-node mesh | up    | 20:42:31 | Established |
    | 192.168.240.111 | node-to-node mesh | start | 19:51:59 | Passive     |
    | 192.168.230.111 | node-to-node mesh | up    | 19:54:25 | Established |
    | 192.168.250.112 | node-to-node mesh | up    | 20:42:30 | Established |
    | 192.168.240.112 | node-to-node mesh | start | 19:51:59 | Passive     |
    +-----------------+-------------------+-------+----------+-------------+

    IPv6 BGP status
    No IPv6 peers found.

    era@server-node-3:~$
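To compare the three outputs above quickly, each table can be reduced to the peers that are not Established. The Python sketch below is only an illustration: parse_node_status and not_established are hypothetical helper names (they are not part of calicoctl), and the parser assumes the ASCII table layout shown in this question, where a long INFO cell wraps onto a continuation row with an empty PEER ADDRESS. The sample text is the node-1 output from above.

```python
# Sketch: summarize `calicoctl node status` text. Helper names are hypothetical.
SAMPLE = """\
Calico process is running.

IPv4 BGP status
+-----------------+-------------------+-------+----------+--------------------------------+
|  PEER ADDRESS   |     PEER TYPE     | STATE |  SINCE   |              INFO              |
+-----------------+-------------------+-------+----------+--------------------------------+
| 192.168.250.111 | node-to-node mesh | up    | 19:54:47 | Established                    |
| 192.168.240.111 | node-to-node mesh | start | 19:54:35 | Active Socket: Connection      |
|                 |                   |       |          | reset by peer                  |
| 192.168.230.111 | node-to-node mesh | up    | 20:42:31 | Established                    |
| 192.168.240.112 | node-to-node mesh | start | 19:54:35 | Active Socket: Connection      |
|                 |                   |       |          | reset by peer                  |
| 192.168.230.112 | node-to-node mesh | up    | 20:42:30 | Established                    |
+-----------------+-------------------+-------+----------+--------------------------------+

IPv6 BGP status
No IPv6 peers found.
"""


def parse_node_status(text):
    """Return a list of (peer_address, state, info) tuples from the table rows."""
    peers = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, headings, and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 5 or cells[0] == "PEER ADDRESS":
            continue  # skip the header row and anything unexpected
        address, _peer_type, state, _since, info = cells
        if address:
            peers.append([address, state, info])
        elif peers:
            # Continuation row: the INFO text wrapped onto a second line.
            peers[-1][2] = (peers[-1][2] + " " + info).strip()
    return [tuple(peer) for peer in peers]


def not_established(text):
    """Return (peer_address, info) for every peer whose INFO is not Established."""
    return [(addr, info) for addr, _state, info in parse_node_status(text)
            if info != "Established"]


if __name__ == "__main__":
    for addr, info in not_established(SAMPLE):
        print(addr, "->", info)
```

Run against each node's output, this makes the pattern easier to see: every failing session involves a peer in a different subnet reached through the VPN.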
I tried setting the exact network interface to see whether that would help, but it did not:
    era@server-master-1:~$ kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=interface=ens3
    daemonset.apps/calico-node env updated
I tested port 179 with nc from every node and master to every other node and master, and all connections succeeded. The operating system is Ubuntu 18.04.
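The nc check on port 179 can be scripted so that it is easy to repeat from every host. The sketch below is a minimal Python equivalent, assuming the peer addresses listed in this question; tcp_port_open is a hypothetical helper name. Keep in mind that a successful TCP connect only shows that the port is reachable. As this question shows, the BGP session can still fail afterwards.

```python
import socket

# Node and master addresses taken from the question.
PEERS = [
    "192.168.250.111", "192.168.250.112",
    "192.168.240.111", "192.168.240.112",
    "192.168.230.111", "192.168.230.112",
]


def tcp_port_open(host, port=179, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for peer in PEERS:
        status = "open" if tcp_port_open(peer) else "unreachable"
        print(f"{peer}:179 {status}")
```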
Any suggestions on what to debug in Calico to resolve this? Any hint that gets me closer to a solution would be useful.
Update
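I found that the problem is related to missing routes.

Below is the output from 192.168.250.112. It cannot reach the nodes and masters in 192.168.240.x because there are no routes to them: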
    era@server-node-1:~$ ip route | grep tun
    10.233.76.0/24 via 192.168.230.112 dev tunl0 proto bird onlink
    10.233.77.0/24 via 192.168.230.111 dev tunl0 proto bird onlink
    10.233.79.0/24 via 192.168.250.111 dev tunl0 proto bird onlink
    era@server-node-1:~$ sudo calicoctl node status
    Calico process is running.

    IPv4 BGP status
    +-----------------+-------------------+-------+----------+--------------------------------+
    |  PEER ADDRESS   |     PEER TYPE     | STATE |  SINCE   |              INFO              |
    +-----------------+-------------------+-------+----------+--------------------------------+
    | 192.168.250.111 | node-to-node mesh | up    | 21:39:05 | Established                    |
    | 192.168.240.111 | node-to-node mesh | start | 19:54:35 | Connect Socket: Connection     |
    |                 |                   |       |          | reset by peer                  |
    | 192.168.230.111 | node-to-node mesh | up    | 20:42:31 | Established                    |
    | 192.168.240.112 | node-to-node mesh | start | 19:54:35 | Connect Socket: Connection     |
    |                 |                   |       |          | reset by peer                  |
    | 192.168.230.112 | node-to-node mesh | up    | 20:42:30 | Established                    |
    +-----------------+-------------------+-------+----------+--------------------------------+

    IPv6 BGP status
    No IPv6 peers found.

    era@server-node-1:~$
Below is the output from 192.168.240.112. It cannot reach the nodes and masters in 192.168.250.x and 192.168.230.x because there are no routes to them:
    era@server-node-2:~$ ip r | grep tunl
    10.233.66.0/24 via 192.168.240.111 dev tunl0 proto bird onlink
    era@server-node-2:~$ sudo calicoctl node status
    Calico process is running.

    IPv4 BGP status
    +-----------------+-------------------+-------+----------+--------------------------------+
    |  PEER ADDRESS   |     PEER TYPE     | STATE |  SINCE   |              INFO              |
    +-----------------+-------------------+-------+----------+--------------------------------+
    | 192.168.250.111 | node-to-node mesh | start | 19:52:10 | Passive                        |
    | 192.168.240.111 | node-to-node mesh | up    | 19:54:38 | Established                    |
    | 192.168.230.111 | node-to-node mesh | start | 22:05:18 | Active Socket: Connection      |
    |                 |                   |       |          | reset by peer                  |
    | 192.168.250.112 | node-to-node mesh | start | 19:52:10 | Passive                        |
    | 192.168.230.112 | node-to-node mesh | start | 22:05:22 | Active Socket: Connection      |
    |                 |                   |       |          | reset by peer                  |
    +-----------------+-------------------+-------+----------+--------------------------------+

    IPv6 BGP status
    No IPv6 peers found.

    era@server-node-2:~$
Below is the output from 192.168.230.112. It cannot reach the nodes and masters in 192.168.240.x because there are no routes to them:
    era@server-node-3:~$ ip r | grep tunl
    10.233.77.0/24 via 192.168.230.111 dev tunl0 proto bird onlink
    10.233.79.0/24 via 192.168.250.111 dev tunl0 proto bird onlink
    10.233.100.0/24 via 192.168.250.112 dev tunl0 proto bird onlink
    era@server-node-3:~$ sudo calicoctl node status
    Calico process is running.

    IPv4 BGP status
    +-----------------+-------------------+-------+----------+-------------+
    |  PEER ADDRESS   |     PEER TYPE     | STATE |  SINCE   |    INFO     |
    +-----------------+-------------------+-------+----------+-------------+
    | 192.168.250.111 | node-to-node mesh | up    | 21:36:50 | Established |
    | 192.168.240.111 | node-to-node mesh | start | 19:51:59 | Passive     |
    | 192.168.230.111 | node-to-node mesh | up    | 19:54:25 | Established |
    | 192.168.250.112 | node-to-node mesh | up    | 20:42:30 | Established |
    | 192.168.240.112 | node-to-node mesh | start | 19:51:59 | Passive     |
    +-----------------+-------------------+-------+----------+-------------+

    IPv6 BGP status
    No IPv6 peers found.

    era@server-node-3:~$
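So why are these routes missing, and how can I change this behavior so that they get added? If I add them manually, the routes are removed automatically.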
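The problem was that NAT was applied on the VPN TUN (layer 3) interface. Calico does not support that (or I am not aware of a solution that works through NAT). This matches the bird log messages, where connections appear to come from the VPN router's address instead of the peer's own address.

Solution: use routing instead of NAT.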
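The effect of NAT on the mesh can be illustrated with a toy model. This is a sketch under stated assumptions, not BIRD's actual code: it assumes that a BGP speaker only accepts connections whose source address is one of its configured peers, which is what the "Unexpected connect from unknown address" log line suggests. The function names are hypothetical, and the addresses are the ones from the question.

```python
import ipaddress

# Mesh peers as listed in the question (masters and nodes).
MESH_PEERS = {
    ipaddress.ip_address(addr)
    for addr in (
        "192.168.250.111", "192.168.250.112",
        "192.168.240.111", "192.168.240.112",
        "192.168.230.111", "192.168.230.112",
    )
}


def source_seen_by_peer(real_source, nat_gateway=None):
    """Return the source address the remote peer sees.

    With source NAT on the VPN link, the peer sees the gateway's address;
    with plain routing, it sees the sender's real address.
    """
    return nat_gateway if nat_gateway else real_source


def accept_bgp_connection(source_ip, known_peers=MESH_PEERS):
    """Toy check (assumption): accept only connections from known peers."""
    return ipaddress.ip_address(source_ip) in known_peers


if __name__ == "__main__":
    # node-1 connecting to master-2 through the VPN router 192.168.240.240.
    natted = source_seen_by_peer("192.168.250.112", nat_gateway="192.168.240.240")
    routed = source_seen_by_peer("192.168.250.112")
    print("NAT:    ", natted, "accepted" if accept_bgp_connection(natted) else "rejected")
    print("routing:", routed, "accepted" if accept_bgp_connection(routed) else "rejected")
```

Under this model, the NATed connection is rejected because 192.168.240.240 is not a mesh peer, while the routed connection keeps its real source address and is accepted.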