Iptables

IP packets stuck at the routing decision

  • July 5, 2019

First, here is what my infrastructure looks like and how it works:

[Image: infrastructure diagram]

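Controller1/2 and Compute1/2 all run VMs and are linked to each other through a VPN. On each server, the ext interface (the VPN interface) is plugged into the br-ext interface. All servers can communicate with one another, and so can the VMs on their private interfaces.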

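I have two Ubuntu 16.04 routers (the 2 boxes with ETH3 and BR-ext). Only one is active at a time (the second is there for failover using keepalived), and the active one holds both the public subnet (51.38.X.Y/27) and the IP 10.38.166.190 (which acts as the gateway for all VMs).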

I use iptables and iproute2 to allow traffic for, let's say, 51.38.X.YYA to reach 10.38.X.YYA, and traffic from 10.38.X.YYA to go out as 51.38.X.YYA.

From one of the VMs I can reach the outside without any problem, and if I run curl ifconfig.co I get the public IP back, which is the behaviour I want.

My problem:

If I try to reach VM1 from VM2 using VM1's public IP, it does not work at all.

I will use two VMs to illustrate my problem and provide all the related configuration:

VM1: 10.38.166.167 / 51.38.166.167
VM2: 10.38.166.166 / 51.38.166.166

What I have done so far:

On router 1:

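ETH1 = main interface (management)
ETH3 = interface holding all the public IPs and doing NAT to the VMs
br-ext = bridge containing the VPN interface
ext = VPN interface (plugged into the bridge br-ext)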

[root@network3] ~# ip a l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
   link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
   inet 127.0.0.1/8 scope host lo
      valid_lft forever preferred_lft forever
   inet6 ::1/128 scope host
      valid_lft forever preferred_lft forever

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
   link/ether fa:16:3e:19:3e:41 brd ff:ff:ff:ff:ff:ff
   inet 51.38.166.162/32 brd 51.38.x.162 scope global eth1
      valid_lft forever preferred_lft forever
   inet6 fe80::f816:3eff:fe19:3e41/64 scope link
      valid_lft forever preferred_lft forever

5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
   link/ether fa:16:3e:72:94:cb brd ff:ff:ff:ff:ff:ff
   inet 51.38.166.163/32 brd 51.38.x.163 scope global eth3
      valid_lft forever preferred_lft forever
   inet 51.38.166.166/32 scope global eth3
      valid_lft forever preferred_lft forever
   inet 51.38.166.167/32 scope global eth3
      valid_lft forever preferred_lft forever


7: br-ext: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
   link/ether d2:f8:64:36:64:f2 brd ff:ff:ff:ff:ff:ff
   inet 10.0.0.103/9 brd 10.127.255.255 scope global br-ext
      valid_lft forever preferred_lft forever
   inet 10.0.0.120/32 scope global br-ext
      valid_lft forever preferred_lft forever
   inet 10.38.166.190/32 scope global br-ext
      valid_lft forever preferred_lft forever

10: ext: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-ext state UNKNOWN group default qlen 1000
   link/ether d2:f8:64:36:64:f2 brd ff:ff:ff:ff:ff:ff

I set up a bunch of routes so that packets coming from outside for 51.38.x.160/27 are routed to 10.38.x.y/27:

[root@network3] ~# ip ru l | grep "lookup 103"
9997:   from 10.38.x.167 lookup 103
9998:   from 10.38.x.166 lookup 103

# rules telling each IP of the /27 to use table 103
10301:  from 51.38.166.163 lookup 103
10302:  from all to 51.38.166.163 lookup 103
10307:  from 51.38.166.166 lookup 103
10308:  from all to 51.38.166.166 lookup 103
10309:  from 51.38.166.167 lookup 103
10310:  from all to 51.38.166.167 lookup 103

[root@network3] ~# ip r s table 103
default via 51.38.166.190 dev eth3
51.38.166.160/27 dev eth3  scope link
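
For readers who want to reproduce this policy routing, here is a minimal sketch of iproute2 commands that should produce rules and a table like the ones listed above. The helper names print_policy_rules and print_table_103 are made up for illustration, the priorities are simply copied from the listing, and the commands are only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the two per-IP rules that send
# traffic from and to one public address through table 103.
# $1 = public IP, $2 = priority of the "from" rule (the "to" rule gets $2 + 1).
print_policy_rules() {
    pub=$1
    prio=$2
    echo "ip rule add from $pub lookup 103 priority $prio"
    echo "ip rule add to $pub lookup 103 priority $((prio + 1))"
}

# Prints the two routes that table 103 holds in the listing above.
print_table_103() {
    echo "ip route add default via 51.38.166.190 dev eth3 table 103"
    echo "ip route add 51.38.166.160/27 dev eth3 scope link table 103"
}

print_policy_rules 51.38.166.166 10307
print_policy_rules 51.38.166.167 10309
print_table_103
```

Printing instead of executing makes it possible to review the commands first and pipe them to sh as root once they look right.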

[root@network3] ~# ip r s
default via 51.38.166.190 dev eth1 onlink
10.0.0.0/9 dev br-ext  proto kernel  scope link  src 10.0.0.103
172.16.0.0/16 dev br-manag  proto kernel  scope link  src 172.16.0.103

My iptables rules look like this:

[root@network3] ~# iptables -nvL
Chain INPUT (policy ACCEPT 21334 packets, 1015K bytes)
pkts bytes target     prot opt in     out     source               destination
91877 4376K ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0            /* 000 accept all icmp */
  18  1564 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0            /* 001 accept all to lo interface */
   0     0 REJECT     all  --  !lo    *       0.0.0.0/0            127.0.0.0/8          /* 002 reject local traffic not on loopback interface */ reject-with icmp-port-unreachable
343K  123M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            state ESTABLISHED /* 003 accept related established rules */
 243 14472 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            multiport dports 1022 /* 030 allow SSH */
481M   42G ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            multiport dports 3210:3213 /* 031 allow VPNtunnel */
4155  241K DROP       all  --  eth0   *       0.0.0.0/0            0.0.0.0/0            /* 999 drop all */

Chain FORWARD (policy ACCEPT 98325 packets, 8874K bytes)
pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 964M packets, 93G bytes)
pkts bytes target     prot opt in     out     source               destination

iptables NAT rules:

[root@network3] ~# iptables -t nat -nvL --line
Chain PREROUTING (policy ACCEPT 156K packets, 6455K bytes)
num   pkts bytes target     prot opt in     out     source               destination
31   11228  771K DNAT       all  --  *      *       0.0.0.0/0            51.38.166.166        /* 112 NAT for 10.38.166.166 */ to:10.38.166.166
32   11624  809K DNAT       all  --  *      *       0.0.0.0/0            51.38.166.167        /* 112 NAT for 10.38.166.167 */ to:10.38.166.167

Chain INPUT (policy ACCEPT 85077 packets, 3527K bytes)
num   pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 16505 packets, 1294K bytes)
num   pkts bytes target     prot opt in     out     source               destination

Chain POSTROUTING (policy ACCEPT 105K packets, 4357K bytes)
num   pkts bytes target     prot opt in     out     source               destination
31      17  1196 SNAT       all  --  *      *       10.38.166.166        0.0.0.0/0             to:51.38.166.166
32       8   549 SNAT       all  --  *      *       10.38.166.167        0.0.0.0/0             to:51.38.166.167
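
As a rough illustration of the 1:1 NAT listed above, each VM gets one DNAT rule for inbound traffic and one SNAT rule for outbound traffic. The helper name print_nat_pair is hypothetical, the rule comments seen in the listing are left out, and the commands are only printed so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper: prints the 1:1 NAT rule pair for one VM.
# $1 = public IP, $2 = private IP.
print_nat_pair() {
    pub=$1
    priv=$2
    # Inbound: rewrite the destination from the public to the private address.
    echo "iptables -t nat -A PREROUTING -d $pub -j DNAT --to-destination $priv"
    # Outbound: rewrite the source from the private to the public address.
    echo "iptables -t nat -A POSTROUTING -s $priv -j SNAT --to-source $pub"
}

print_nat_pair 51.38.166.166 10.38.166.166
print_nat_pair 51.38.166.167 10.38.166.167
```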

I also inserted a few rules in the raw table to help me trace the packets:

[root@network3] ~# iptables -t raw -nvL
Chain PREROUTING (policy ACCEPT 3765 packets, 227K bytes)
pkts bytes target     prot opt in     out     source               destination
   0     0 TRACE      all  --  *      *       51.38.166.167        0.0.0.0/0
 185 12988 TRACE      all  --  *      *       0.0.0.0/0            51.38.166.167

Chain OUTPUT (policy ACCEPT 7941 packets, 837K bytes)
pkts bytes target     prot opt in     out     source               destination

Test from VM1:

ubuntu@test-1:~$ ip a l dev ens3
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state UP group default qlen 1000
   link/ether fa:16:3e:51:0a:0b brd ff:ff:ff:ff:ff:ff
   inet 10.38.166.167/24 brd 10.38.166.255 scope global ens3
      valid_lft forever preferred_lft forever
   inet6 fe80::f816:3eff:fe51:a0b/64 scope link
      valid_lft forever preferred_lft forever

ubuntu@test-1:~$ curl ifconfig.co
51.38.166.167

ubuntu@test-1:~$ ping 51.38.166.166 -c 4
PING 51.38.166.166 (51.38.166.166) 56(84) bytes of data.

--- 51.38.166.166 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3031ms

Test from VM2:

ubuntu@test-2:~$ ip a l dev ens3
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state UP group default qlen 1000
   link/ether fa:16:3e:9d:79:ce brd ff:ff:ff:ff:ff:ff
   inet 10.38.166.166/24 brd 10.38.166.255 scope global ens3
      valid_lft forever preferred_lft forever
   inet6 fe80::f816:3eff:fe9d:79ce/64 scope link
      valid_lft forever preferred_lft forever

ubuntu@test-2:~$ curl ifconfig.co
51.38.166.166

ubuntu@test-2:~$ ping 51.38.166.167 -c 4
PING 51.38.166.167 (51.38.166.167) 56(84) bytes of data.

--- 51.38.166.167 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3023ms

Logs from network3:

[root@network3] ~# tail -f /var/log/kern.log | grep "SRC=10.38.166.166 DST=51.38.166.167"
Jul  5 11:58:12 network3 kernel: [79540.314496] TRACE: nat:PREROUTING:rule:32 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49094 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=57
Jul  5 11:58:13 network3 kernel: [79541.322501] TRACE: raw:PREROUTING:policy:3 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49203 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=58
Jul  5 11:58:13 network3 kernel: [79541.322543] TRACE: mangle:PREROUTING:policy:1 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49203 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=58
Jul  5 11:58:13 network3 kernel: [79541.322574] TRACE: nat:PREROUTING:rule:32 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49203 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=58
Jul  5 11:58:14 network3 kernel: [79542.330582] TRACE: raw:PREROUTING:policy:3 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49367 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=59
Jul  5 11:58:14 network3 kernel: [79542.330615] TRACE: mangle:PREROUTING:policy:1 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49367 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=59
Jul  5 11:58:14 network3 kernel: [79542.330639] TRACE: nat:PREROUTING:rule:32 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49367 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=59
^C

Since the ID does not change for a given SEQ, I can search the logs for anything about this ID/SEQ:

[root@network3] ~# grep "ID=49367" /var/log/kern.log
Jul  5 11:58:14 network3 kernel: [79542.330582] TRACE: raw:PREROUTING:policy:3 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49367 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=59
Jul  5 11:58:14 network3 kernel: [79542.330615] TRACE: mangle:PREROUTING:policy:1 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49367 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=59
Jul  5 11:58:14 network3 kernel: [79542.330639] TRACE: nat:PREROUTING:rule:32 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49367 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=59
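
The same search can be condensed so that only the sequence of tables and chains a packet went through is shown. This is a small sketch; the function name trace_path is made up. Note that each line carries two ID= fields (the IP ID and the ICMP ID), so the value passed in should be the IP ID.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log lines on stdin and prints, in order,
# the table:chain:type:number field of every TRACE line that has an ID= field
# equal to $1.
trace_path() {
    # Require a space (or line boundary) on both sides of the field so that
    # a shorter value such as 4936 does not match ID=49367.
    grep -E "(^| )ID=$1( |\$)" | sed -n 's/.*TRACE: \([^ ]*\) .*/\1/p'
}
```

Called as trace_path 49367 < /var/log/kern.log, the last hop printed is the last hook the packet was seen in. For the packet above that is nat:PREROUTING:rule:32.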

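If I refer to this diagram: http://inai.de/images/nf-packet-flow.png

it seems to be stuck at the routing decision. (I have ruled out the possibility of it being stuck at the bridging decision, because I get exactly the same behaviour when I do exactly the same thing without any bridge involved.)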

Another possibility is that it matches NAT PREROUTING rule 32 but the rule is not applied, though I don't know why that would be.

Any clue as to what I am missing here?

The most common reason for packets being dropped at the routing decision is rp_filter.

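Check the output of the command ip route get 51.38.166.167 from 10.38.166.166 iif br-ext. In the normal case it should return a valid route. A result of invalid cross-device link means the packet will be dropped by rp_filter. Also check nstat -az TcpExtIPReversePathFilter. It is the counter of packets dropped this way.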
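
The two checks can be wrapped in small helpers so the result is easier to read. This is only a sketch: the function names are made up, and the nstat parser assumes the usual layout of a counter name followed by its value on one line.

```shell
#!/bin/sh
# Hypothetical helper: classifies the text printed by "ip route get ...".
# $1 = combined stdout/stderr of that command.
classify_route_get() {
    lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $lower in
        *"invalid cross-device link"*) echo "rp_filter drop expected" ;;
        *) echo "no reverse-path error reported" ;;
    esac
}

# Hypothetical helper: reads nstat output on stdin and prints the value of
# the TcpExtIPReversePathFilter counter (assumed to be the second column).
rp_filter_drops() {
    awk '$1 == "TcpExtIPReversePathFilter" { print $2 }'
}
```

Possible usage on the router:

classify_route_get "$(ip route get 51.38.166.167 from 10.38.166.166 iif br-ext 2>&1)"
nstat -az TcpExtIPReversePathFilter | rp_filter_drops

If the counter grows while the ping from the VM is running, that points at rp_filter.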

Check the current rp_filter mode with the ip netconf show dev br-ext command.

Use the sysctl command to adjust this parameter.
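
One detail worth keeping in mind when tuning it: the kernel applies the larger of net.ipv4.conf.all.rp_filter and the per-interface value (0 = off, 1 = strict, 2 = loose). The sketch below computes that effective mode. The function names are made up, and show_rp_filter assumes the usual /proc/sys layout.

```shell
#!/bin/sh
# Prints the rp_filter mode that is effectively applied to an interface:
# the larger of the "all" value and the per-interface value.
# $1 = value of net.ipv4.conf.all.rp_filter, $2 = value for the interface.
effective_rp_filter() {
    if [ "$1" -gt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# Hypothetical wrapper: reads both values from /proc for interface $1.
show_rp_filter() {
    all=$(cat /proc/sys/net/ipv4/conf/all/rp_filter)
    dev=$(cat "/proc/sys/net/ipv4/conf/$1/rp_filter")
    echo "$1: all=$all dev=$dev effective=$(effective_rp_filter "$all" "$dev")"
}
```

For example, show_rp_filter br-ext prints the values for the bridge. Because of the maximum rule, setting the interface to 0 has no effect while all is 1. One option is loose mode on the interface, for instance sysctl -w net.ipv4.conf.br-ext.rp_filter=2. Whether loose mode is acceptable depends on the setup.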

Source: https://serverfault.com/questions/974057