Linux routing bug?

January 9, 2019

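I have been struggling for a while with this problem, which is not easy to reproduce. I am using Linux kernel v3.1.0, and sometimes routing to a few IP addresses does not work. What seems to happen is that instead of sending the packet to the gateway, the kernel treats the destination address as local and tries to get its MAC address via ARP.
For example, my current IP address is 172.16.1.104/24 and the gateway is 172.16.1.254: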
# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:1B:63:97:FC:DC
         inet addr:172.16.1.104  Bcast:172.16.1.255  Mask:255.255.255.0
         UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
         RX packets:230772 errors:0 dropped:0 overruns:0 frame:0
         TX packets:171013 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:1000
         RX bytes:191879370 (182.9 Mb)  TX bytes:47173253 (44.9 Mb)
         Interrupt:17

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.16.1.254    0.0.0.0         UG    0      0        0 eth0
172.16.1.0      0.0.0.0         255.255.255.0   U     1      0        0 eth0
I can ping several addresses, but not 172.16.0.59:
# ping -c1 172.16.1.254
PING 172.16.1.254 (172.16.1.254) 56(84) bytes of data.
64 bytes from 172.16.1.254: icmp_seq=1 ttl=64 time=0.383 ms

--- 172.16.1.254 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.383/0.383/0.383/0.000 ms
root@pozsybook:~# ping -c1 172.16.0.1
PING 172.16.0.1 (172.16.0.1) 56(84) bytes of data.
64 bytes from 172.16.0.1: icmp_seq=1 ttl=63 time=5.54 ms

--- 172.16.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.545/5.545/5.545/0.000 ms
root@pozsybook:~# ping -c1 172.16.0.2
PING 172.16.0.2 (172.16.0.2) 56(84) bytes of data.
64 bytes from 172.16.0.2: icmp_seq=1 ttl=62 time=7.92 ms

--- 172.16.0.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.925/7.925/7.925/0.000 ms
root@pozsybook:~# ping -c1 172.16.0.59
PING 172.16.0.59 (172.16.0.59) 56(84) bytes of data.
From 172.16.1.104 icmp_seq=1 Destination Host Unreachable

--- 172.16.0.59 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
While trying to ping 172.16.0.59, I can see in tcpdump that an ARP request is sent:
# tcpdump -n -i eth0|grep ARP
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
15:25:16.671217 ARP, Request who-has 172.16.0.59 tell 172.16.1.104, length 28
And the /proc/net/arp entry for 172.16.0.59 is incomplete:
# grep 172.16.0.59 /proc/net/arp
172.16.0.59      0x1         0x0         00:00:00:00:00:00     *        eth0
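To spot every unresolved entry at once rather than grepping for a single address, the flags column can be filtered. This is a minimal sketch, assuming the usual /proc/net/arp column order (IP address, HW type, Flags, HW address, Mask, Device) and that a flags value of 0x0 marks an entry with no resolved MAC; the function name `incomplete_arp_entries` is made up for illustration.

```shell
# Print "IP device" for every ARP entry whose flags are 0x0 (no MAC resolved).
# $1: path to a table in /proc/net/arp format (defaults to /proc/net/arp).
incomplete_arp_entries() {
    # NR > 1 skips the header line; $3 is the Flags column, $6 the device.
    awk 'NR > 1 && $3 == "0x0" { print $1, $6 }' "${1:-/proc/net/arp}"
}
```

Called with no argument it reads the live table. An off-link address such as 172.16.0.59 showing up here suggests the kernel is trying to resolve it directly instead of using the gateway.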
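Note that 172.16.0.59 is reachable from other computers on this LAN.
Does anyone have an idea what is going on? Thanks.
**Update:** replies to the comments below:
- there are no other interfaces besides eth0 and lo
- the ARP request cannot be seen at the other end, but that is how it should work. The main problem is that the ARP request should not be sent in the first place
- the problem persists even if I add an explicit route with the command "route add -host 172.16.0.59 gw 172.16.1.254 dev eth0"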

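This is indeed a Linux kernel bug, probably present since version 2.6.39. I have posted the problem to the lkml and netdev lists (see the thread at https://lkml.org/lkml/2011/11/18/191), and it was just discussed in another netdev thread at http://www.spinics.net/lists/netdev/msg179687.html
The current workaround is either to reboot, or to flush all routes and wait 10 minutes for the ICMP redirects to expire. To prevent it from happening again,
echo 0 >/proc/sys/net/ipv4/conf/eth0/accept_redirects
helps.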
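To check which interfaces still accept ICMP redirects after applying the setting above, the per-interface files can be scanned. This is only a sketch: the function name `list_redirect_accepting` is hypothetical, and the optional directory argument exists purely so the function can be pointed at a copy of the tree.

```shell
# List the conf entries whose accept_redirects value is not 0.
# $1: base directory (defaults to /proc/sys/net/ipv4/conf).
list_redirect_accepting() (
    base=${1:-/proc/sys/net/ipv4/conf}
    for f in "$base"/*/accept_redirects; do
        # Skip the unexpanded glob pattern and unreadable files.
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" != "0" ]; then
            # The parent directory name is the interface (or "all"/"default").
            basename "$(dirname "$f")"
        fi
    done
)
```

Running `list_redirect_accepting` with no argument inspects the live system. The output may include `all` and `default`; the kernel documentation describes the effective behaviour as a combination of the `all` value and the per-interface value, so those entries are worth checking as well.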

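The default subnet mask for 172.16.X.X is 255.255.0.0, and you have reconfigured it to 255.255.255.0. So hosts 172.16.0.x and 172.16.1.x are on different subnets, and traffic between them will be routed via the default gateway.
Changing the subnet mask to 255.255.0.0 will solve the problem.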
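The mask arithmetic behind this can be sketched in plain shell. The helper names `ip_to_int` and `same_subnet` are made up for illustration, and the sketch assumes well-formed dotted-quad addresses without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The subshell body keeps the IFS change local to this function.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if addresses $1 and $2 fall in the same network for prefix length $3.
same_subnet() (
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
)

same_subnet 172.16.1.104 172.16.0.59 24 && echo on-link || echo "via gateway"
same_subnet 172.16.1.104 172.16.0.59 16 && echo on-link || echo "via gateway"
```

With the /24 mask from the question the first call prints "via gateway", and with a /16 mask the second prints "on-link", in which case the host would ARP for 172.16.0.59 directly.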
Can you provide a diagram? If you can't draw a network, it can't be fixed (old network engineer's proverb... mine!).
Cheers,

Quoted from: https://serverfault.com/questions/331561
