CentOS

Two servers behave differently when the DHCP server fails

  • November 23, 2020

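I have no control over the DHCP server on my network. Every two weeks or so there is an outage lasting a few hours, during which my CentOS 7.8 servers get no response to their DHCP renewal requests. As far as I know, these servers are configured identically. During the outage, some servers keep sending DHCP requests until a renewal succeeds and the system rejoins the network. Others, however, seem to hit some corner case: they stop sending DHCP requests after a while and never come back onto the network. Can anyone tell from the differences in the logs below what is happening differently?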

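server003 is a failure case.

server004 is a good case. Thanks!

Some odd things I see on the failing server003:

"bound: renewal in 2134686840 seconds."

"Trying recorded lease 192.168.2.72"

192.168.2.72 belongs to a very old network we used to use. Did dhclient actually set this IP on the interface?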

server003 logs:

Nov 18 07:01:41 got DHCP
Nov 18 07:21:02 something killed, MFE?
Nov 18 09:00:09 DHCP started failing
Nov 18 09:00:09 server003 dhclient[45214]: DHCPREQUEST on enp4s0 to 10.20.193.131 port 67 (xid=0x44d64e6c)
-- DHCPREQUEST on enp4s0 repeatedly till 12:01 --
Nov 18 12:01:27 server003 dhclient[45214]: DHCPREQUEST on enp4s0 to 255.255.255.255 port 67 (xid=0x44d64e6c)
Nov 18 12:01:41 server003 avahi-daemon[1973]: Withdrawing address record for 10.20.232.222 on enp4s0.
Nov 18 12:01:41 server003 avahi-daemon[1973]: Leaving mDNS multicast group on interface enp4s0.IPv4 with address 10.20.232.222.
Nov 18 12:01:41 server003 avahi-daemon[1973]: Interface enp4s0.IPv4 no longer relevant for mDNS.
Nov 18 12:01:42 server003 NetworkManager[2553]: <info>  [1605718902.1357] dhcp4 (enp4s0): state changed bound -> expire
Nov 18 12:01:42 server003 NetworkManager[2553]: <info>  [1605718902.1364] device (enp4s0): DHCPv4: 480 seconds grace period started
Nov 18 12:01:42 server003 NetworkManager[2553]: <info>  [1605718902.1469] dhcp4 (enp4s0): state changed expire -> unknown
Nov 18 12:02:41 server003 dhclient[45214]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 2 (xid=0x40f44748)
Nov 18 12:02:43 server003 dhclient[45214]: No DHCPOFFERS received.
Nov 18 12:02:43 server003 dhclient[45214]: Trying recorded lease 192.168.2.72
Nov 18 12:02:43 server003 NetworkManager[2553]: <info>  [1605718963.8053] dhcp4 (enp4s0): state changed unknown -> timeout
Nov 18 12:02:43 server003 dhclient[45214]: bound: renewal in 2134686840 seconds.
Nov 18 12:09:42 server003 NetworkManager[2553]: <info>  [1605719382.2119] device (enp4s0): DHCPv4: grace period expired
Nov 18 13:06:02 server003 NetworkManager[2553]: <info>  [1605722762.3311] policy: set 'enp4s0' (enp4s0) as default for IPv6 routing and DNS
-- nothing else after this --

server004 logs:

Nov 18 07:27:10 got DHCP
Nov 18 09:26:55 DHCP started failing
Nov 18 09:26:55 server004 dhclient[5179]: DHCPREQUEST on enp4s0 to 10.20.193.131 port 67 (xid=0x26458456)
-- DHCPREQUEST on enp4s0 repeatedly till 12:27 --
Nov 18 12:27:04 server004 dhclient[5179]: DHCPREQUEST on enp4s0 to 255.255.255.255 port 67 (xid=0x26458456)
Nov 18 12:27:10 server004 avahi-daemon[1869]: Withdrawing address record for 10.20.232.229 on enp4s0.
Nov 18 12:27:10 server004 avahi-daemon[1869]: Leaving mDNS multicast group on interface enp4s0.IPv4 with address 10.20.232.229.
Nov 18 12:27:10 server004 avahi-daemon[1869]: Interface enp4s0.IPv4 no longer relevant for mDNS.
Nov 18 12:27:11 server004 NetworkManager[2609]: <info>  [1605720431.3993] dhcp4 (enp4s0): state changed bound -> expire
Nov 18 12:27:11 server004 NetworkManager[2609]: <info>  [1605720431.4000] device (enp4s0): DHCPv4: 480 seconds grace period started
Nov 18 12:27:11 server004 NetworkManager[2609]: <info>  [1605720431.4106] dhcp4 (enp4s0): state changed expire -> unknown
Nov 18 12:27:11 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 5 (xid=0x1e6890d0)
Nov 18 12:27:16 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 9 (xid=0x1e6890d0)
Nov 18 12:27:25 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 10 (xid=0x1e6890d0)
Nov 18 12:27:35 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 15 (xid=0x1e6890d0)
Nov 18 12:27:50 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 15 (xid=0x1e6890d0)
Nov 18 12:28:05 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 7 (xid=0x1e6890d0)
Nov 18 12:28:12 server004 dhclient[5179]: No DHCPOFFERS received.
Nov 18 12:28:12 server004 dhclient[5179]: No working leases in persistent database - sleeping.
Nov 18 12:28:12 server004 NetworkManager[2609]: <info>  [1605720492.1971] dhcp4 (enp4s0): state changed unknown -> fail
Nov 18 12:32:08 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 8 (xid=0x47045c24)
Nov 18 12:32:16 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 13 (xid=0x47045c24)
Nov 18 12:32:29 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 20 (xid=0x47045c24)
Nov 18 12:32:49 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 14 (xid=0x47045c24)
Nov 18 12:33:03 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 6 (xid=0x47045c24)
Nov 18 12:33:09 server004 dhclient[5179]: No DHCPOFFERS received.
Nov 18 12:33:09 server004 dhclient[5179]: No working leases in persistent database - sleeping.
-- DHCPDISCOVER -> No DHCPOFFERS received -> DHCPDISCOVER happens repeatedly every 5 mins until 13:05 and then got DHCP
Nov 18 13:05:04 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 7 (xid=0x6a52e1ae)
Nov 18 13:05:11 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 14 (xid=0x6a52e1ae)
Nov 18 13:05:25 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 15 (xid=0x6a52e1ae)
Nov 18 13:05:40 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 17 (xid=0x6a52e1ae)
Nov 18 13:05:40 server004 dhclient[5179]: DHCPREQUEST on enp4s0 to 255.255.255.255 port 67 (xid=0x6a52e1ae)
Nov 18 13:05:40 server004 dhclient[5179]: DHCPOFFER from 10.20.232.1
Nov 18 13:05:40 server004 dhclient[5179]: DHCPACK from 10.20.232.1 (xid=0x6a52e1ae)
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0476] dhcp4 (enp4s0):   address 10.20.232.229
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0483] dhcp4 (enp4s0):   plen 22 (255.255.252.0)
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0483] dhcp4 (enp4s0):   gateway 10.20.232.1
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0483] dhcp4 (enp4s0):   lease time 18000
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0483] dhcp4 (enp4s0):   nameserver '10.20.10.49'
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0483] dhcp4 (enp4s0):   nameserver '10.20.10.48'
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0484] dhcp4 (enp4s0):   domain name 'company.com'
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0484] dhcp4 (enp4s0): state changed fail -> bound
Nov 18 13:05:40 server004 dhclient[5179]: bound to 10.20.232.229 -- renewal in 7548 seconds.
Nov 18 13:05:40 server004 avahi-daemon[1869]: Joining mDNS multicast group on interface enp4s0.IPv4 with address 10.20.232.229.
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0519] policy: set 'enp4s0' (enp4s0) as default for IPv4 routing and DNS
Nov 18 13:05:40 server004 avahi-daemon[1869]: New relevant interface enp4s0.IPv4 for mDNS.
Nov 18 13:05:40 server004 avahi-daemon[1869]: Registering new address record for 10.20.232.229 on enp4s0.IPv4.
Nov 18 13:05:40 server004 dbus[1896]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Nov 18 13:05:40 server004 systemd: Starting Network Manager Script Dispatcher Service...
Nov 18 13:05:40 server004 dbus[1896]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Nov 18 13:05:40 server004 systemd: Started Network Manager Script Dispatcher Service.
Nov 18 13:05:40 server004 nm-dispatcher: req:1 'dhcp4-change' [enp4s0]: new request (4 scripts)
Nov 18 13:05:40 server004 nm-dispatcher: req:1 'dhcp4-change' [enp4s0]: start running ordered scripts...
Nov 18 13:06:02 server004 NetworkManager[2609]: <info>  [1605722762.4182] policy: set 'enp4s0' (enp4s0) as default for IPv6 routing and DNS

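Your server003 has remembered an old DHCP lease for address 192.168.2.72 with a very long expiration time (the logged "renewal in 2134686840 seconds" is roughly 67 years), and it falls back to that lease when the DHCP server is unavailable. That address is not even on the same network as the last legitimate address it held, 10.20.232.222; the 10.20.193.131 in the log is the address its renewal requests were sent to, not an address assigned to the server.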
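
To see which leases dhclient has on record before removing anything, the lease files can be inspected. This is a minimal sketch, assuming NetworkManager keeps its dhclient lease files in /var/lib/NetworkManager/ (the path used in the cleanup command in this answer) and that they follow the usual dhclient format with "fixed-address" and "expire" lines. The function name show_leases is made up for illustration.

```shell
#!/bin/sh
# Print the recorded address and expiry of every lease in the given
# dhclient lease files, prefixed with the file each line came from.
show_leases() {
    grep -H -E '^[[:space:]]*(fixed-address|expire)[[:space:]]' "$@"
}

# Example (as root); the path is an assumption about this system:
#   show_leases /var/lib/NetworkManager/*.lease
```

A lease block whose fixed-address is 192.168.2.72 would be the stale one that dhclient fell back to.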

You should clear that server's DHCP leases and then restart the DHCP client.

Something like this should do it while keeping the network link up (as root):

rm -f /var/lib/NetworkManager/*.lease; killall dhclient; nmcli device reapply enp4s0
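
If you would rather keep a copy of the old lease files than delete them outright, a more cautious variant of the first step is to move them aside. This is a rough sketch only: backup_leases and the /root/lease-backup directory are hypothetical names, not part of NetworkManager or dhclient.

```shell
#!/bin/sh
# Move every *.lease file from src_dir into dest_dir instead of deleting it.
# Other files in src_dir are left untouched.
backup_leases() {
    src_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    for f in "$src_dir"/*.lease; do
        [ -e "$f" ] || continue      # the glob matched nothing
        mv -- "$f" "$dest_dir"/ || return 1
    done
}

# Example (as root), followed by the same restart steps as the one-liner:
#   backup_leases /var/lib/NetworkManager /root/lease-backup
#   killall dhclient; nmcli device reapply enp4s0
```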

Source: https://serverfault.com/questions/1043644