Linux

Packet queueing performance discrepancies for BIND nameservers

  • March 17, 2013

Background:

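I've inherited a high-volume caching nameserver environment (Red Hat Enterprise Linux 5.8, IBM System x3550) that has inconsistent ring buffer settings: 1020 for eth0, 255 for eth1. eth0 is attached to switch 1 of its local datacenter, and eth1 is attached to switch 2 of the same datacenter. Each server in a cluster alternates between eth0 and eth1 as the active interface, and each cluster lives in a different region. The ring buffers obviously need to be made consistent.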

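This is where things get trickier: I discovered the above problem while researching why many of our nameservers frequently log "error sending response: unset" errors, which the ISC knowledge base indicates is related to outbound congestion. The servers with the higher ring buffer setting (1020) drop fewer packets according to ifconfig (as one would expect), but tend to log the above error with great frequency, about 20k times a day in one of my highest-load groups. We'll refer to this as "group 1". Servers with the lower ring buffer setting (255) drop more inbound packets per day (again, expected), but have far fewer instances of the BIND error, typically 0-150 in the same load group.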

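No big mystery here either. Caching DNS is a recursive service: if something isn't cached, the server has to make multiple queries on behalf of the question until an answer is finally returned. It's a (one in)->(many out) query relationship. Fixing the RX ring buffers should cause this number to even out to a new value across the board, and from there it is probably a good idea to tune the kernel's outbound network queues in proc (wmem_max/wmem_default).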
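
As a starting point for that tuning, the current outbound socket buffer values can be read straight out of /proc. The following is only a minimal sketch: `show_sysctl` is a hypothetical helper name that is not part of the report script, and it prints a placeholder when a key is not readable on the host.

```shell
#!/bin/sh
# Hypothetical helper: print a sysctl key by reading its file under /proc/sys.
# $1 = dotted key name, e.g. net.core.wmem_max
show_sysctl() {
    # Map the dotted key onto its /proc/sys path (dots become slashes).
    path="/proc/sys/$(printf '%s' "$1" | tr . /)"
    if [ -r "$path" ]; then
        printf '%s = %s\n' "$1" "$(cat "$path")"
    else
        printf '%s = (unavailable)\n' "$1"
    fi
}

show_sysctl net.core.wmem_default
show_sysctl net.core.wmem_max
```

Recording these values before and after a change gives the same kind of baseline as the packet report below. The ring sizes themselves come from `ethtool -g`, as in the dump further down.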


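I like being able to measure the impact of my configuration changes on performance problems, so I wrote a report to collect some numbers before I start making production changes. Here is example output from the first two servers in group 1: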

group1-01
   RX: 7166.27/sec av.
   TX: 7432.57/sec av.
   RXDROP: 7.43/sec av.
   unset_err: 27633
group1-02
   RX: 7137.37/sec av.
   TX: 7398.50/sec av.
   RXDROP: 9.94/sec av.
   unset_err: 107

These are the formulas. Note that this is a local script; it does not rely on shell scripts that would have to be maintained on each server.

   # The sa file's day-of-month is computed locally; the default-route interface is looked up on the remote host.
   # The route destination is compared as a quoted string so awk cannot treat it as a number.
   RXPACK=$(ssh "$server" "sar -n DEV -f /var/log/sa/sa$(date --date=yesterday '+%d') | grep \"Average: .*\$(awk '{if (\$2 == \"00000000\") { print \$1 }}' /proc/net/route)\" | awk '{print \$3}'" 2>/dev/null)
   TXPACK=$(ssh "$server" "sar -n DEV -f /var/log/sa/sa$(date --date=yesterday '+%d') | grep \"Average: .*\$(awk '{if (\$2 == \"00000000\") { print \$1 }}' /proc/net/route)\" | awk '{print \$4}'" 2>/dev/null)
   RXDROP=$(ssh "$server" "sar -n EDEV -f /var/log/sa/sa$(date --date=yesterday '+%d') | grep \"Average: .*\$(awk '{if (\$2 == \"00000000\") { print \$1 }}' /proc/net/route)\" | awk '{print \$6}'" 2>/dev/null)
   # TXDROP holds the count reported as unset_err above.
   TXDROP=$(ssh "$server" "sudo grep 'error sending response: unset' /var/log/dns_named.1" 2>/dev/null | wc -l)
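
The three sar formulas differ only in the sar mode and the column they print, so the extraction step could be factored into one function. This is a sketch, not the script I actually ran: `sar_avg_field` is a hypothetical name, and the sample rows are copied from the group1-01 output so the snippet runs without sar installed.

```shell
#!/bin/sh
# Hypothetical helper: print one column of the "Average:" row for one interface.
# $1 = interface name, $2 = 1-based awk column; sar output is read from stdin.
sar_avg_field() {
    awk -v ifc="$1" -v col="$2" '$1 == "Average:" && $2 == ifc { print $col }'
}

# Sample rows copied from the group1-01 report (sar -n DEV).
sample='12:00:01 AM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
Average:           lo   1269.15   1269.15 206600.39 206600.39      0.00      0.00      0.00
Average:         eth0   7166.27   7432.57 704051.80 2419779.42      0.00      0.00      0.94'

printf '%s\n' "$sample" | sar_avg_field eth0 3   # rxpck/s, prints 7166.27
printf '%s\n' "$sample" | sar_avg_field eth0 4   # txpck/s, prints 7432.57
```

Matching the interface name exactly in awk also avoids the loose `grep "Average: .*eth0"` pattern, which would match any row containing that string.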

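Once I started running this report across all of my caching DNS environments, I noticed another group with an almost identical packet load, which we'll call group 2, that has no problems at all: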

group2-01
   RX: 7066.44/sec av.
   TX: 7345.95/sec av.
   RXDROP: 0.00/sec av.
   unset_err: 0
group2-02
   RX: 7019.18/sec av.
   TX: 7312.47/sec av.
   RXDROP: 0.00/sec av.
   unset_err: 0

Question:

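Why does group2 perform this way without needing any further tuning of the RX ring buffers or net.core.wmem_default/net.core.wmem_max? I need to normalize the ring buffers regardless, but I'd like to understand what else is going on here before I start playing with the wmem values in /proc.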

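The only thing I can think of is that the queues are being emptied faster by the application, but network stack tuning isn't something I have much hands-on experience with, and I'd like a second opinion. (My eyes glaze over at some of the ethtool counter names, I won't deny it.)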
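
One way to make the ethtool counter dump easier to scan is to hide every counter that is still zero. A minimal sketch, assuming the `name: value` layout that `ethtool -S` prints in the dump below; `nonzero_counters` is a hypothetical helper name and the sample lines are copied from the group1-01 statistics.

```shell
#!/bin/sh
# Hypothetical helper: keep only counters with a non-zero value.
# Reads "name: value" lines (the ethtool -S layout) from stdin.
nonzero_counters() {
    awk -F': *' 'NF == 2 && $2 + 0 != 0 {
        sub(/^[ \t]+/, "", $1)      # drop the leading indentation
        print $1 ": " $2
    }'
}

# Sample lines copied from the group1-01 "NIC statistics" section.
sample='NIC statistics:
    rx_crc_errors: 0
    tx_xon_frames: 116422
    tx_xoff_frames: 134780
    rx_discards: 0
    rx_fw_discards: 14015105'

printf '%s\n' "$sample" | nonzero_counters
```

On a live host the same function could be fed from `sudo /sbin/ethtool -S "$GWIF"`. Keep in mind that these counters are cumulative, so they are not directly comparable to the per-second sar averages.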

I have ruled out the following possibilities. Proof is after the divider.

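  • Ring buffer layouts are identical. (The first servers of group1 and group2 are configured the same, and the second servers of group1 and group2 are configured the same.)
  • Default gateway layouts are identical.
  • The NICs are identical. (Broadcom BCM5708)
  • Firmware versions reported by ethtool are identical. (bc 4.0.3 ipms 1.6.0)
  • sysctl -a output matches between the first servers of both groups and between the second servers of both groups. (kernel and fs sections excluded)
  • The total number of servers in group 1 and group 2 is the same. (10)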

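For confidentiality reasons I cannot show the raw named.conf or the grep filter used to exclude information. You'll have to take my word for it that the following configuration parameters are constant across all four servers: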

   notify no;
   allow-transfer { none; };
   allow-recursion { any; };
   allow-query { any; };
   allow-query-cache { any; };
   recursive-clients 100000;
   max-cache-size 2G;
   max-ncache-ttl 900;

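Below is a large dump of system information. "hosthash" is only there to prove that each iteration of the loop really is hitting a different server, without revealing the actual hostnames.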

Host hashes:

group1-1: dc78abcb154b74c87feecb3f35222263d40c028c
group1-2: 9fe491d58fd1e7d4e21e5bf10c164e4cf66e884b
group2-1: fc76bb3ee1ff580c6aba0d685713bb4145bd5fe3
group2-2: b7550c65d37622a131b1e47f066773defbb4d817

for server in $group1_1 $group1_2 $group2_1 $group2_2
do
   echo ____________________
   ssh $server "echo -en hosthash: \$(echo \$HOSTNAME | sha1sum)\\\n\\\n &&
        SARFILE=/var/log/sa/sa\$(date --date=yesterday '+%d') &&
        uname -srvmpio &&
        sudo /usr/sbin/dmidecode -s system-product-name
        dmesg | grep Broadcom &&
        head /proc/cpuinfo &&
        GWIF=\$(awk '{if (\$2 == \"00000000\") { print \$1 }}' /proc/net/route) &&
        sar -n DEV -f \$SARFILE | egrep '(IFACE|Average)' &&
        sar -n EDEV -f \$SARFILE | egrep '(IFACE|Average)' &&
        sudo /sbin/ethtool \$GWIF &&
        sudo /sbin/ethtool -i \$GWIF &&
        sudo /sbin/ethtool -g \$GWIF &&
        sudo /sbin/ethtool -c \$GWIF &&
        sudo /sbin/ethtool -S \$GWIF &&
        echo sysctl linecount: \$(sudo /sbin/sysctl -a | egrep -v '^(fs|kernel)' | wc -l) &&
        echo sysctl hash: \$(sudo /sbin/sysctl -a | egrep -v '^(fs|kernel)' | sha1sum)"
done

Output:

____________________
hosthash: dc78abcb154b74c87feecb3f35222263d40c028c -

Linux 2.6.18-308.16.1.el5 #1 SMP Tue Sep 18 07:21:07 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
IBM System x3550 -[7978AC1]-
bnx2: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.1.11 (July 20, 2011)
eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem c8000000, IRQ 90, node addr 001a649db00e
eth1: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem ce000000, IRQ 177, node addr 001a649db010
cnic: Broadcom NetXtreme II CNIC Driver cnic v2.5.7 (July 20, 2011)
Broadcom NetXtreme II iSCSI Driver bnx2i v2.7.0.3 (Aug 04, 2011)
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Xeon(R) CPU           E5420  @ 2.50GHz
stepping    : 6
cpu MHz     : 2493.750
cache size  : 6144 KB
physical id : 0
siblings    : 4
12:00:01 AM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
Average:           lo   1269.15   1269.15 206600.39 206600.39      0.00      0.00      0.00
Average:         eth0   7166.27   7432.57 704051.80 2419779.42      0.00      0.00      0.94
Average:         eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         sit0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:00:01 AM     IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
Average:           lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         eth0      0.00      0.00      0.00      7.43      0.00      0.00      0.00      0.00      0.00
Average:         eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         sit0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
driver: bnx2
version: 2.1.11
firmware-version: bc 4.0.3 ipms 1.6.0
bus-info: 0000:04:00.0
Ring parameters for eth0:
Pre-set maximums:
RX:     2040
RX Mini:    0
RX Jumbo:   8160
TX:     255
Current hardware settings:
RX:     1020
RX Mini:    0
RX Jumbo:   0
TX:     255

Coalesce parameters for eth0:
Adaptive RX: off  TX: off
stats-block-usecs: 999936
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 18
rx-frames: 12
rx-usecs-irq: 18
rx-frames-irq: 2

tx-usecs: 80
tx-frames: 20
tx-usecs-irq: 18
tx-frames-irq: 2

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

NIC statistics:
    rx_bytes: 1505439501410
    rx_error_bytes: 0
    tx_bytes: 4672574845104
    tx_error_bytes: 0
    rx_ucast_packets: 15315548049
    rx_mcast_packets: 2035415
    rx_bcast_packets: 1101989
    tx_ucast_packets: 15505474251
    tx_mcast_packets: 40018
    tx_bcast_packets: 36019
    tx_mac_errors: 0
    tx_carrier_errors: 0
    rx_crc_errors: 0
    rx_align_errors: 0
    tx_single_collisions: 0
    tx_multi_collisions: 0
    tx_deferred: 0
    tx_excess_collisions: 0
    tx_late_collisions: 0
    tx_total_collisions: 0
    rx_fragments: 0
    rx_jabbers: 0
    rx_undersize_packets: 0
    rx_oversize_packets: 0
    rx_64_byte_packets: 92309552
    rx_65_to_127_byte_packets: 1243637891
    rx_128_to_255_byte_packets: 790117566
    rx_256_to_511_byte_packets: 127197337
    rx_512_to_1023_byte_packets: 168929387
    rx_1024_to_1522_byte_packets: 11591832
    rx_1523_to_9022_byte_packets: 0
    tx_64_byte_packets: 60586118
    tx_65_to_127_byte_packets: 1976738758
    tx_128_to_255_byte_packets: 2830395753
    tx_256_to_511_byte_packets: 157607989
    tx_512_to_1023_byte_packets: 1483716940
    tx_1024_to_1522_byte_packets: 406821340
    tx_1523_to_9022_byte_packets: 0
    rx_xon_frames: 0
    rx_xoff_frames: 0
    tx_xon_frames: 116422
    tx_xoff_frames: 134780
    rx_mac_ctrl_frames: 0
    rx_filtered_packets: 0
    rx_ftq_discards: 0
    rx_discards: 0
    rx_fw_discards: 14015105
sysctl linecount: 504
sysctl hash: dd6aab90d0fd9ae90742c5f812a78734e2f2ff1c -
____________________
hosthash: 9fe491d58fd1e7d4e21e5bf10c164e4cf66e884b -

Linux 2.6.18-308.16.1.el5 #1 SMP Tue Sep 18 07:21:07 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
IBM System x3550 -[7978EHU]-
bnx2: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.1.11 (July 20, 2011)
eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem c8000000, IRQ 90, node addr 001a6479655c
eth1: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem ce000000, IRQ 177, node addr 001a6479655e
cnic: Broadcom NetXtreme II CNIC Driver cnic v2.5.7 (July 20, 2011)
Broadcom NetXtreme II iSCSI Driver bnx2i v2.7.0.3 (Aug 04, 2011)
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Xeon(R) CPU           E5420  @ 2.50GHz
stepping    : 6
cpu MHz     : 2493.746
cache size  : 6144 KB
physical id : 0
siblings    : 4
12:00:01 AM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
Average:           lo   1261.04   1261.04 205548.08 205548.08      0.00      0.00      0.00
Average:         eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         eth1   7137.37   7398.50 702340.35 2409580.71      0.00      0.00      0.97
Average:         sit0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:00:01 AM     IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
Average:           lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         eth1      0.00      0.00      0.00      9.94      0.00      0.00      0.00      0.00      0.00
Average:         sit0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
driver: bnx2
version: 2.1.11
firmware-version: bc 4.0.3 ipms 1.6.0
bus-info: 0000:06:00.0
Ring parameters for eth1:
Pre-set maximums:
RX:     2040
RX Mini:    0
RX Jumbo:   8160
TX:     255
Current hardware settings:
RX:     255
RX Mini:    0
RX Jumbo:   0
TX:     255

Coalesce parameters for eth1:
Adaptive RX: off  TX: off
stats-block-usecs: 999936
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 18
rx-frames: 12
rx-usecs-irq: 18
rx-frames-irq: 2

tx-usecs: 80
tx-frames: 20
tx-usecs-irq: 18
tx-frames-irq: 2

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

NIC statistics:
    rx_bytes: 1501719289640
    rx_error_bytes: 0
    tx_bytes: 4654179094291
    tx_error_bytes: 0
    rx_ucast_packets: 15253610508
    rx_mcast_packets: 2108112
    rx_bcast_packets: 1136240
    tx_ucast_packets: 15438361249
    tx_mcast_packets: 40135
    tx_bcast_packets: 1721
    tx_mac_errors: 0
    tx_carrier_errors: 0
    rx_crc_errors: 0
    rx_align_errors: 0
    tx_single_collisions: 0
    tx_multi_collisions: 0
    tx_deferred: 0
    tx_excess_collisions: 0
    tx_late_collisions: 0
    tx_total_collisions: 0
    rx_fragments: 0
    rx_jabbers: 0
    rx_undersize_packets: 0
    rx_oversize_packets: 0
    rx_64_byte_packets: 92376678
    rx_65_to_127_byte_packets: 1183040190
    rx_128_to_255_byte_packets: 788176623
    rx_256_to_511_byte_packets: 126838328
    rx_512_to_1023_byte_packets: 168170816
    rx_1024_to_1522_byte_packets: 13350337
    rx_1523_to_9022_byte_packets: 0
    tx_64_byte_packets: 60806588
    tx_65_to_127_byte_packets: 1955234150
    tx_128_to_255_byte_packets: 2806601346
    tx_256_to_511_byte_packets: 154015585
    tx_512_to_1023_byte_packets: 1466206531
    tx_1024_to_1522_byte_packets: 405928513
    tx_1523_to_9022_byte_packets: 0
    rx_xon_frames: 0
    rx_xoff_frames: 0
    tx_xon_frames: 150648
    tx_xoff_frames: 173552
    rx_mac_ctrl_frames: 0
    rx_filtered_packets: 1
    rx_ftq_discards: 0
    rx_discards: 0
    rx_fw_discards: 19605427
sysctl linecount: 504
sysctl hash: 4626e3788c72e091487afe1e3a7cfd32278ab07d -
____________________
hosthash: fc76bb3ee1ff580c6aba0d685713bb4145bd5fe3 -

Linux 2.6.18-308.16.1.el5 #1 SMP Tue Sep 18 07:21:07 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
IBM System x3550 -[7978AC1]-
bnx2: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.1.11 (July 20, 2011)
eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem c8000000, IRQ 90, node addr 001a649dc68a
eth1: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem ce000000, IRQ 177, node addr 001a649dc68c
cnic: Broadcom NetXtreme II CNIC Driver cnic v2.5.7 (July 20, 2011)
Broadcom NetXtreme II iSCSI Driver bnx2i v2.7.0.3 (Aug 04, 2011)
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Xeon(R) CPU           E5420  @ 2.50GHz
stepping    : 6
cpu MHz     : 2493.750
cache size  : 6144 KB
physical id : 0
siblings    : 4
12:00:01 AM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
Average:           lo   1891.67   1891.67 266593.77 266593.77      0.00      0.00      0.00
Average:         eth0   7066.44   7345.95 730519.41 2215508.99      0.00      0.00      4.37
Average:         eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         sit0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:00:01 AM     IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
Average:           lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         sit0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
driver: bnx2
version: 2.1.11
firmware-version: bc 4.0.3 ipms 1.6.0
bus-info: 0000:04:00.0
Ring parameters for eth0:
Pre-set maximums:
RX:     2040
RX Mini:    0
RX Jumbo:   8160
TX:     255
Current hardware settings:
RX:     1020
RX Mini:    0
RX Jumbo:   0
TX:     255

Coalesce parameters for eth0:
Adaptive RX: off  TX: off
stats-block-usecs: 999936
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 18
rx-frames: 12
rx-usecs-irq: 18
rx-frames-irq: 2

tx-usecs: 80
tx-frames: 20
tx-usecs-irq: 18
tx-frames-irq: 2

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

NIC statistics:
    rx_bytes: 4640887074833
    rx_error_bytes: 0
    tx_bytes: 12640942400790
    tx_error_bytes: 0
    rx_ucast_packets: 46405845860
    rx_mcast_packets: 14487857
    rx_bcast_packets: 3476467
    tx_ucast_packets: 47159091638
    tx_mcast_packets: 118147
    tx_bcast_packets: 5504
    tx_mac_errors: 0
    tx_carrier_errors: 0
    rx_crc_errors: 0
    rx_align_errors: 0
    tx_single_collisions: 0
    tx_multi_collisions: 0
    tx_deferred: 0
    tx_excess_collisions: 0
    tx_late_collisions: 0
    tx_total_collisions: 0
    rx_fragments: 0
    rx_jabbers: 0
    rx_undersize_packets: 0
    rx_oversize_packets: 0
    rx_64_byte_packets: 136463411
    rx_65_to_127_byte_packets: 4245502343
    rx_128_to_255_byte_packets: 2357984838
    rx_256_to_511_byte_packets: 355610202
    rx_512_to_1023_byte_packets: 608223572
    rx_1024_to_1522_byte_packets: 65320154
    rx_1523_to_9022_byte_packets: 0
    tx_64_byte_packets: 112166114
    tx_65_to_127_byte_packets: 3010346100
    tx_128_to_255_byte_packets: 4087240164
    tx_256_to_511_byte_packets: 1625596725
    tx_512_to_1023_byte_packets: 3037109096
    tx_1024_to_1522_byte_packets: 927187571
    tx_1523_to_9022_byte_packets: 0
    rx_xon_frames: 0
    rx_xoff_frames: 0
    tx_xon_frames: 79164
    tx_xoff_frames: 89685
    rx_mac_ctrl_frames: 0
    rx_filtered_packets: 1
    rx_ftq_discards: 0
    rx_discards: 0
    rx_fw_discards: 6857729
sysctl linecount: 504
sysctl hash: dd6aab90d0fd9ae90742c5f812a78734e2f2ff1c -
____________________
hosthash: b7550c65d37622a131b1e47f066773defbb4d817 -

Linux 2.6.18-308.16.1.el5 #1 SMP Tue Sep 18 07:21:07 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
IBM System x3550 -[7978EHU]-
bnx2: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.1.11 (July 20, 2011)
eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem c8000000, IRQ 90, node addr 00215e3f1ec4
eth1: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem ce000000, IRQ 177, node addr 00215e3f1ec6
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Xeon(R) CPU           E5420  @ 2.50GHz
stepping    : 6
cpu MHz     : 2493.753
cache size  : 6144 KB
physical id : 1
siblings    : 4
12:00:01 AM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
Average:           lo   1883.04   1883.04 263726.79 263726.79      0.00      0.00      0.00
Average:         eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         eth1   7019.18   7312.47 720911.92 2214861.10      0.00      0.00      1.02
Average:         sit0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:00:01 AM     IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
Average:           lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:         sit0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
driver: bnx2
version: 2.1.11
firmware-version: bc 4.0.3 ipms 1.6.0
bus-info: 0000:06:00.0
Ring parameters for eth1:
Pre-set maximums:
RX:     2040
RX Mini:    0
RX Jumbo:   8160
TX:     255
Current hardware settings:
RX:     255
RX Mini:    0
RX Jumbo:   0
TX:     255

Coalesce parameters for eth1:
Adaptive RX: off  TX: off
stats-block-usecs: 999936
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 18
rx-frames: 12
rx-usecs-irq: 18
rx-frames-irq: 2

tx-usecs: 80
tx-frames: 20
tx-usecs-irq: 18
tx-frames-irq: 2

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

NIC statistics:
    rx_bytes: 4621548539323
    rx_error_bytes: 0
    tx_bytes: 12598031299743
    tx_error_bytes: 0
    rx_ucast_packets: 46260356368
    rx_mcast_packets: 5352446
    rx_bcast_packets: 3474589
    tx_ucast_packets: 47008853953
    tx_mcast_packets: 118164
    tx_bcast_packets: 5471
    tx_mac_errors: 0
    tx_carrier_errors: 0
    rx_crc_errors: 0
    rx_align_errors: 0
    tx_single_collisions: 0
    tx_multi_collisions: 0
    tx_deferred: 0
    tx_excess_collisions: 0
    tx_late_collisions: 0
    tx_total_collisions: 0
    rx_fragments: 0
    rx_jabbers: 0
    rx_undersize_packets: 0
    rx_oversize_packets: 0
    rx_64_byte_packets: 126851062
    rx_65_to_127_byte_packets: 4117708205
    rx_128_to_255_byte_packets: 2346047550
    rx_256_to_511_byte_packets: 356266112
    rx_512_to_1023_byte_packets: 604666332
    rx_1024_to_1522_byte_packets: 62938478
    rx_1523_to_9022_byte_packets: 0
    tx_64_byte_packets: 111216848
    tx_65_to_127_byte_packets: 2984505931
    tx_128_to_255_byte_packets: 4027485330
    tx_256_to_511_byte_packets: 1577669672
    tx_512_to_1023_byte_packets: 3015060448
    tx_1024_to_1522_byte_packets: 933575954
    tx_1523_to_9022_byte_packets: 0
    rx_xon_frames: 0
    rx_xoff_frames: 0
    tx_xon_frames: 129873
    tx_xoff_frames: 145090
    rx_mac_ctrl_frames: 0
    rx_filtered_packets: 1
    rx_ftq_discards: 0
    rx_discards: 0
    rx_fw_discards: 6752713
sysctl linecount: 504
sysctl hash: 4626e3788c72e091487afe1e3a7cfd32278ab07d -

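Even if you're sure that your server has the complete list of load balancer VIPs, run a packet capture anyway. *Just because your machine won't respond to ARP for an IP address doesn't mean that bogus packets can't be sent to it.* Make sure that the traffic being sent to your MAC address matches the configured IP addresses.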

I appreciate the time people spent on this question, but my own due diligence was lacking here. In hindsight, I needed to build a PCAP filter like this:

tcpdump -i eth0 -n 'ether dst aa:bb:cc:dd:ee:ff and not (dst host 1.2.3.4 or dst host 5.6.7.8 or...)'

Where:

aa:bb:cc:dd:ee:ff = HW addr of eth0
1.2.3.4, 5.6.7.8  = list of destination addresses that traffic is expected on
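
With a long VIP list the filter expression gets tedious to type by hand, so it can be generated. The following is a sketch under assumptions: `build_filter` is a hypothetical helper, and the MAC and addresses are the same placeholders used above.

```shell
#!/bin/sh
# Hypothetical helper: build a pcap filter that matches frames sent to our MAC
# whose destination IP is NOT one of the expected addresses.
# $1 = hardware address of the interface, remaining args = expected destination IPs
build_filter() {
    mac=$1
    shift
    hosts=""
    for ip in "$@"; do
        if [ -n "$hosts" ]; then
            hosts="$hosts or "
        fi
        hosts="${hosts}dst host $ip"
    done
    if [ -n "$hosts" ]; then
        printf 'ether dst %s and not (%s)\n' "$mac" "$hosts"
    else
        # No expected addresses given: match everything sent to the MAC.
        printf 'ether dst %s\n' "$mac"
    fi
}

build_filter aa:bb:cc:dd:ee:ff 1.2.3.4 5.6.7.8
# prints: ether dst aa:bb:cc:dd:ee:ff and not (dst host 1.2.3.4 or dst host 5.6.7.8)
```

The result can be passed to tcpdump as a single quoted argument, for example `tcpdump -i eth0 -n "$(build_filter aa:bb:cc:dd:ee:ff 1.2.3.4 5.6.7.8)"`. Anything that capture shows is traffic arriving for an address the server is not expected to serve.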

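There were a number of load balancer VIPs that had not been provided to me (I don't control the LBs), and the way they were passing traffic on TCP port 53 was causing the RX drops. Traffic on these old IPs was low enough that an admin watching traffic on the wire would be unlikely to notice it.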

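Wonder if this box is a Dell? There is a well-known issue with the Dell-supplied bnx2i driver and chipset. The result is random packet drops under heavy network load. If that is the case, it seems logical that the adjusted ring buffer could trigger it.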

I believe Dell offers their own version of the driver as a fix. Another fix is to do something like this in modprobe.conf:

options bnx2i disable_msi=1

Either way, it doesn't hurt to give it a try. And x2 on what kce said: this is one of the best-written questions I've seen here.

Source: https://serverfault.com/questions/463836