Monitoring
Munin disk latency alerts
I have set up my Munin server and alerts, and I have tested them. I configured the disk usage alerts as follows:
df._dev_mapper_centos_root.warning 90
df._dev_md126p2.warning 90
df._dev_md126p1.warning 90
df._dev_mapper_centos_home.warning 90
I received the following alerts by email (I kept the values low for testing):
> sha :: Server2 :: Disk usage in percent
> WARNINGs: /boot is 33.48 (outside range [:33]), / is 17.95 (outside range [:17]), /boot/efi is 4.73 (outside range [:4]).
>
> sha :: Server1 :: Disk usage in percent
> OKs: /boot is 33.48, / is 17.95, /boot/efi is 4.73
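The problem I'm facing now is that I am receiving disk latency alerts, but I can't find any setting that changes their thresholds. Here are a few of the alerts Munin triggered: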
> sha :: Server1 :: Disk latency per device :: Average latency for /dev/centos/swap
> WARNINGs: Write IO Wait time is 4.89 (outside range [0:3]).
>
> sha :: Server1 :: Disk latency per device :: Average latency for /dev/centos/home
> WARNINGs: Write IO Wait time is 10.64 (outside range [0:3]).
Although the per-device disk latency graph exists for this server, when I telnet to the node I don't see any plugin that provides these values:
telnet 192.168.10.252 4949
Trying 192.168.10.252...
Connected to 192.168.10.252.
Escape character is '^]'.
# munin node at localhost.localdomain
list
acpi cpu df df_inode entropy exim_mailqueue forks fw_conntrack fw_forwarded_local fw_packets hddtemp_smartctl if_enp2s0 if_err_enp2s0 interrupts irqstats load memory netstat open_files open_inodes postfix_mailqueue proc_pri processes swap threads uptime users vmstat
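I'm not sure I have explained this properly, and I apologize if this seems like a silly question. I just want to either stop these alerts completely or raise the thresholds. I hope I can get some help here.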
It is probably the diskstats_latency plugin. Try the following:
diskstats_latency.centos_home.avgwrwait.warning 0:15
diskstats_latency.centos_home.avgrdwait.warning 0:15
diskstats_latency.centos_swap.avgwrwait.warning 0:15
diskstats_latency.centos_swap.avgrdwait.warning 0:15
Note that this covers both write (avgwrwait) and read (avgrdwait) latency.
I set the range to 0:15, which should all but disable the warnings, as you wanted.
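To see why the alerts fired and why 0:15 quiets them, here is a minimal Python sketch of how a Munin-style `min:max` range can be read. The helper names `parse_range` and `in_range` are hypothetical and not part of Munin. It assumes that either bound may be left empty and that a bare number such as `90` acts as an upper bound only.

```python
def parse_range(spec):
    """Split a 'min:max' threshold string into (low, high); a missing side is None."""
    low, sep, high = spec.partition(":")
    if not sep:
        # Assumption: a bare number (e.g. "90") is an upper bound only.
        low, high = "", spec
    return (float(low) if low else None, float(high) if high else None)


def in_range(value, spec):
    """Return True when value lies inside the range, i.e. no warning is raised."""
    low, high = parse_range(spec)
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True
```

With the values from the alerts in the question, `in_range(4.89, "0:3")` and `in_range(10.64, "0:3")` are both false, which is why the warnings were sent, while both values fall inside `0:15`.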
Don't forget to restart the Munin daemon:
systemctl restart munin-node
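As for why `list` in the telnet session shows no disk latency plugin: as far as I know, munin-node only lists multigraph plugins after the client announces `cap multigraph`, and the latency graphs usually come from the multigraph `diskstats` plugin. Whether that is the case on this node is an assumption. A minimal Python sketch to check it follows; `list_plugins` is a hypothetical helper, not part of Munin.

```python
import socket


def list_plugins(host, port=4949, multigraph=True, timeout=5.0):
    """Return the plugin names a munin-node reports for the `list` command."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with sock.makefile("rw", encoding="utf-8", newline="\n") as stream:
            stream.readline()  # banner line, e.g. "# munin node at ..."
            if multigraph:
                # Announce multigraph support before asking for the list.
                stream.write("cap multigraph\n")
                stream.flush()
                stream.readline()  # the node's one-line capability reply
            stream.write("list\n")
            stream.flush()
            plugins = stream.readline().split()
            stream.write("quit\n")
            stream.flush()
    return plugins
```

Calling `list_plugins("192.168.10.252")` and comparing the result with `list_plugins("192.168.10.252", multigraph=False)` should show whether a `diskstats` entry appears only in the first case. The same check can be done by hand by typing `cap multigraph` before `list` in the telnet session.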