Monitoring

Munin disk latency alerts

  • June 20, 2017

I have set up my Munin server and its alerts, and tested them. I configured the disk usage alerts as follows:

df._dev_mapper_centos_root.warning 90
df._dev_md126p2.warning 90
df._dev_md126p1.warning 90
df._dev_mapper_centos_home.warning 90
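The field names in these settings are derived from device paths. As a rough sketch of that mapping (an approximation of Munin's field-name cleaning, not the plugin's actual code), any character that is not a letter, digit, or underscore becomes an underscore. The function name `munin_fieldname` and the device paths are assumptions inferred from the field names above.

```shell
#!/bin/sh
# Approximate Munin's field-name cleaning: a leading character that is
# not a letter or underscore becomes "_", and every other character
# outside [A-Za-z0-9_] becomes "_" as well.
# munin_fieldname is a hypothetical helper, not part of Munin.
munin_fieldname() {
    printf '%s\n' "$1" | sed -e 's/^[^A-Za-z_]/_/' -e 's/[^A-Za-z0-9_]/_/g'
}

# Device paths below are guesses that match the df field names above.
munin_fieldname /dev/mapper/centos-root   # -> _dev_mapper_centos_root
munin_fieldname /dev/md126p2              # -> _dev_md126p2
```

Prefixing the result with `df.` and appending `.warning` gives the setting name to use for a given device.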

I received the following alerts by email (I kept the values low for testing):

>  sha :: Server2 :: Disk usage in percent
>         WARNINGs: /boot is 33.48 (outside range [:33]), / is 17.95 (outside range [:17]), /boot/efi is 4.73 (outside range [:4]).
> 
> sha :: Server1 :: Disk usage in percent
>         OKs: /boot is 33.48, / is 17.95, /boot/efi is 4.73

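The problem I am facing now is that I receive disk latency alerts, but I cannot find any setting that changes their thresholds. Here are a few of the alerts Munin has triggered: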

> sha :: Server1 :: Disk latency per device :: Average latency
> for /dev/centos/swap
>         WARNINGs: Write IO Wait time is 4.89 (outside range [0:3]).
> 
> sha :: Server1 :: Disk latency per device :: Average latency
> for /dev/centos/home
>         WARNINGs: Write IO Wait time is 10.64 (outside range [0:3]).

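Even though a per-device disk latency graph exists for this server, when I telnet to the node I do not see any plugin listed that would supply these values: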

telnet 192.168.10.252 4949
Trying 192.168.10.252...
Connected to 192.168.10.252.
Escape character is '^]'.
# munin node at localhost.localdomain
list
acpi cpu df df_inode entropy exim_mailqueue forks fw_conntrack 
fw_forwarded_local fw_packets hddtemp_smartctl if_enp2s0 if_err_enp2s0 
interrupts irqstats load memory netstat open_files open_inodes 
postfix_mailqueue proc_pri processes swap threads uptime users vmstat

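I am not sure I have explained this well, and I apologize if this seems like a silly question. I just want to either stop these alerts completely or raise the thresholds. I hope I can get some help here.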

It is probably the diskstats_latency plugin. Try the following:

diskstats_latency.centos_home.avgwrwait.warning 0:15
diskstats_latency.centos_home.avgrdwait.warning 0:15
diskstats_latency.centos_swap.avgwrwait.warning 0:15
diskstats_latency.centos_swap.avgrdwait.warning 0:15

Note that this covers both write (avgwrwait) and read (avgrdwait) latency.

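I set the range to 0:15, which is above the 4.89 and 10.64 values in your alerts and should all but silence these warnings, as you wanted.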
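To make the range notation concrete, here is a minimal sketch of how a `min:max` threshold can be evaluated. It is an illustration only, not Munin's own code, and `in_range` is a hypothetical helper. It treats a bare number such as `90` as an upper limit, which matches the `[:33]` style shown in the alert mails.

```shell
#!/bin/sh
# in_range VALUE RANGE prints "ok" if VALUE lies inside RANGE,
# otherwise "outside". RANGE is "min:max", "min:", ":max", or a bare
# number, which is treated as ":max". Hypothetical helper for illustration.
in_range() {
    value=$1
    range=$2
    case $range in
        *:*) min=${range%%:*}; max=${range#*:} ;;
        *)   min=""; max=$range ;;
    esac
    # awk handles the floating-point comparison; empty bounds are skipped.
    awk -v v="$value" -v lo="$min" -v hi="$max" 'BEGIN {
        if (lo != "" && v + 0 < lo + 0) { print "outside"; exit }
        if (hi != "" && v + 0 > hi + 0) { print "outside"; exit }
        print "ok"
    }'
}

in_range 4.89 0:3     # -> outside (swap write wait, default range)
in_range 10.64 0:3    # -> outside (home write wait, default range)
in_range 10.64 0:15   # -> ok      (same value, proposed range)
```

With the proposed 0:15 range, both values from the alert mails would fall inside the limits.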

Don't forget to restart the Munin daemon:

systemctl restart munin-node
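The `list` output in the question shows no disk latency plugin. One likely explanation, offered here as an assumption rather than something confirmed in this thread, is that the graphs come from a multigraph plugin (called `diskstats` in stock Munin 2.x installs), which munin-node only lists after the client announces the `multigraph` capability. The sketch below builds such a session; `munin_session` is a hypothetical helper, and the commented-out line assumes `nc` (netcat) is installed.

```shell
#!/bin/sh
# Print the commands for a short munin-node session: announce the
# multigraph capability, list the plugins, show one plugin's config
# (which includes the field names and default thresholds), then quit.
# munin_session is a hypothetical helper for illustration.
munin_session() {
    printf 'cap multigraph\nlist\nconfig %s\nquit\n' "$1"
}

# Show what would be sent to the node.
munin_session diskstats

# To run it against the node from the question (assumes nc is available):
# munin_session diskstats | nc 192.168.10.252 4949
```

If the plugin is present, the `config` output should show the graph and field names that the `diskstats_latency.*` settings refer to.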

Source: https://serverfault.com/questions/849075