Icinga 1 host status is UNREACHABLE, but all checks are OK

June 22, 2017

This is a distributed Icinga 1 environment.
About 100 hosts on my Icinga 1 client/satellite are in the UNREACHABLE state. All four checks on each host return OK, but the overall status of the host is UNREACHABLE.
The problem may have been caused by running Icinga 1 with the wrong permissions on /usr/lib64/nagios/plugins/check_icmp. (check_icmp did not have the suid bit set.)
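One way to confirm the permission problem described above is to test the plugin's setuid bit directly. This is a minimal sketch: `has_setuid` is a hypothetical helper name, and the suggested fix in the comment assumes the plugin should be owned by root, which may differ depending on how the plugins were packaged.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the file exists and its setuid bit is set.
has_setuid() {
    [ -u "$1" ]
}

plugin=/usr/lib64/nagios/plugins/check_icmp   # path taken from the post

if has_setuid "$plugin"; then
    echo "setuid bit is set on $plugin"
else
    echo "setuid bit is NOT set on $plugin (or the file is missing)"
    # Assumed fix, run as root; adjust owner and group to your packaging:
    #   chown root "$plugin" && chmod u+s "$plugin"
fi
```

`ls -l` on the same file shows an `s` in the owner execute position when the bit is set.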
So I stopped Icinga and emptied the state retention file on the satellite (state_retention_file=/var/spool/icinga/retention.dat), but that did not help. Would emptying the same file on the master help?
ps shows my submit_check_result.sh and submit_host_check.sh scripts running as zombies, but they are short-lived.
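A zombie is a process that has exited but whose parent has not yet collected its exit status, so short-lived ones are normally harmless. To see whether any linger, something like the following could be used. It is a sketch that assumes a procps-style `ps`; `filter_zombies` and `list_zombies` are hypothetical helper names.

```shell
#!/bin/sh
# Keep only rows whose third column (process state) starts with Z.
filter_zombies() {
    awk '$3 ~ /^Z/ { print }'
}

# Print PID, parent PID, state and command name of every zombie process.
list_zombies() {
    ps -eo pid=,ppid=,stat=,comm= | filter_zombies
}

list_zombies
```

Running it a few seconds apart shows whether the same PIDs persist or whether the entries are reaped quickly.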

This seems to have fixed the problem.

cat /etc/icinga/scripts/submit_check_result.sh

return_code=-1

case "$3" in
   OK)
       return_code=0
       ;;
   WARNING)
       return_code=1
       ;;
   CRITICAL)
       return_code=2
       ;;
   UNKNOWN)
       return_code=-1
       ;;
esac

# pipe the service check info into the send_nsca program, which
# in turn transmits the data to the nsca daemon on the central
# monitoring server
# submit to master Icinga den-mon-prod

/usr/bin/printf "%s\t%s\t%s\t%s\n" "$1" "$2" "$return_code" "$4" | /usr/sbin/send_nsca -H 111.14.219.31 -c /etc/nagios/send_nsca.cfg &

cat /etc/icinga/scripts/submit_host_check.sh

return_code=-1

case "$2" in
   UP)
       return_code=0
       ;;
   DOWN)
       return_code=1
       ;;
   UNREACHABLE)
       return_code=3
       ;;
esac

/usr/bin/printf "%s\t%s\t%s\t%s\n" "$1" "$2" "$return_code" "$4" | /usr/sbin/send_nsca -H 111.14.219.31 -c /etc/nagios/send_nsca.cfg &
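For comparison, the input format documented for send_nsca uses three tab-separated fields for a host check (host name, return code, plugin output) and four for a service check. The sketch below shows how a host result line could be built in that three-field form. It assumes the passive host check codes 0 = UP, 1 = DOWN, 2 = UNREACHABLE and an argument order of host, state, output. Neither is taken from the command definitions in the post, and the function names are hypothetical.

```shell
#!/bin/sh
# Map a host state name to a passive host check code.
# Assumption: 0 = UP, 1 = DOWN, 2 = UNREACHABLE.
host_state_to_code() {
    case "$1" in
        UP)          echo 0 ;;
        DOWN)        echo 1 ;;
        UNREACHABLE) echo 2 ;;
        *)           return 1 ;;   # unrecognised state: print nothing and fail
    esac
}

# Build one host check line: host<TAB>return_code<TAB>plugin_output
format_host_result() {
    code=$(host_state_to_code "$2") || return 1
    printf '%s\t%s\t%s\n' "$1" "$code" "$3"
}

# Possible use inside the forwarding script (server and config as in the post):
#   format_host_result "$1" "$2" "$3" |
#       /usr/sbin/send_nsca -H 111.14.219.31 -c /etc/nagios/send_nsca.cfg
```

Keeping the formatting in a function makes it possible to inspect the exact line before it is piped to send_nsca.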

I had to restore my check forwarding scripts on the client.

Here are the broken ones.

# BEGIN submit_check_result.sh
##############################

return_code=-1

case "$3" in
   OK)
       return_code=0
       ;;
   WARNING)
       return_code=1
       ;;
   CRITICAL)
       return_code=2
       ;;
   CRITICAL)
       return_code=2
       ;;
esac
/usr/bin/printf "%s\t%s\t%s\t%s\n" "$1" "$2" "$return_code" "$4" | /usr/sbin/send_nsca -H 111.14.219.31 -c /etc/nagios/send_nsca.cfg &
# END Check_result

##############################

# BEGIN submit_host_result.sh

##############################

return_code=2

case "$3" in
   OK)
       return_code=0
       ;;
   WARNING)
       return_code=1
       ;;
   CRITICAL)
       return_code=2
       ;;
   UNKNOWN)
       return_code=2
       ;;
esac

# END Check_host
##############################

Source: https://serverfault.com/questions/857334
