Nagios
Nagios: NRPE: Unable to read output, and I can't figure out why. Can you?
I have a Nagios server and a monitored server. On the monitored server:
    [root@Monitored ~]# netstat -an |grep :5666
    tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN
    [root@Monitored ~]# locate check_kvm
    /usr/lib64/nagios/plugins/check_kvm
    [root@Monitored ~]# /usr/lib64/nagios/plugins/check_kvm -H localhost
    hosts:3 OK:3 WARN:0 CRIT:0 - ab2c7:running alpweb5:running istaweb5:running
    [root@Monitored ~]# /usr/lib64/nagios/plugins/check_nrpe -H localhost -c check_kvm
    NRPE: Unable to read output
    [root@Monitored ~]# /usr/lib64/nagios/plugins/check_nrpe -H localhost
    NRPE v2.14
    [root@Monitored ~]# ps -ef |grep nrpe
    nagios 21178 1 0 16:11 ? 00:00:00 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d
    [root@Monitored ~]#
On the Nagios server:
    [root@Nagios ~]# /usr/lib64/nagios/plugins/check_nrpe -H 1.1.1.159 -c check_kvm
    NRPE: Unable to read output
    [root@Nagios ~]# /usr/lib64/nagios/plugins/check_nrpe -H 1.1.1.159
    NRPE v2.14
    [root@Nagios ~]#
When I use the same command to check another server on the network, it works:
    [root@Nagios ~]# /usr/lib64/nagios/plugins/check_nrpe -H 1.1.1.80 -c check_kvm
    hosts:4 OK:4 WARN:0 CRIT:0 - karmisoft:running ab2c4:running kidumim1:running travel2gether1:running
    [root@Nagios ~]#
Running the check locally as the nagios account:
    [root@Monitored ~]# su - nagios
    -bash-4.1$ /usr/lib64/nagios/plugins/check_kvm
    hosts:3 OK:3 WARN:0 CRIT:0 - ab2c7:running alpweb5:running istaweb5:running
    -bash-4.1$
Running the check remotely from the Nagios server as the nagios account:
    -bash-4.1$ /usr/lib64/nagios/plugins/check_nrpe -H 1.1.1.159 -c check_kvm
    NRPE: Unable to read output
    -bash-4.1$ /usr/lib64/nagios/plugins/check_nrpe -H 1.1.1.159
    NRPE v2.14
    -bash-4.1$
Running the same check_kvm against a different server on the network as the nagios account:
    -bash-4.1$ /usr/lib64/nagios/plugins/check_nrpe -H 1.1.1.80 -c check_kvm
    hosts:4 OK:4 WARN:0 CRIT:0 - karmisoft:running ab2c4:running kidumim1:running travel2gether1:running
    -bash-4.1$
Permissions:
    -rwxr-xr-x. 1 root root 4684 2013-10-14 17:14 nrpe.cfg (aka /etc/nagios/nrpe.cfg)
    drwxrwxr-x. 3 nagios nagios 4096 2013-10-15 03:38 plugins (aka /usr/lib64/nagios/plugins)
/etc/sudoers:
    [root@Monitored ~]# grep -i requiretty /etc/sudoers
    #Defaults requiretty
iptables/selinux:
    [root@Monitored xinetd.d]# service iptables status
    iptables: Firewall is not running.
    [root@Monitored xinetd.d]# service ip6tables status
    ip6tables: Firewall is not running.
    [root@Monitored xinetd.d]# grep disable /etc/selinux/config
    # disabled - No SELinux policy is loaded.
    SELINUX=disabled
    [root@Monitored xinetd.d]#
The command in /etc/nagios/nrpe.cfg is:
    [root@Monitored ~]# grep kvm /etc/nagios/nrpe.cfg
    command[check_kvm]=sudo /usr/lib64/nagios/plugins/check_kvm
And the nagios user has been added to /etc/sudoers:
    nagios ALL=(ALL) NOPASSWD:/usr/lib64/nagios/plugins/check_kvm
    nagios ALL=(ALL) NOPASSWD:/usr/lib64/nagios/plugins/check_nrpe
check_kvm is a shell script that looks like this:
    #!/bin/sh
    LIST=$(virsh list --all | sed '1,2d' | sed '/^$/d'| awk '{print $2":"$3}')
    if [ ! "$LIST" ]; then
        EXITVAL=3 #Status 3 = UNKNOWN (orange)
        echo "Unknown guests"
        exit $EXITVAL
    fi
    OK=0
    WARN=0
    CRIT=0
    NUM=0
    for host in $(echo $LIST)
    do
        name=$(echo $host | awk -F: '{print $1}')
        state=$(echo $host | awk -F: '{print $2}')
        NUM=$(expr $NUM + 1)
        case "$state" in
            running|blocked) OK=$(expr $OK + 1) ;;
            paused) WARN=$(expr $WARN + 1) ;;
            shutdown|shut*|crashed) CRIT=$(expr $CRIT + 1) ;;
            *) CRIT=$(expr $CRIT + 1) ;;
        esac
    done
    if [ "$NUM" -eq "$OK" ]; then
        EXITVAL=0 #Status 0 = OK (green)
    fi
    if [ "$WARN" -gt 0 ]; then
        EXITVAL=1 #Status 1 = WARNING (yellow)
    fi
    if [ "$CRIT" -gt 0 ]; then
        EXITVAL=2 #Status 2 = CRITICAL (red)
    fi
    echo hosts:$NUM OK:$OK WARN:$WARN CRIT:$CRIT - $LIST
    exit $EXITVAL
Edit (October 22, 2013): After all of this, I can now get some response from the script:
    [root@Monitored ~]# /usr/lib64/nagios/plugins/check_nrpe -H localhost -c check_kvm
    Unknown guests
    [root@Monitored ~]# /usr/lib64/nagios/plugins/check_nrpe -H localhost
    NRPE v2.14
    [root@Monitored ~]# /usr/lib64/nagios/plugins/check_kvm
    hosts:3 OK:3 WARN:0 CRIT:0 - ab2c7:running alpweb5:running istaweb5:running
    [root@Monitored ~]# su - nagios
    -bash-4.1$ /usr/lib64/nagios/plugins/check_kvm
    hosts:3 OK:3 WARN:0 CRIT:0 - ab2c7:running alpweb5:running istaweb5:running
    -bash-4.1$ /usr/lib64/nagios/plugins/check_nrpe -H localhost -c check_kvm
    Unknown guests
    -bash-4.1$ /usr/lib64/nagios/plugins/check_nrpe -H localhost
    NRPE v2.14
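The "Unknown guests" reply corresponds to the branch of check_kvm where LIST is empty, that is, the virsh pipeline produced no guest rows when started through NRPE. A minimal sketch of that parsing step on canned input follows. `parse_guests` is a hypothetical helper name, and the sample text (with made-up guest names) is an assumption about the `virsh list --all` table layout, not captured output:

```shell
#!/bin/sh
# parse_guests is a hypothetical helper; the pipeline inside it is the one
# check_kvm uses. It reads `virsh list --all` style text on stdin.
parse_guests() {
    sed '1,2d' | sed '/^$/d' | awk '{print $2":"$3}'
}

# Canned input in the assumed layout: two header lines, then one row per guest.
sample=' Id Name State
----------------------------------
 1 guest1 running
 - guest2 shut off
'
printf '%s' "$sample" | parse_guests

# With no input at all (for example, virsh failed and wrote nothing to
# stdout), the pipeline prints nothing, LIST stays empty, and check_kvm
# takes the "Unknown guests" branch.
printf '' | parse_guests
```

If this reading is right, the thing to look at is why `virsh list --all` returns nothing in the environment NRPE provides, rather than the parsing itself.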
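It seems the problem is related to the check_nrpe command, or to the nrpe installation on the server.
Edit (December 2, 2013): Other checks against the problematic server work: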
Very detailed write-up, Itai! Have you tried reducing the complexity of the configuration to see whether it works?
For starters, I would change the line in nrpe.cfg to
    command[check_kvm]=/usr/lib64/nagios/plugins/check_kvm
and temporarily change the /usr/lib64/nagios/plugins/check_kvm script to something very simple, such as:
    #!/bin/sh
    echo Hi
    exit 0
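If that works, you can start adding complexity back. Perhaps, rather than giving the nagios user sudo access to the script, what it really needs is access to the virsh command, and you can leave the sudo part out of the command line in nrpe.cfg.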
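While adding complexity back, it can help to make failures visible. As far as I know, "NRPE: Unable to read output" means the daemon got nothing on stdout from the command, so an error written only to stderr (by sudo or virsh, for instance) is lost. A minimal sketch of a wrapper that folds stderr into stdout and always prints something; `run_visible` is a hypothetical name, not part of NRPE or the plugins:

```shell
#!/bin/sh
# run_visible is a hypothetical helper, not part of NRPE or the plugins.
# It runs the given command, merges stderr into stdout, and guarantees
# at least one line of output so the caller always has something to read.
run_visible() {
    output=$("$@" 2>&1)
    rc=$?
    if [ -z "$output" ]; then
        output="no output from: $* (exit status $rc)"
    fi
    printf '%s\n' "$output"
    return "$rc"
}
```

One way to try it would be to save this as a script that ends with `run_visible "$@"` and temporarily point the check_kvm entry in nrpe.cfg at that script, followed by the original command. The script's location is yours to choose. Whatever sudo or virsh complains about should then come back through check_nrpe instead of the generic error.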