
Nagios notification definitions

  • February 3, 2016

I am trying to monitor a web server by searching a page for a particular string over HTTP. The command is defined in command.cfg as follows:

# 'check_http-mysite' command definition
define command {
       command_name check_http-mysite
       command_line /usr/lib/nagios/plugins/check_http -H mysite.example.com -s "Some text"
}

# 'notify-host-by-sms' command definition
define command {
       command_name  notify-host-by-sms 
       command_line  /usr/bin/send_sms $CONTACTPAGER$ "Nagios - $NOTIFICATIONTYPE$: Host $HOSTALIAS$ is $HOSTSTATE$ ($OUTPUT$)"
}
# 'notify-service-by-sms' command definition
define command {
       command_name  notify-service-by-sms 
       command_line  /usr/bin/send_sms $CONTACTPAGER$ "Nagios - $NOTIFICATIONTYPE$: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ ($OUTPUT$)"
}

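Now, if Nagios does not find "Some text" on the home page of mysite.example.com, it should notify the contact by SMS through the Clickatell HTTP API. I have a script for this, which I have tested and found to work.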
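The check can also be run by hand to confirm what Nagios sees. Nagios plugins report their result through the exit status (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The following is a minimal sketch that reuses the plugin path and host name from the command definition above; `state_name` is a hypothetical helper written for illustration, not part of Nagios.

```shell
#!/bin/sh
# Map a Nagios plugin exit status to its state name.
# state_name is a hypothetical helper for illustration only.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;   # 3, and anything out of range
    esac
}

# Plugin path and host are taken from the command definition above.
PLUGIN=/usr/lib/nagios/plugins/check_http

# Only run the check if the plugin is installed at that path.
if [ -x "$PLUGIN" ]; then
    "$PLUGIN" -H mysite.example.com -s "Some text"
    status=$?
    echo "check_http state: $(state_name "$status")"
fi
```

If the string is missing from the page, the plugin should exit with a non-OK status, which is what the web interface reflects.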

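Whenever I change the command definition to search for a string that is not on the page and restart Nagios, the web interface shows that the string was not found. What I do not understand is why no notification is sent, even though I have defined the host, host group, contact, contact group, service and so on. What am I missing? My definitions are below. Through the CGI web interface I can see that notifications are defined and enabled, yet I receive neither email nor SMS notifications on hard state changes.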
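One way to narrow down a symptom like this is to check whether Nagios recorded a notification attempt at all. Nagios writes `HOST NOTIFICATION:` and `SERVICE NOTIFICATION:` entries to its main log when it runs a notification command. The sketch below filters for those entries; `notification_attempts` is a hypothetical helper, and the default log path is an assumption that varies between distributions and Nagios versions.

```shell
#!/bin/sh
# List notification attempts recorded in a Nagios log file.
# notification_attempts is a hypothetical helper; the default log path
# is an assumption and should be replaced with the log_file value
# from your own nagios.cfg.
notification_attempts() {
    log_file="${1:-/var/log/nagios/nagios.log}"
    grep -E '(HOST|SERVICE) NOTIFICATION:' "$log_file"
}

# Example: notification_attempts /path/to/nagios.log
```

If entries naming `notify-service-by-sms` appear, Nagios did run the command, and the fault is more likely in the command or script itself than in the object definitions. If nothing appears, the definitions or the state logic deserve a closer look.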

Host configuration file

define host {
       use                     generic-host
       host_name               HAL
       alias                   IBM-1
       address                 xxx.xxx.xxx.xxx
       check_command           check_http-mysite     
}

hostgroups_nagios2.cfg

# my website
define hostgroup{
      hostgroup_name  my-servers
      alias           All My Servers
      members         HAL 
}

contacts_nagios2.cfg

define contact {
       contact_name                    colin   
       alias                           Colin Y
       service_notification_period     24x7
       host_notification_period        24x7
       service_notification_options    w,u,c,r,f,s
       host_notification_options       d,u,r,f,s
       service_notification_commands   notify-service-by-email,notify-service-by-sms
       host_notification_commands      notify-host-by-email,notify-host-by-sms
       email                           myaccount@mysite.com
       pager                           +254xxxxxxxxx
}

define contactgroup{
       contactgroup_name   site_admin 
       alias               Site Administrator
       members             colin 
}

services_nagios2.cfg

# check for particular string in page via http 
define service {
       hostgroup_name                  my-servers
       service_description             STRING CHECK
       check_command                   check_http-mysite
       use                             generic-service
       notification_interval           0 ; set > 0 if you want to be renotified
       contacts                        colin
       contact_groups                  site_admin
}

Can someone tell me where I am going wrong?

Here are the generic-host and generic-service definitions:

generic-service_nagios2.cfg

# generic service template definition
define service{
       name                            generic-service ; The 'name' of this service template
       active_checks_enabled           1       ; Active service checks are enabled
       passive_checks_enabled          1       ; Passive service checks are enabled/accepted
       parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
       obsess_over_service             1       ; We should obsess over this service (if necessary)
       check_freshness                 0       ; Default is to NOT check service 'freshness'
       notifications_enabled           1       ; Service notifications are enabled
       event_handler_enabled           1       ; Service event handler is enabled
       flap_detection_enabled          1       ; Flap detection is enabled
       failure_prediction_enabled      1       ; Failure prediction is enabled
       process_perf_data               1       ; Process performance data
       retain_status_information       1       ; Retain status information across program restarts
       retain_nonstatus_information    1       ; Retain non-status information across program restarts
       notification_interval           0       ; Only send notifications on status change by default.
       is_volatile                     0
       check_period                    24x7
       normal_check_interval           5
       retry_check_interval            1
       max_check_attempts              4
       notification_period             24x7
       notification_options            w,u,c,r
       contact_groups                  site_admin
       register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}

generic-host_nagios2.cfg

define host{
       name                            generic-host    ; The name of this host template
       notifications_enabled           1       ; Host notifications are enabled
       event_handler_enabled           1       ; Host event handler is enabled
       flap_detection_enabled          1       ; Flap detection is enabled
       failure_prediction_enabled      1       ; Failure prediction is enabled
       process_perf_data               1       ; Process performance data
       retain_status_information       1       ; Retain status information across program restarts
       retain_nonstatus_information    1       ; Retain non-status information across program restarts
       max_check_attempts              10
       notification_interval           0
       notification_period             24x7
       notification_options            d,u,r
       contact_groups                  site_admin
       register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}

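I figured it out. The configuration was actually fine. The problem was that Nagios runs the SMS script as the user 'nagios', and that user had no permission to write to the script's log file in /tmp/. None of the blog posts I read about setting up Nagios notifications via SMS explained this. I more or less had to work it out myself, and it nearly made my head explode.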
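A permissions problem of this kind can usually be reproduced by running the notification script as the same user Nagios uses. The sketch below is one way to check it, under stated assumptions: `can_write_log` is a hypothetical helper, and `/tmp/send_sms.log` is a placeholder because the post does not name the actual log file.

```shell
#!/bin/sh
# Succeed if the current user could append to the given log file.
# can_write_log is a hypothetical helper; the real log path used by
# the SMS script is not given in the post.
can_write_log() {
    log_file="$1"
    if [ -e "$log_file" ]; then
        # The file already exists: it must be writable by this user.
        [ -w "$log_file" ]
    else
        # The file does not exist yet: its directory must exist and be writable.
        log_dir=$(dirname "$log_file")
        [ -d "$log_dir" ] && [ -w "$log_dir" ]
    fi
}

# Example, run from a root shell (placeholder path and phone number):
#   sudo -u nagios /usr/bin/send_sms +254xxxxxxxxx "test as nagios user"
#   sudo -u nagios sh -c '[ -w /tmp/send_sms.log ] && echo writable || echo "not writable"'
```

A plausible cause, assuming the script was first tested from another account, is that the log file in /tmp/ was created and owned by that account, so the 'nagios' user could not append to it later.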

Source: https://serverfault.com/questions/102590