Monit

How do I disable Monit instance start/stop alerts?

  • May 20, 2016

Monit sends out an alert every time the monit daemon stops or starts. This is annoying and useless information.

Per the documentation, I set:

set alert user@mycompany.com but not on { instance }

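…which should send alerts to that email address unless they fall in the "instance" category, which is defined as start/stop.

However, I still get the alerts. It's very annoying. Clearly I must be missing something.

We're running Monit 5.2.4.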

According to the documentation, Monit can generate a number of alerts:

Event:     | Failure state:              | Success state:
---------------------------------------------------------------------
action     | "Action done"               | "Action done"
checksum   | "Checksum failed"           | "Checksum succeeded"
bytein     | "Download bytes exceeded"   | "Download bytes ok"
byteout    | "Upload bytes exceeded"     | "Upload bytes ok"
connection | "Connection failed"         | "Connection succeeded"
content    | "Content failed"            | "Content succeeded"
data       | "Data access error"         | "Data access succeeded"
exec       | "Execution failed"          | "Execution succeeded"
fsflags    | "Filesystem flags failed"   | "Filesystem flags succeeded"
gid        | "GID failed"                | "GID succeeded"
icmp       | "Ping failed"               | "Ping succeeded"
instance   | "Monit instance changed"    | "Monit instance changed not"
invalid    | "Invalid type"              | "Type succeeded"
link       | "Link down"                 | "Link up"
nonexist   | "Does not exist"            | "Exists"
packetin   | "Download packets exceeded" | "Download packets ok"
packetout  | "Upload packets exceeded"   | "Upload packets ok"
permission | "Permission failed"         | "Permission succeeded"
pid        | "PID failed"                | "PID succeeded"
ppid       | "PPID failed"               | "PPID succeeded"
resource   | "Resource limit matched"    | "Resource limit succeeded"
saturation | "Saturation exceeded"       | "Saturation ok"
size       | "Size failed"               | "Size succeeded"
speed      | "Speed failed"              | "Speed ok"
status     | "Status failed"             | "Status succeeded"
timeout    | "Timeout"                   | "Timeout recovery"
timestamp  | "Timestamp failed"          | "Timestamp succeeded"
uid        | "UID failed"                | "UID succeeded"
uptime     | "Uptime failed"             | "Uptime succeeded"

We were able to work around this issue by setting (addresses changed to protect the innocent):

SET ALERT important-messages@projectlocker.com ON { invalid, nonexist, timeout, resource, size, timestamp}
SET ALERT less-important-messages@projectlocker.com ON {action, permission, pid, ppid, instance, status}

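This successfully routes messages to the addresses we care about. You can set alerts globally or locally (per service), but ours are global only.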
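
To illustrate the global/local distinction, here is a minimal sketch rather than our actual configuration. The addresses, the service name and the pidfile path are made up for the example; check the alert syntax against the manual for your Monit version.

```
# Global: applies to every service in the control file.
# (ops@example.com is a placeholder address.)
set alert ops@example.com on { nonexist, timeout }

# Local: applies only to the service it is declared in.
# (sshd and its pidfile path are assumptions for illustration.)
check process sshd with pidfile /var/run/sshd.pid
    alert sshd-admin@example.com on { pid, ppid }
```

With something like this, ops@example.com would hear about nonexist and timeout events from any service, while sshd-admin@example.com would only hear about PID and PPID changes of sshd.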

The subheadings under SERVICE TESTS at http://mmonit.com/monit/documentation/monit.html correspond fairly neatly to the event types above.

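For each scheduled program or function of a server, you should be able to state in plain English what matters to you, and match that wish against one of the tests mentioned in SERVICE TESTS. For example, if I'm running Apache, I know that I care about:

  • Is the PID in the PID file still running? (nonexist)
  • Has the PID changed without my knowing it? (pid)
  • Is the service responding to a restart in a timely fashion? (timeout)

For a custom daemon that polls, I might care that a log file gets updated regularly with a status message (timestamp).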
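
The checks described above could be sketched roughly as follows. This is an assumption-laden example, not a tested configuration: the pidfile and log paths, the init script, the 15-minute threshold, the restart limits and the address are all hypothetical, and the exact test syntax may differ between Monit versions.

```
# Apache: alert if the process is gone, its PID changed, or restarts keep failing.
# (Paths and address are placeholders.)
check process apache with pidfile /var/run/apache2.pid
    start program = "/etc/init.d/apache2 start"
    stop program  = "/etc/init.d/apache2 stop"
    if changed pid then alert
    if 5 restarts within 5 cycles then timeout
    alert admin@example.com only on { nonexist, pid, timeout }

# Custom polling daemon: alert if its log has not been written to recently.
# (mydaemon_log and its path are hypothetical.)
check file mydaemon_log with path /var/log/mydaemon.log
    if timestamp > 15 minutes then alert
    alert admin@example.com only on { timestamp }
```

The event names in each alert line are the ones from the table, so each plain-English concern maps onto one entry in the braces.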

I am using Monit version 5.2.5, and using the following has stopped the Monit instance alerts:

set alert example@gmail.com not {instance}

Source: https://serverfault.com/questions/253263