Nagios - check_ntp_time - Offset unknown

January 7, 2019

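I run a local NTP server on a subnet to keep the other nodes on that subnet in sync, instead of having every node sync with upstream servers. However, while setting up the check_ntp_time plugin for Nagios, I ran into a frustrating problem: Nagios keeps reporting critical errors for the local nodes that sync with the local NTP server.
Here is the NTP configuration on the local NTP server. Note the upstream server entries and the restrict entry for the local subnet, which, according to my research, is what lets local nodes sync from this server.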
driftfile /var/lib/ntp/drift

# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default kod limited nomodify notrap nopeer noquery
restrict -6 default kod limited nomodify notrap nopeer noquery

# Permit all access over the loopback interface.  This could
# be tightened as well, but to do so would affect some of
# the administrative functions.
restrict 127.0.0.1
restrict -6 ::1

# Makes me able to answer requests from local nodes
restrict 10.0.0.0 mask 255.255.192.0 nomodify notrap

# My source
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org
server 2.centos.pool.ntp.org

logfile /var/log/ntp/server.log

statistics loopstats
statsdir /var/log/ntp/
filegen peerstats file peers type day link enable
filegen loopstats file loops type day link enable
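On the local nodes that are not the NTP server, everything is the same except that the restrict entry is removed and the server entries reference only the local NTP server: server ntp.example.com iburst.
Every local node can resolve ntp.example.com.
The problem appears when I run the following command from the Nagios server: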
/usr/lib64/nagios/plugins/check_ntp_time -H node-a-1 -v
And the output:
sending request to peer 0
response from peer 0: offset -0.002921819687
sending request to peer 0
response from peer 0: offset -0.0001939535141
sending request to peer 0
re-sending request to peer 0
re-sending request to peer 0
re-sending request to peer 0
re-sending request to peer 0
re-sending request to peer 0
re-sending request to peer 0
discarding peer 0: stratum=0
overall average offset: 0
NTP CRITICAL: Offset unknown|  
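This happens on all nodes except the local NTP server, which references the upstream servers. At first I thought it was an iptables problem, but every local NTP node has a pinhole for the port (to let Nagios in to check the time difference):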
ACCEPT     udp  --  eth0   *       10.0.0.0/18          0.0.0.0/0           multiport dports 123 /* 777 allow ntp access */ state NEW
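The restrict line in the server's ntp.conf (mask 255.255.192.0) and the iptables rule (10.0.0.0/18) describe the same network in two notations. A minimal Python sketch, using only the standard ipaddress module, can confirm that and check whether a given source address falls inside the range. The helper name in_allowed_range and the address 10.0.64.1 are illustrative assumptions, not taken from the setup above.

```python
import ipaddress

# Network as written in ntp.conf: "restrict 10.0.0.0 mask 255.255.192.0 ..."
restrict_net = ipaddress.ip_network("10.0.0.0/255.255.192.0")

# Network as written in the iptables rule: 10.0.0.0/18
iptables_net = ipaddress.ip_network("10.0.0.0/18")


def in_allowed_range(addr: str) -> bool:
    """Return True if addr is inside the subnet permitted by both rules."""
    return ipaddress.ip_address(addr) in restrict_net


if __name__ == "__main__":
    print("same network:", restrict_net == iptables_net)
    print("10.0.0.2 allowed:", in_allowed_range("10.0.0.2"))
    # Hypothetical address just outside the /18, for comparison.
    print("10.0.64.1 allowed:", in_allowed_range("10.0.64.1"))
```

If the Nagios server's own address were outside this range, both the firewall and ntpd's restrict list would treat its queries differently from those of the local nodes, so that is worth ruling out first.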
Versions:
nagios-plugins-ntp: 1.4.16
ntp: 4.2.6p5-1.el6.centos
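Any help is greatly appreciated. I really can't put Nagios into production until this is solved, because, as you know, keeping server time in sync is priority 1.
- Edit -
As requested in the comments, here are the results of ntpq -p on the various nodes: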
# Actual NTP Server (10.0.0.2)
==============================================================================
+propjet.latt.ne 241.199.164.101  2 u  105  128  337   14.578   12.954   7.138
+x2la01.hostigat 63.145.169.2     3 u   21  128  377   16.037   13.546   4.090
*pacific.latt.ne 241.199.164.101  2 u   72  128  377   15.148   24.434   7.403

# Local node 1
==============================================================================
*service-a-1.sn1 204.2.134.163    3 u    9  128  377    0.228    5.217   1.296

# Local node 2
==============================================================================
*service-a-1.sn1 204.2.134.163    3 u   91  128  377    0.200    3.608   1.167
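The peer lines above are easier to compare once split into fields. The sketch below assumes the standard ntpq -p column order (remote, refid, st, t, when, poll, reach, delay, offset, jitter), since the header row is missing from the paste, and it assumes each line is passed in unstripped so that the first character is either a tally code or a space. The function name parse_peer_line is made up for this illustration. Offsets are in milliseconds, and reach is an octal 8-bit register in which each set bit is one answered poll.

```python
TALLY_CODES = "x.-+#*o"  # leading status characters printed by ntpq -p


def parse_peer_line(line: str) -> dict:
    """Parse one peer line from `ntpq -p` output (header rows excluded)."""
    tally = ""
    if line and line[0] in TALLY_CODES:
        tally, line = line[0], line[1:]
    (remote, refid, stratum, _type, _when,
     _poll, reach, delay, offset, jitter) = line.split()
    reach_bits = int(reach, 8)  # reach is printed in octal
    return {
        "remote": remote,
        "refid": refid,
        "selected": tally == "*",          # '*' marks the current sync source
        "stratum": int(stratum),
        "polls_answered": bin(reach_bits).count("1"),  # out of the last 8
        "delay_ms": float(delay),
        "offset_ms": float(offset),
        "jitter_ms": float(jitter),
    }


if __name__ == "__main__":
    sample = "*service-a-1.sn1 204.2.134.163    3 u    9  128  377    0.228    5.217   1.296"
    print(parse_peer_line(sample))
```

Read this way, both local nodes have selected the local server as their sync source, see it at stratum 3, and have answered all of the last eight polls, so ntpd on the nodes itself looks healthy.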

The key here is this line:
discarding peer 0: stratum=0
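An NTP server that identifies itself as stratum 0 is violating the spec (that value is reserved for atomic clocks and the like). I ran into this a few years ago with some BSD and Mac OS X hosts. I ended up hacking the stratum check out of the plugin source and maintaining a separate build of the plugin for the "problem" hosts.
If you want to rip it out, the offending lines are 254-257 (currently, anyway). It's a hack, but it works for me ;-)
I found this thread in the mailing list archives. I think there was also one where I suggested adding a command-line option to ignore the stratum check, but I don't think it got any traction.
There is also a bug report about it, but as far as I can tell nothing useful came of it.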
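To make the stratum check concrete, here is a minimal Python sketch of how the stratum field is read from a reply and how a check of this kind rejects a peer. It is not the plugin's code (check_ntp_time is written in C). The names build_reply, parse_header and usable_peer are hypothetical, and the reply is built locally rather than received from a server. For reference, RFC 5905 defines stratum 0 in a packet as "unspecified or invalid" and uses it for kiss-o'-death replies, in which case the reference ID carries a four-character code.

```python
import struct

NTP_HEADER_LEN = 48  # NTP header without extension fields


def build_reply(stratum: int, leap: int = 0, version: int = 4, mode: int = 4) -> bytes:
    """Build a fake 48-byte server reply for illustration (no network I/O)."""
    first_byte = (leap << 6) | (version << 3) | mode
    # Bytes 0-3: LI/VN/mode, stratum, poll, precision; the rest is zeroed.
    return struct.pack("!BBbb", first_byte, stratum, 6, -20) + bytes(44)


def parse_header(packet: bytes) -> dict:
    """Extract the fields relevant to the stratum check."""
    if len(packet) < NTP_HEADER_LEN:
        raise ValueError("short NTP packet")
    first_byte, stratum = struct.unpack("!BB", packet[:2])
    return {
        "leap": first_byte >> 6,
        "version": (first_byte >> 3) & 0x7,
        "mode": first_byte & 0x7,
        "stratum": stratum,
        "refid": packet[12:16],  # kiss code when stratum is 0
    }


def usable_peer(packet: bytes) -> bool:
    """Stand-in for the plugin's behaviour: discard replies with stratum 0."""
    return parse_header(packet)["stratum"] != 0


if __name__ == "__main__":
    print("stratum 3 usable:", usable_peer(build_reply(3)))
    print("stratum 0 usable:", usable_peer(build_reply(0)))
```

Removing the check, as described above, amounts to having usable_peer accept every reply. Printing the refid of a stratum-0 reply before discarding it may show why the server answered that way.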

Source: https://serverfault.com/questions/625027
