Nagios NTP Time: UNKNOWN: Lookup failure for host $ARG1$ (us.pool.ntp.org)

May 21, 2018

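We currently use Nagios to monitor the production servers at my workplace. Our Nagios instance runs on a Linux server, from which we monitor both Linux and Windows machines.
For some time now, several of our Windows servers have been reporting the Nagios NTP time error shown in the title.
Below is the command that is giving me trouble: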
check_windows_time!us.pool.ntp.org!3000!6000
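It looks like $ARG1$ is 'us.pool.ntp.org'. What does "Lookup failure for host $ARG1$" mean? Are these servers having trouble resolving the NTP host (us.pool.ntp.org)? If so, why would some servers have trouble resolving that host while others do not? I use the same command on many other servers without any problem.
Note that every other check runs fine on the affected servers (disk space, CPU usage, RAM usage, and so on). Only the NTP command is giving me trouble.
I have configured NTP the same way on many other servers that do not show this problem, so I do not know what is causing it.
Has anyone run into a similar error before?
If you need any more information, let me know and I will be happy to clarify.
Thanks!
Edit 1: In case it helps, I can nslookup 'us.pool.ntp.org' from the affected servers, so the problem servers are able to resolve that DNS name.
Edit 2: NSC.ini 'check_windows_time' configuration: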
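To make the $ARG1$ mapping concrete, here is a minimal Python sketch of how a check_command value is split on "!" and dropped into a command template. The helper name expand_check_command is hypothetical and this is not the actual Nagios or NSClient++ code. It also assumes that a macro with no matching argument is left as literal text, which is what the script's own checks for $ARG2$ and $ARG3$ suggest can happen.

```python
import re


def expand_check_command(check_command: str, command_line: str) -> str:
    """Split 'name!arg1!arg2...' on '!' and fill $ARGn$ macros in command_line."""
    _name, *args = check_command.split("!")

    def fill(match: re.Match) -> str:
        index = int(match.group(1))
        # Assumption: a macro without a supplied argument stays as literal text.
        return args[index - 1] if 1 <= index <= len(args) else match.group(0)

    return re.sub(r"\$ARG(\d+)\$", fill, command_line)


template = "check_windows_time.bat $ARG1$ $ARG2$ $ARG3$"
print(expand_check_command("check_windows_time!us.pool.ntp.org!3000!6000", template))
print(expand_check_command("check_windows_time", template))
```

With the arguments supplied, the script would receive us.pool.ntp.org, 3000 and 6000. Without them, in this sketch, it receives the literal text $ARG1$ as its time server.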
check_windows_time=check_windows_time.bat $ARG1$ $ARG2$ $ARG3$
check_windows_time.bat:
@echo off
SETLOCAL
rem ***************************************************
rem Check_Windows_Time.bat
rem
rem Author: Michael van den Berg
rem Copyright 2012 - PCS-IT Services B.V. (www.pcs-it.nl)
rem
rem This Nagios plugin will check the time offset
rem against a specified time server.
rem ***************************************************

if [%1]==[] (goto usage) else (set time_server=%1)
if [%1]==[/?] (goto usage) else (set time_server=%1)
if [%2]==[] (set warn_offset=nul) else (set warn_offset=%2)
if [%2]==[$ARG2$] set warn_offset=nul
if [%3]==[] (set crit_offset=nul) else (set crit_offset=%3)
if [%3]==[$ARG3$] set crit_offset=nul

for /f "tokens=*" %%t in ('w32tm /stripchart /computer:%time_server% /samples:1 /dataonly') do set output=%%t

if not "x%output:0x80072af9=%"=="x%output%" goto host_error
if not "x%output:0x800705B4=%"=="x%output%" goto comm_error
if not "x%output:error=%"=="x%output%" goto unknown_error
if not "x%output:)=%"=="x%output%" goto unknown_error

set time_org=%output:*, =%
set time=%time_org:~1,-9%

if %warn_offset% == nul (set warn_perf=0) else (set warn_perf=%warn_offset%)
if %crit_offset% == nul (set crit_perf=0) else (set crit_perf=%crit_offset%)
set perf_data='Offset'=%time%s;%warn_perf%;%crit_perf%;0

if %time% geq %crit_offset% goto threshold_crit
if %time% geq %warn_offset% goto threshold_warn
if %time% lss %warn_offset% goto okay
goto unknown_error

:usage
echo %0 - Nagios plugin that checks time offset against a specified ntp server.
echo.
echo Usage:    %0 ^<timeserver^> ^<warning threshold in seconds^> ^<critical threshold in seconds^>
echo Examples: %0 us.pool.ntp.org 120 300
echo           %0 my-domain-controller.local 120 300
exit /b 3

:host_error
echo UNKNOWN: Lookup failure for host %time_server%
exit /b 3

:comm_error
echo UNKNOWN: Unable to query NTP service at %time_server% (Port 123 blocked/closed)
exit /b 3

:threshold_crit
echo CRITICAL: Time is %time_org% from %time_server%^|%perf_data%
exit /b 2

:threshold_warn
echo WARNING: Time is %time_org% from %time_server%^|%perf_data%
exit /b 1

:okay
echo OK: Time is %time_org% from %time_server%^|%perf_data%
exit /b 0

:unknown_error
echo UNKNOWN: Unable to check time (command error)
exit /b 3
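
The two set lines in the script that derive time_org and time are easy to misread, so here is a rough Python equivalent of just that step. The sample w32tm lines are assumptions about the output format, not captured output, and parse_offset is a hypothetical name.

```python
from typing import Tuple


def parse_offset(output: str) -> Tuple[str, str]:
    """Mimic the batch substring steps that extract the offset."""
    # set time_org=%output:*, =%  -> drop everything up to and including the first ", "
    _head, sep, tail = output.partition(", ")
    time_org = tail if sep else output  # batch leaves the string unchanged if ", " is absent
    # set time=%time_org:~1,-9%   -> skip the first character, drop the last nine
    seconds = time_org[1:-9]
    return time_org, seconds


# Assumed sample lines in the shape "hh:mm:ss, +ss.fffffffs"
print(parse_offset("14:22:05, +00.0115684s"))
print(parse_offset("14:22:05, -05.1234567s"))
```

In this sketch the sign and the fractional part are both discarded, so the value compared against the warning and critical thresholds (3000 and 6000 in the command above) is the whole number of seconds.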
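Edit 3: The error message I am getting appears to be the result of the following condition being met:
if not "x%output:0x80072af9=%"=="x%output%" goto host_error
Does anyone know what this means or how I can fix it?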
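
For readers unfamiliar with the idiom, the batch condition is a substring test: it removes the text 0x80072af9 from the output and jumps if the result differs from the original. A minimal Python sketch of the same chain of checks follows. The function names and the sample output lines are assumptions for illustration. The two codes themselves are standard Windows errors: 0x2AF9 is 11001 (WSAHOST_NOT_FOUND) and 0x5B4 is 1460 (ERROR_TIMEOUT).

```python
import re


def contains(output: str, needle: str) -> bool:
    """Mimic: if not "x%output:needle=%"=="x%output%" (true when needle occurs)."""
    # cmd's %var:search=% replacement matches the search text case-insensitively.
    stripped = re.sub(re.escape(needle), "", output, flags=re.IGNORECASE)
    return "x" + stripped != "x" + output


def classify(output: str) -> str:
    """Return the batch label the script would jump to, in the script's order."""
    if contains(output, "0x80072af9"):
        return "host_error"      # host name could not be resolved
    if contains(output, "0x800705B4"):
        return "comm_error"      # request timed out
    if contains(output, "error"):
        return "unknown_error"
    if contains(output, ")"):
        return "unknown_error"
    return "parse_offset"        # fall through to the offset parsing


# Assumed sample lines, not captured from a real server
print(classify("The following error occurred: No such host is known. (0x80072AF9)"))
print(classify("14:22:05, error: 0x800705B4"))
print(classify("14:22:05, +00.0115684s"))
```

So the host_error branch fires whenever w32tm itself reports that it could not resolve whatever name the script was given as its first argument.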

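I was finally able to make these NTP errors go away.
First, since we have Windows Firewall enabled, I unblocked the port needed to check NTP time (123) in the outbound connection settings. I found that this was a problem because I tried running my "check_windows_time.bat" file from the command line and got an error.
Shout-out to user "Sorcha" from the comments above for suggesting that I run this test.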
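
As an alternative to running the batch file by hand, the same reachability test can be sketched in Python with a bare SNTP request over UDP port 123. This is a rough sketch, not part of the original fix: the function names are hypothetical, and the offset ignores network round-trip delay.

```python
import socket
import struct
import time

NTP_UNIX_DELTA = 2208988800  # seconds between 1900-01-01 (NTP epoch) and 1970-01-01


def build_request() -> bytes:
    # First byte 0x1B = LI 0, version 3, mode 3 (client); the rest is zero.
    return b"\x1b" + 47 * b"\0"


def parse_transmit_time(packet: bytes) -> float:
    """Return the server's transmit timestamp (bytes 40-47) as Unix seconds."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_DELTA + fraction / 2**32


def query_offset(server: str, timeout: float = 5.0) -> float:
    """Rough offset in seconds (server clock minus local clock).

    Raises socket.gaierror if the name cannot be resolved and
    socket.timeout if no reply arrives on UDP port 123.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, 123))
        packet, _addr = sock.recvfrom(512)
    return parse_transmit_time(packet) - time.time()
```

Calling query_offset("us.pool.ntp.org") on an affected server would separate the two failure modes: a socket.gaierror points at name resolution, while a socket.timeout points at port 123 being blocked or unanswered.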
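Next, I compared the problematic NSC.ini with a version that I knew was working. There were some differences between the working .ini file and the one on the servers with the problem. I modified the problematic .ini file to match the working one and restarted the NSClient++ service.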
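
The comparison step can be automated with a small diff. The sketch below uses Python's difflib; the section and option names in the sample are made up, since the post does not say which lines actually differed, and diff_ini is a hypothetical helper. It assumes ';' starts a comment line in the ini file.

```python
import difflib
from typing import List


def diff_ini(working_text: str, broken_text: str) -> List[str]:
    """Unified diff of two ini files, ignoring blank lines and ';' comment lines."""

    def significant(text: str) -> List[str]:
        lines = (line.strip() for line in text.splitlines())
        return [line for line in lines if line and not line.startswith(";")]

    return list(
        difflib.unified_diff(
            significant(working_text),
            significant(broken_text),
            fromfile="working/NSC.ini",
            tofile="broken/NSC.ini",
            lineterm="",
        )
    )


# Placeholder content for illustration only
working = "[Section]\noption_a=1\n; a comment\n"
broken = "[Section]\noption_a=0\n"
for line in diff_ini(working, broken):
    print(line)
```

In practice the two strings would come from reading the NSC.ini of a healthy server and of an affected one.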
I also restarted Nagios. After a while, my errors cleared!
Thanks for your help.

Source: https://serverfault.com/questions/910025
