Monitoring MongoDB 3.2 with Stackdriver on Google Compute Engine fails silently
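As of August 28, 2016, I'm having trouble monitoring MongoDB 3.2 with Stackdriver.
/var/log/syslog doesn't mention anything about mongo. But if I make a configuration error in the .conf file, the agent complains, so I know it is loading the file correctly. So there is no error, yet nothing about mongo appears in /var/log/syslog, and https://app.google.stackdriver.com/services/mongodb claims I haven't installed the agent.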
gke-fatih-standard-fb894cbb-d7ue:/opt/stackdriver/collectd/etc$ sudo service stackdriver-agent restart
[....] Restarting Stackdriver metrics collection agent: stackdriver-agentoption = Interval; value = 60.000000;
Created new plugin context.
option = Interval; value = 60.000000;
Created new plugin context.
option = PIDFile; value = /var/run/stackdriver-agent.pid;
option = Interval; value = 60.000000;
Created new plugin context.
. ok
$ tail -F /var/log/syslog
Aug 28 06:53:01 gke-fatih-standard-fb894cbb-d7ue /USR/SBIN/CRON[21824]: (root) CMD (/etc/supervisor/supervisor_watcher.sh 2>&1 | logger)
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21844]: type = syslog, key = LogLevel, value = info
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21844]: write_gcm: inside module_register for stackdriver_agent/5.5.0-340.wheezy
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21845]: type = syslog, key = LogLevel, value = info
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21845]: write_gcm: inside module_register for stackdriver_agent/5.5.0-340.wheezy
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21846]: Initialization complete, entering read-loop.
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21846]: match_throttle_metadata_keys: 1 history entries, 1 distinct keys, 78 bytes server memory.
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21846]: tcpconns plugin: Reading from netlink succeeded. Will use the netlink method from now on.
Aug 28 06:53:03 gke-fatih-standard-fb894cbb-d7ue collectd[21846]: write_gcm: Asking metadata server for auth token
Aug 28 06:53:04 gke-fatih-standard-fb894cbb-d7ue collectd[21846]: match_throttle_metadata_keys: 2 history entries, 1025 distinct keys, 102801 bytes server memory.
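Rather than reading the log by eye, the "no mention of mongo" observation can be checked with a small filter. This is only a minimal sketch: the function name `collectd_lines_mentioning` is made up for illustration, and it assumes collectd entries carry the `collectd[PID]:` tag seen in the log above.

```python
import re

# Matches the "collectd[12345]:" tag that prefixes collectd entries in syslog.
COLLECTD_TAG = re.compile(r"\bcollectd\[\d+\]:")


def collectd_lines_mentioning(lines, keyword="mongo"):
    """Return collectd syslog lines that contain `keyword` (case-insensitive).

    `lines` is any iterable of strings, e.g. an open file object.
    """
    needle = keyword.lower()
    return [
        line.rstrip("\n")
        for line in lines
        if COLLECTD_TAG.search(line) and needle in line.lower()
    ]
```

To use it, open `/var/log/syslog` (reading it may need root) and pass the file object in. An empty result matches the symptom described here: collectd is running but logs nothing about the plugin.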
Note that the instance/node itself is monitored correctly; only MongoDB has the problem.
/opt/stackdriver/collectd/etc/collect.d/mongo0.conf:
# scheduled to node: gke-fatih-standard-fb894cbb-d7ue
# This is the monitoring configuration for MongoDB.
# Look for STATS_USER, STATS_PASS, MONGODB_HOST and MONGODB_PORT to adjust your configuration file.
LoadPlugin mongodb
<Plugin "mongodb">
    # When using non-standard MongoDB configurations, replace the below with
    #Host "MONGODB_HOST"
    #Port "MONGODB_PORT"
    # Must use the load balancer because we don't know the fixed nodePort
    Host "xxx"
    Port "27017"

    # If you restricted access to the database, you can set the username and
    # password here:
    User "stats"
    Password "xxx"
</Plugin>
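Since the plugin reports nothing at all, one sanity check is whether the `Host`/`Port` pair in the configuration above is reachable from the node. The sketch below only tests TCP reachability. It does not authenticate or speak the MongoDB protocol, and the helper name `can_connect` is an assumption made for illustration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    This checks network reachability only; it says nothing about whether
    the MongoDB credentials in the config are valid.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_connect("mongodb.example.internal", 27017)` with the real load balancer address substituted for the placeholder hostname. A `False` result would point at networking rather than at the agent.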
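Google is deprecating its Stackdriver integrations aimed at non-GCP software (such as Mongo) and moving to the BindPlane MIaaS platform as its supported integration platform for monitoring non-GCP data sources.
More details can be found here:
https://cloud.google.com/monitoring/agent/plugins/bindplane-transition
and here:
https://bluemedora.com/how-to-monitor-mongodb-bindplane-for-stackdriver-blue-medora/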
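After running sudo service stackdriver-agent restart once more (I had already done it before), and about 30 minutes after the original incident, Stackdriver now detects the metrics. So if you're sure everything is correct and there are no errors, you can try restarting stackdriver-agent several times and waiting about 30 minutes. The lack of anything mongo-related in /var/log/syslog is still a problem. I hope @Corey-Kosak can provide more information.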
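The restart-and-wait workaround can be sketched as a small loop. This is an illustration under assumptions, not a supported procedure: `metrics_visible` is a hypothetical callback you supply (for instance, one that asks you to check the Stackdriver page by hand), and the 30-minute default comes only from the single observation above.

```python
import subprocess
import time

# The restart command used in this thread.
RESTART_CMD = ["sudo", "service", "stackdriver-agent", "restart"]


def restart_and_wait(metrics_visible, attempts=3, wait_seconds=1800,
                     poll_seconds=60, run=subprocess.run, sleep=time.sleep):
    """Restart the agent up to `attempts` times, polling after each restart.

    `metrics_visible` is a caller-supplied function returning True once the
    MongoDB metrics show up. Returns the attempt number that succeeded, or
    None if the metrics never appeared.
    """
    for attempt in range(1, attempts + 1):
        run(RESTART_CMD, check=True)  # raises if the restart itself fails
        waited = 0
        while waited < wait_seconds:
            sleep(poll_seconds)
            waited += poll_seconds
            if metrics_visible():
                return attempt
    return None
```

The `run` and `sleep` parameters exist only so the loop can be exercised without touching the real service or actually waiting.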