Systemd

How to distinguish between a crash and a reboot on RHEL7?

  • September 20, 2016

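Is there a way to determine whether a RHEL7 server was rebooted via systemctl (or the reboot / shutdown aliases), or whether the server crashed? Pre-systemd, this was fairly easy to determine with last -x runlevel, but with RHEL7 it is not so clear.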

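There is more than one way to do this, but I'll cover the 4 best ones I can think of. (EDIT: I published a cleaned-up version of this answer as a public article on redhat.com. See: How to distinguish between a crash and a graceful reboot in RHEL 7.)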

(1) Audit logs

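auditd is amazing. You can see all the different event types it logs by checking ausearch -m. For the problem at hand, it logs both system shutdown and system boot, so you can use the command ausearch -i -m system_boot,system_shutdown | tail -4. If this reports a SYSTEM_SHUTDOWN followed by a SYSTEM_BOOT, all is well; however, if it reports 2 SYSTEM_BOOT lines in a row, then the system clearly did not shut down gracefully, as in the following example: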

[root@a72 ~]# ausearch -i -m system_boot,system_shutdown | tail -4
----
type=SYSTEM_BOOT msg=audit(09/20/2016 01:10:32.392:7) : pid=657 uid=root auid=unset ses=unset subj=system_u:system_r:init_t:s0 msg=' comm=systemd-update-utmp exe=/usr/lib/systemd/systemd-update-utmp hostname=? addr=? terminal=? res=success' 
----
type=SYSTEM_BOOT msg=audit(09/20/2016 01:11:41.134:7) : pid=656 uid=root auid=unset ses=unset subj=system_u:system_r:init_t:s0 msg=' comm=systemd-update-utmp exe=/usr/lib/systemd/systemd-update-utmp hostname=? addr=? terminal=? res=success' 
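
The "two SYSTEM_BOOT records in a row" check can also be scripted. What follows is only a minimal sketch: the function name classify_audit_boot is made up for illustration, and it assumes the interpreted (-i) ausearch format shown above, where each record line starts with type=SYSTEM_BOOT or type=SYSTEM_SHUTDOWN.

```shell
#!/bin/sh
# Hypothetical helper (not part of auditd): reads the output of
#   ausearch -i -m system_boot,system_shutdown
# on stdin and classifies the most recent boot.
classify_audit_boot() {
    awk '
        # Remember the last two event types seen, in log order.
        /^type=SYSTEM_(BOOT|SHUTDOWN) / { prev = cur; cur = substr($1, 6) }
        END {
            if (cur == "SYSTEM_BOOT" && prev == "SYSTEM_SHUTDOWN")
                print "graceful"
            else if (cur == "SYSTEM_BOOT" && prev == "SYSTEM_BOOT")
                print "crash"
            else
                print "unknown"   # e.g. only one record, or no records at all
        }'
}
```

On a live system it would be fed like this: ausearch -i -m system_boot,system_shutdown | classify_audit_boot. With the example output above it prints crash.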

(2) last -x

Same idea as above, but with the simple last -n2 -x shutdown reboot command. Example where the system crashed:

[root@a72 ~]# last -n2 -x shutdown reboot
reboot   system boot  3.10.0-327.el7.x Tue Sep 20 01:11 - 01:20  (00:08)    
reboot   system boot  3.10.0-327.el7.x Tue Sep 20 01:10 - 01:20  (00:09)    

Or where the system had a graceful reboot:

[root@a72 ~]# last -n2 -x shutdown reboot
reboot   system boot  3.10.0-327.el7.x Tue Sep 20 01:21 - 01:21  (00:00)    
shutdown system down  3.10.0-327.el7.x Tue Sep 20 01:21 - 01:21  (00:00)    
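
The same comparison can be automated. This is a rough sketch rather than anything official: classify_last_output is a hypothetical name, and it assumes the newest-first layout shown in the two examples above, where the first column is either reboot or shutdown.

```shell
#!/bin/sh
# Hypothetical helper: reads the output of
#   last -n2 -x shutdown reboot
# on stdin (newest record first) and classifies the most recent boot.
classify_last_output() {
    awk '
        # Collect only the reboot/shutdown pseudo-user records, in order.
        $1 == "reboot" || $1 == "shutdown" { n++; rec[n] = $1 }
        END {
            if (n < 2 || rec[1] != "reboot")
                print "unknown"    # not enough history to decide
            else if (rec[2] == "shutdown")
                print "graceful"   # a shutdown record precedes the current boot
            else
                print "crash"      # two boots with no shutdown in between
        }'
}
```

Usage would be last -n2 -x shutdown reboot | classify_last_output. The trailing "wtmp begins" line that last prints is ignored because its first column is neither reboot nor shutdown.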

(3) Create your own service unit

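IMHO this is the best approach, because you can tailor it to whatever you need. There are a million ways to do this. Here's one I just made up. This first service only gets executed on shutdown.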

[root@a72 ~]# cat /etc/systemd/system/set_gracefulshutdown.service
[Unit]
Description=Set flag for graceful shutdown
DefaultDependencies=no
RefuseManualStart=true
Before=shutdown.target

[Service]
Type=oneshot
ExecStart=/bin/touch /root/graceful_shutdown

[Install]
WantedBy=shutdown.target
[root@a72 ~]# systemctl enable set_gracefulshutdown.service 
Created symlink from /etc/systemd/system/shutdown.target.wants/set_gracefulshutdown.service to /etc/systemd/system/set_gracefulshutdown.service.

Then, when the system boots, this next service only starts if the file created by the shutdown service above exists.

[root@a72 ~]# cat /etc/systemd/system/check_graceful.service 
[Unit]
Description=Check if system booted after a graceful shutdown
ConditionPathExists=/root/graceful_shutdown
RefuseManualStart=true
RefuseManualStop=true

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/rm /root/graceful_shutdown

[Install]
WantedBy=multi-user.target
[root@a72 ~]# systemctl enable check_graceful
Created symlink from /etc/systemd/system/multi-user.target.wants/check_graceful.service to /etc/systemd/system/check_graceful.service.

So at any given time I can check whether the previous boot followed a graceful shutdown by running systemctl is-active check_graceful, e.g.:

[root@a72 ~]# systemctl is-active check_graceful && echo YAY || echo OH NOES
active
YAY
[root@a72 ~]# systemctl status check_graceful
● check_graceful.service - Check if system booted after a graceful shutdown
  Loaded: loaded (/etc/systemd/system/check_graceful.service; enabled; vendor preset: disabled)
  Active: active (exited) since Tue 2016-09-20 01:10:32 EDT; 20s ago
 Process: 669 ExecStart=/bin/rm /root/graceful_shutdown (code=exited, status=0/SUCCESS)
Main PID: 669 (code=exited, status=0/SUCCESS)
  CGroup: /system.slice/check_graceful.service

Sep 20 01:10:32 a72.example.com systemd[1]: Starting Check if system booted after a graceful shutdown...
Sep 20 01:10:32 a72.example.com systemd[1]: Started Check if system booted after a graceful shutdown.

Or here it is after an ungraceful shutdown:

[root@a72 ~]# systemctl is-active check_graceful && echo YAY || echo OH NOES
inactive
OH NOES
[root@a72 ~]# systemctl status check_graceful
● check_graceful.service - Check if system booted after a graceful shutdown
  Loaded: loaded (/etc/systemd/system/check_graceful.service; enabled; vendor preset: disabled)
  Active: inactive (dead)
Condition: start condition failed at Tue 2016-09-20 01:11:41 EDT; 16s ago
          ConditionPathExists=/root/graceful_shutdown was not met

Sep 20 01:11:41 a72.example.com systemd[1]: Started Check if system booted after a graceful shutdown.
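
If you want something to call from a login script or a monitoring check, the is-active test can be wrapped in a small function. This is just a sketch under a few assumptions: report_previous_shutdown and the SYSTEMCTL override variable are names invented here, and the unit is assumed to be called check_graceful as above. Keep in mind that the unit is also inactive before the first reboot after it was enabled, so "not graceful" only means the flag file was missing at boot.

```shell
#!/bin/sh
# Hypothetical wrapper around: systemctl is-active check_graceful
# SYSTEMCTL is an invented override hook so the function can be exercised
# without systemd; it defaults to the real systemctl.
report_previous_shutdown() {
    if "${SYSTEMCTL:-systemctl}" is-active --quiet check_graceful; then
        echo "previous shutdown: graceful"
    else
        # Flag file was missing at boot: crash, power loss, forced reset,
        # or the units were only just installed.
        echo "previous shutdown: NOT graceful"
        return 1
    fi
}
```

The non-zero return value in the ungraceful case makes it usable in a condition, for example: report_previous_shutdown || logger "unclean shutdown detected".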

(4) journalctl

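It's worth mentioning that if you configure systemd-journald to keep a persistent journal, you can use journalctl -b -1 -n to look at the last few lines (10 by default) of the previous boot (-b -2 is the boot before that, and so on). Example where the system rebooted gracefully: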

[root@a72 ~]# mkdir /var/log/journal
[root@a72 ~]# systemctl -s SIGUSR1 kill systemd-journald
[root@a72 ~]# reboot
...
[root@a72 ~]# journalctl -b -1 -n
-- Logs begin at Tue 2016-09-20 01:01:15 EDT, end at Tue 2016-09-20 01:21:33 EDT. --
Sep 20 01:21:19 a72.example.com systemd[1]: Stopped Create Static Device Nodes in /dev.
Sep 20 01:21:19 a72.example.com systemd[1]: Stopping Create Static Device Nodes in /dev...
Sep 20 01:21:19 a72.example.com systemd[1]: Reached target Shutdown.
Sep 20 01:21:19 a72.example.com systemd[1]: Starting Shutdown.
Sep 20 01:21:19 a72.example.com systemd[1]: Reached target Final Step.
Sep 20 01:21:19 a72.example.com systemd[1]: Starting Final Step.
Sep 20 01:21:19 a72.example.com systemd[1]: Starting Reboot...
Sep 20 01:21:19 a72.example.com systemd[1]: Shutting down.
Sep 20 01:21:19 a72.example.com systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Sep 20 01:21:19 a72.example.com systemd-journal[483]: Journal stopped

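If you get nice output like that, then clearly the system was shut down gracefully. That said, in my experience it's not super reliable when bad things happen (system crashes). Sometimes the boot indexing gets weird.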
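
For completeness, the "does the tail look like a clean shutdown" check could be scripted along these lines. Treat it as a sketch: journal_tail_looks_graceful is a made-up name, and the two strings it searches for are simply taken from the sample output above, so they may need adjusting on other systemd versions. Given the caveat about crashes, a non-match should be read as "not confirmed graceful" rather than as proof of a crash.

```shell
#!/bin/sh
# Hypothetical helper: reads the output of
#   journalctl -b -1 -n
# on stdin and succeeds (exit 0) if it contains the shutdown messages seen
# in the RHEL 7 sample above. The message texts are assumptions based on
# that sample only.
journal_tail_looks_graceful() {
    grep -q -e 'Reached target Shutdown' -e 'Journal stopped'
}
```

It would be used like this: journalctl -b -1 -n | journal_tail_looks_graceful && echo graceful || echo "not confirmed".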

Quoted from: https://serverfault.com/questions/789442