Windows-Server-2008

Intermittent Windows Server 2008 blue screens and reboots

  • May 23, 2019

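Our EC2 instance (Windows Server 2008) has crashed several times over the past 3 months (most recently today at 1:05 EST). After reviewing the MEMORY.DMP file, we noticed that the likely cause of the crash is rhelnet.sys (the RedHat PV NIC driver).

The server's Event Viewer shows the following entries immediately after the crash: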

Critical - Kernel Power:
The system has rebooted without cleanly shutting down first. 
This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.

BugCheck:
The computer has rebooted from a bugcheck.  The bugcheck was:
0x000000d1 (0x000000000000002d, 0x0000000000000002, 0x0000000000000000, 0xfffff88001402d14). 
A dump was saved in: C:\Windows\MEMORY.DMP. Report Id: 100113-35849-01.
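
For anyone reading the BugCheck entry above: stop code 0xD1 is DRIVER_IRQL_NOT_LESS_OR_EQUAL, and per Microsoft's bug check reference its four parameters are the memory address referenced, the IRQL, the access type, and the address of the code that made the reference. Below is a minimal Python sketch that pulls those values out of the event text. The function names `parse_bugcheck` and `describe_d1` are made up for illustration, and this only decodes the numbers; it does not replace opening MEMORY.DMP in a debugger to identify the faulting module.

```python
import re

# Parameter meanings for bug check 0xD1 (DRIVER_IRQL_NOT_LESS_OR_EQUAL).
# Only the access types I am sure of are listed; anything else is "unknown".
ACCESS_TYPES = {0: "read", 1: "write", 8: "execute"}


def parse_bugcheck(message):
    """Hypothetical helper: return (code, [p1, p2, p3, p4]) as integers."""
    match = re.search(r"(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", message)
    if match is None:
        raise ValueError("no bugcheck code found in message")
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


def describe_d1(params):
    """Hypothetical helper: label the four parameters of a 0xD1 bug check."""
    referenced, irql, access, caller = params
    return {
        "memory_referenced": hex(referenced),
        "irql": irql,
        "access": ACCESS_TYPES.get(access, "unknown"),
        "referencing_code_address": hex(caller),
    }


if __name__ == "__main__":
    text = (
        "The bugcheck was: 0x000000d1 (0x000000000000002d, "
        "0x0000000000000002, 0x0000000000000000, 0xfffff88001402d14)."
    )
    code, params = parse_bugcheck(text)
    print(hex(code), describe_d1(params))
```

Applied to the entry above, this reads as a read of address 0x2d at IRQL 2 from code at 0xfffff88001402d14. That last address is what a debugger would map to a loaded driver.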

Could this be a hardware problem? Would stopping and starting the instance help? Or is it more likely caused by software running on the system?

Update 10.01.2013

An Amazon representative suggested replacing the RedHat drivers on our instance with the Citrix PV drivers:

Upgrading PV Drivers

Update 10.08.2013

We performed the driver upgrade on a cloned instance. After the upgrade, we noticed the following error in Event Viewer:

Xennet6 errors in Event Viewer (Event ID# 5001)

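After digging deeper, I found this article, which suggests installing the latest Citrix drivers. Unfortunately, that did not help us at all, and our cloned instance became unresponsive.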

Update 10.08.2013 2

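I recreated the instance and updated the PV drivers again. After searching the internet, I found this post, in which an Amazon representative explains: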

"Event ID 5001 from source Xennet6 cannot be found" message does not 
indicate anything wrong, just that the PV driver is looking for a feature
that we have not implemented in our version of Xen. 

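I will let my test system run for a while and see whether it has any problems.

Upgrading the drivers as the Amazon representative suggested fixed the problem.

Regarding Event ID 5001, this is the reply I got from Amazon: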

Please ignore the Xennet 5001 error. This error occurs on every instance
that is launched with Citrix PV drivers and is due to the driver looking
for a feature that is not supported on EC2. It will have no other effect on the instance.

I have the same problem.

However, AWS Support answered me as follows; they are not sure that the Citrix PV drivers are the cause.

Currently, we are unable to root cause the issue.
In my personal opinion, this might be a one-time only occurrence,
but as you are running Citrix PV Drivers, I highly encourage you to upgrade.

As the Citrix drivers show up in the logs,
they might had been related to the issue.

Source: https://serverfault.com/questions/543075