Linux
Debian server reboots unexpectedly
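The server in my lab, which runs Debian Wheezy 7.8 (stable), keeps rebooting without any notice after running for a few hours. This has happened several times. The server is set up for fairly high-load numerical and parallel computation. I have printed the output of last reboot and the contents of /var/log/messages, but I find the log messages hard to understand. I tried to look at the entries in /var/log/messages from just before the reboot time, but the log seems to contain only entries written after the reboot. I browsed around and found that other people have had the same problem, but the causes seem to differ from case to case, and /var/log/messages seems to be the key to investigating it.
What does /var/log/messages actually say about these unwanted reboots? And how can a beginner start learning to read this log? I mean, are there any important keywords to look for? Thanks for any help.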
last reboot
reboot system boot 3.2.0-4-amd64 Wed May 20 03:29 - 12:43 (09:14)
reboot system boot 3.2.0-4-amd64 Tue May 19 16:01 - 12:43 (20:42)
/var/log/messages
May 18 07:35:01 labserver rsyslogd: [origin software="rsyslogd" swVersion="5.8.11" x-pid="2400" x-info="http://www.rsyslog.com"] rsyslogd was HUPed May 19 07:35:01 labserver rsyslogd: [origin software="rsyslogd" swVersion="5.8.11" x-pid="2400" x-info="http://www.rsyslog.com"] rsyslogd was HUPed May 19 16:01:19 labserver kernel: imklog 5.8.11, log source = /proc/kmsg started. May 19 16:01:19 labserver rsyslogd: [origin software="rsyslogd" swVersion="5.8.11" x-pid="2401" x-info="http://www.rsyslog.com"] start May 19 16:01:19 labserver kernel: [ 0.000000] Initializing cgroup subsys cpuset May 19 16:01:19 labserver kernel: [ 0.000000] Initializing cgroup subsys cpu May 19 16:01:19 labserver kernel: [ 0.000000] Linux version 3.2.0-4-amd64 (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.2.65-1+deb7u2 May 19 16:01:19 labserver kernel: [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.2.0-4-amd64 root=UUID=1fc245ac-9058-4208-862a-7f4e8e1b20b2 ro text May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-provided physical RAM map: May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009ac00 (usable) May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved) May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved) May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-e820: 0000000000100000 - 000000007df71000 (usable) May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-e820: 000000007df71000 - 000000007e0f1000 (reserved) May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-e820: 000000007e0f1000 - 000000007e2ec000 (ACPI NVS) May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-e820: 000000007e2ec000 - 000000007f367000 (reserved) May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-e820: 000000007f367000 - 000000007f800000 (ACPI NVS) May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-e820: 0000000080000000 - 0000000090000000 
(reserved) May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-e820: 00000000fed1c000 - 00000000fed40000 (reserved) May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved) May 19 16:01:19 labserver kernel: [ 0.000000] BIOS-e820: 0000000100000000 - 0000000880000000 (usable) May 19 16:01:19 labserver kernel: [ 0.000000] NX (Execute Disable) protection: active May 19 16:01:19 labserver kernel: [ 0.000000] SMBIOS 2.7 present. May 19 16:01:19 labserver kernel: [ 0.000000] No AGP bridge found May 19 16:01:19 labserver kernel: [ 0.000000] last_pfn = 0x880000 max_arch_pfn = 0x400000000 May 19 16:01:19 labserver kernel: [ 0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106 May 19 16:01:19 labserver kernel: [ 0.000000] last_pfn = 0x7df71 max_arch_pfn = 0x400000000 May 19 16:01:19 labserver kernel: [ 0.000000] found SMP MP-table at [ffff8800000fd900] fd900 May 19 16:01:19 labserver kernel: [ 0.000000] Using GB pages for direct mapping May 19 16:01:19 labserver kernel: [ 0.000000] init_memory_mapping: 0000000000000000-000000007df71000 May 19 16:01:19 labserver kernel: [ 0.000000] init_memory_mapping: 0000000100000000-0000000880000000 May 19 16:01:19 labserver kernel: [ 0.000000] RAMDISK: 36bea000 - 375ed000 May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: RSDP 00000000000f04a0 00024 (v02 ALASKA) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: XSDT 000000007e204088 0008C (v01 ALASKA A M I 01072009 AMI 00010013) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: FACP 000000007e211040 0010C (v05 ALASKA A M I 01072009 AMI 00010013) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI Warning: FADT (revision 5) is longer than ACPI 2.0 version, truncating length 268 to 244 (20110623/tbfadt-288) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: DSDT 000000007e2041a8 0CE96 (v02 ALASKA A M I 00000015 INTL 20051117) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: FACS 000000007e2e3080 00040 May 19 16:01:19 
labserver kernel: [ 0.000000] ACPI: APIC 000000007e211150 00100 (v03 ALASKA A M I 01072009 AMI 00010013) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: FPDT 000000007e211250 00044 (v01 ALASKA A M I 01072009 AMI 00010013) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: MCFG 000000007e211298 0003C (v01 ALASKA OEMMCFG. 01072009 MSFT 00000097) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: HPET 000000007e2112d8 00038 (v01 ALASKA A M I 01072009 AMI. 00000005) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: PRAD 000000007e211310 000BE (v02 PRADID PRADTID 00000001 MSFT 03000001) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: SPMI 000000007e2113d0 00040 (v05 A M I OEMSPMI 00000000 AMI. 00000000) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: SSDT 000000007e211410 D0CB0 (v02 INTEL CpuPm 00004000 INTL 20051117) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: EINJ 000000007e2e20c0 00130 (v01 AMI AMI EINJ 00000000 00000000) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: ERST 000000007e2e21f0 00230 (v01 AMIER AMI ERST 00000000 00000000) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: HEST 000000007e2e2420 000A8 (v01 AMI AMI HEST 00000000 00000000) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: BERT 000000007e2e24c8 00030 (v01 AMI AMI BERT 00000000 00000000) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: DMAR 000000007e2e24f8 000C4 (v01 A M I OEMDMAR 00000001 INTL 00000001) May 19 16:01:19 labserver kernel: [ 0.000000] No NUMA configuration found May 19 16:01:19 labserver kernel: [ 0.000000] Faking a node at 0000000000000000-0000000880000000 May 19 16:01:19 labserver kernel: [ 0.000000] Initmem setup node 0 0000000000000000-0000000880000000 May 19 16:01:19 labserver kernel: [ 0.000000] NODE_DATA [000000087fffb000 - 000000087fffffff] May 19 16:01:19 labserver kernel: [ 0.000000] Zone PFN ranges: May 19 16:01:19 labserver kernel: [ 0.000000] DMA 0x00000010 -> 0x00001000 May 19 16:01:19 labserver kernel: [ 0.000000] DMA32 
0x00001000 -> 0x00100000 May 19 16:01:19 labserver kernel: [ 0.000000] Normal 0x00100000 -> 0x00880000 May 19 16:01:19 labserver kernel: [ 0.000000] Movable zone start PFN for each node May 19 16:01:19 labserver kernel: [ 0.000000] early_node_map[3] active PFN ranges May 19 16:01:19 labserver kernel: [ 0.000000] 0: 0x00000010 -> 0x0000009a May 19 16:01:19 labserver kernel: [ 0.000000] 0: 0x00000100 -> 0x0007df71 May 19 16:01:19 labserver kernel: [ 0.000000] 0: 0x00100000 -> 0x00880000 May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: PM-Timer IO Port: 0x408 May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] enabled) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] enabled) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x08] enabled) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x0a] enabled) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] enabled) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] enabled) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x09] lapic_id[0x09] enabled) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x0b] enabled) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1]) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1]) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC_NMI 
(acpi_id[0x04] high edge lint[0x1]) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1]) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x08] high edge lint[0x1]) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x0a] high edge lint[0x1]) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1]) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1]) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1]) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1]) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x09] high edge lint[0x1]) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x0b] high edge lint[0x1]) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: IOAPIC (id[0x00] address[0xfec00000] gsi_base[0]) May 19 16:01:19 labserver kernel: [ 0.000000] IOAPIC[0]: apic_id 0, version 32, address 0xfec00000, GSI 0-23 May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec01000] gsi_base[24]) May 19 16:01:19 labserver kernel: [ 0.000000] IOAPIC[1]: apic_id 2, version 32, address 0xfec01000, GSI 24-47 May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) May 19 16:01:19 labserver kernel: [ 0.000000] Using ACPI (MADT) for SMP configuration information May 19 16:01:19 labserver kernel: [ 0.000000] ACPI: HPET id: 0x8086a701 base: 0xfed00000 May 19 16:01:19 labserver kernel: [ 0.000000] SMP: Allowing 12 CPUs, 0 hotplug CPUs May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 000000000009a000 - 000000000009b000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered 
nosave memory: 000000000009b000 - 00000000000a0000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 00000000000e0000 - 0000000000100000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 000000007df71000 - 000000007e0f1000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 000000007e0f1000 - 000000007e2ec000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 000000007e2ec000 - 000000007f367000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 000000007f367000 - 000000007f800000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 000000007f800000 - 0000000080000000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 0000000080000000 - 0000000090000000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 0000000090000000 - 00000000fed1c000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 00000000fed1c000 - 00000000fed40000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 00000000fed40000 - 00000000ff000000 May 19 16:01:19 labserver kernel: [ 0.000000] PM: Registered nosave memory: 00000000ff000000 - 0000000100000000 May 19 16:01:19 labserver kernel: [ 0.000000] Allocating PCI resources starting at 90000000 (gap: 90000000:6ed1c000) May 19 16:01:19 labserver kernel: [ 0.000000] Booting paravirtualized kernel on bare hardware May 19 16:01:19 labserver kernel: [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:12 nr_node_ids:1 May 19 16:01:19 labserver kernel: [ 0.000000] PERCPU: Embedded 27 pages/cpu @ffff88087fc00000 s78848 r8192 d23552 u131072 May 19 16:01:19 labserver kernel: [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. 
Total pages: 8258294 May 19 16:01:19 labserver kernel: [ 0.000000] Policy zone: Normal May 19 16:01:19 labserver kernel: [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.2.0-4-amd64 root=UUID=1fc245ac-9058-4208-862a-7f4e8e1b20b2 ro text May 19 16:01:19 labserver kernel: [ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes) May 19 16:01:19 labserver kernel: [ 0.000000] xsave/xrstor: enabled xstate_bv 0x7, cntxt size 0x340 May 19 16:01:19 labserver kernel: [ 0.000000] Checking aperture... May 19 16:01:19 labserver kernel: [ 0.000000] No AGP bridge found May 19 16:01:19 labserver kernel: [ 0.000000] Memory: 32975732k/35651584k available (3434k kernel code, 2130964k absent, 544888k reserved, 3305k data, 576k init) May 19 16:01:19 labserver kernel: [ 0.000000] Hierarchical RCU implementation. May 19 16:01:19 labserver kernel: [ 0.000000] RCU dyntick-idle grace-period acceleration is enabled. May 19 16:01:19 labserver kernel: [ 0.000000] NR_IRQS:33024 nr_irqs:1184 16 May 19 16:01:19 labserver kernel: [ 0.000000] Extended CMOS year: 2000 May 19 16:01:19 labserver kernel: [ 0.000000] Console: colour VGA+ 80x25 May 19 16:01:19 labserver kernel: [ 0.000000] console [tty0] enabled May 19 16:01:19 labserver kernel: [ 0.000000] Fast TSC calibration using PIT May 19 16:01:19 labserver kernel: [ 0.004000] Detected 2100.074 MHz processor. May 19 16:01:19 labserver kernel: [ 0.000003] Calibrating delay loop (skipped), value calculated using timer frequency.. 
4200.14 BogoMIPS (lpj=8400296) May 19 16:01:19 labserver kernel: [ 0.000144] pid_max: default: 32768 minimum: 301 May 19 16:01:19 labserver kernel: [ 0.000253] Security Framework initialized May 19 16:01:19 labserver kernel: [ 0.000324] AppArmor: AppArmor disabled by boot time parameter May 19 16:01:19 labserver kernel: [ 0.002355] Dentry cache hash table entries: 4194304 (order: 13, 33554432 bytes) May 19 16:01:19 labserver kernel: [ 0.011585] Inode-cache hash table entries: 2097152 (order: 12, 16777216 bytes) May 19 16:01:19 labserver kernel: [ 0.015724] Mount-cache hash table entries: 256 May 19 16:01:19 labserver kernel: [ 0.015915] Initializing cgroup subsys cpuacct May 19 16:01:19 labserver kernel: [ 0.015986] Initializing cgroup subsys memory May 19 16:01:19 labserver kernel: [ 0.016063] Initializing cgroup subsys devices May 19 16:01:19 labserver kernel: [ 0.016133] Initializing cgroup subsys freezer May 19 16:01:19 labserver kernel: [ 0.016201] Initializing cgroup subsys net_cls May 19 16:01:19 labserver kernel: [ 0.016270] Initializing cgroup subsys blkio May 19 16:01:19 labserver kernel: [ 0.016344] Initializing cgroup subsys perf_event May 19 16:01:19 labserver kernel: [ 0.016441] CPU: Physical Processor ID: 0 May 19 16:01:19 labserver kernel: [ 0.016509] CPU: Processor Core ID: 0 May 19 16:01:19 labserver kernel: [ 0.017564] mce: CPU supports 23 MCE banks May 19 16:01:19 labserver kernel: [ 0.017670] CPU0: Thermal monitoring enabled (TM1) May 19 16:01:19 labserver kernel: [ 0.017768] using mwait in idle threads. 
May 19 16:01:19 labserver kernel: [ 0.018315] ACPI: Core revision 20110623 May 19 16:01:19 labserver kernel: [ 0.049889] DMAR: Host address width 46 May 19 16:01:19 labserver kernel: [ 0.049958] DMAR: DRHD base: 0x000000fbffc000 flags: 0x1 May 19 16:01:19 labserver kernel: [ 0.050034] IOMMU 0: reg_base_addr fbffc000 ver 1:0 cap d2078c106f0466 ecap f020de May 19 16:01:19 labserver kernel: [ 0.050122] DMAR: RMRR base: 0x0000007f239000 end: 0x0000007f247fff May 19 16:01:19 labserver kernel: [ 0.050195] DMAR: ATSR flags: 0x0 May 19 16:01:19 labserver kernel: [ 0.050261] DMAR: RHSA base: 0x000000fbffc000 proximity domain: 0x0 May 19 16:01:19 labserver kernel: [ 0.050427] IOAPIC id 0 under DRHD base 0xfbffc000 IOMMU 0 May 19 16:01:19 labserver kernel: [ 0.050497] IOAPIC id 2 under DRHD base 0xfbffc000 IOMMU 0 May 19 16:01:19 labserver kernel: [ 0.050568] HPET id 0 under DRHD base 0xfbffc000 May 19 16:01:19 labserver kernel: [ 0.050741] Enabled IRQ remapping in x2apic mode May 19 16:01:19 labserver kernel: [ 0.050810] Enabling x2apic May 19 16:01:19 labserver kernel: [ 0.050875] Enabled x2apic May 19 16:01:19 labserver kernel: [ 0.050943] Switched APIC routing to cluster x2apic. May 19 16:01:19 labserver kernel: [ 0.051552] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 May 19 16:01:19 labserver kernel: [ 0.091256] CPU0: Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz stepping 04 May 19 16:01:19 labserver kernel: [ 0.195570] Performance Events: PEBS fmt1+, generic architected perfmon, Intel PMU driver. May 19 16:01:19 labserver kernel: [ 0.195802] ... version: 3 May 19 16:01:19 labserver kernel: [ 0.195869] ... bit width: 48 May 19 16:01:19 labserver kernel: [ 0.195936] ... generic registers: 4 May 19 16:01:19 labserver kernel: [ 0.196003] ... value mask: 0000ffffffffffff May 19 16:01:19 labserver kernel: [ 0.196073] ... max period: 000000007fffffff May 19 16:01:19 labserver kernel: [ 0.196143] ... fixed-purpose events: 3 May 19 16:01:19 labserver kernel: [ 0.196210] ... 
event mask: 000000070000000f May 19 16:01:19 labserver kernel: [ 0.196468] NMI watchdog enabled, takes one hw-pmu counter. May 19 16:01:19 labserver kernel: [ 0.196637] Booting Node 0, Processors #1 May 19 16:01:19 labserver kernel: [ 0.312587] NMI watchdog enabled, takes one hw-pmu counter. May 19 16:01:19 labserver kernel: [ 0.312765] #2 May 19 16:01:19 labserver kernel: [ 0.424400] NMI watchdog enabled, takes one hw-pmu counter. May 19 16:01:19 labserver kernel: [ 0.424578] #3 May 19 16:01:19 labserver kernel: [ 0.536316] NMI watchdog enabled, takes one hw-pmu counter. May 19 16:01:19 labserver kernel: [ 0.536489] #4 May 19 16:01:19 labserver kernel: [ 0.648124] NMI watchdog enabled, takes one hw-pmu counter. May 19 16:01:19 labserver kernel: [ 0.648303] #5 May 19 16:01:19 labserver kernel: [ 0.759941] NMI watchdog enabled, takes one hw-pmu counter. May 19 16:01:19 labserver kernel: [ 0.760115] #6 May 19 16:01:19 labserver kernel: [ 0.871864] NMI watchdog enabled, takes one hw-pmu counter. May 19 16:01:19 labserver kernel: [ 0.872050] #7 May 19 16:01:19 labserver kernel: [ 0.983690] NMI watchdog enabled, takes one hw-pmu counter. May 19 16:01:19 labserver kernel: [ 0.983866] #8 May 19 16:01:19 labserver kernel: [ 1.095600] NMI watchdog enabled, takes one hw-pmu counter. May 19 16:01:19 labserver kernel: [ 1.095774] #9 May 19 16:01:19 labserver kernel: [ 1.207414] NMI watchdog enabled, takes one hw-pmu counter. May 19 16:01:19 labserver kernel: [ 1.207589] #10 May 19 16:01:19 labserver kernel: [ 1.319223] NMI watchdog enabled, takes one hw-pmu counter. May 19 16:01:19 labserver kernel: [ 1.319400] #11 Ok. May 19 16:01:19 labserver kernel: [ 1.431095] NMI watchdog enabled, takes one hw-pmu counter. May 19 16:01:19 labserver kernel: [ 1.431192] Brought up 12 CPUs May 19 16:01:19 labserver kernel: [ 1.431260] Total of 12 processors activated (50398.84 BogoMIPS). 
May 19 16:01:19 labserver kernel: [ 1.450786] devtmpfs: initialized May 19 16:01:19 labserver kernel: [ 1.455360] PM: Registering ACPI NVS region at 7e0f1000 (2076672 bytes) May 19 16:01:19 labserver kernel: [ 1.455494] PM: Registering ACPI NVS region at 7f367000 (4820992 bytes) May 19 16:01:19 labserver kernel: [ 1.455843] print_constraints: dummy: May 19 16:01:19 labserver kernel: [ 1.455977] NET: Registered protocol family 16 May 19 16:01:19 labserver kernel: [ 1.456140] ACPI: bus type pci registered May 19 16:01:19 labserver kernel: [ 1.456268] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000) May 19 16:01:19 labserver kernel: [ 1.456361] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820 May 19 16:01:19 labserver kernel: [ 1.466673] PCI: Using configuration type 1 for base access May 19 16:01:19 labserver kernel: [ 1.468173] bio: create slab <bio-0> at 0 May 19 16:01:19 labserver kernel: [ 1.468353] ACPI: Added _OSI(Module Device) May 19 16:01:19 labserver kernel: [ 1.468422] ACPI: Added _OSI(Processor Device) May 19 16:01:19 labserver kernel: [ 1.468491] ACPI: Added _OSI(3.0 _SCP Extensions) May 19 16:01:19 labserver kernel: [ 1.468560] ACPI: Added _OSI(Processor Aggregator Device) May 19 16:01:19 labserver kernel: [ 1.484562] ACPI: Executed 1 blocks of module-level executable AML code May 19 16:01:19 labserver kernel: [ 1.727818] ACPI: Interpreter enabled May 19 16:01:19 labserver kernel: [ 1.727891] ACPI: (supports S0 S1 S4 S5) May 19 16:01:19 labserver kernel: [ 1.728159] ACPI: Using IOAPIC for interrupt routing May 19 16:01:19 labserver kernel: [ 1.736531] ACPI: No dock devices found. May 19 16:01:19 labserver kernel: [ 1.736630] HEST: Table parsing has been initialized. 
May 19 16:01:19 labserver kernel: [ 1.736704] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug May 19 16:01:19 labserver kernel: [ 1.737041] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-fe]) May 19 16:01:19 labserver kernel: [ 1.737361] pci_root PNP0A08:00: host bridge window [io 0x0000-0x03af] May 19 16:01:19 labserver kernel: [ 1.737435] pci_root PNP0A08:00: host bridge window [io 0x03e0-0x0cf7] May 19 16:01:19 labserver kernel: [ 1.737508] pci_root PNP0A08:00: host bridge window [io 0x03b0-0x03df] May 19 16:01:19 labserver kernel: [ 1.737586] pci_root PNP0A08:00: host bridge window [io 0x0d00-0xffff] May 19 16:01:19 labserver kernel: [ 1.737659] pci_root PNP0A08:00: host bridge window [mem 0x000a0000-0x000bffff] May 19 16:01:19 labserver kernel: [ 1.737747] pci_root PNP0A08:00: host bridge window [mem 0x000c0000-0x000dffff] May 19 16:01:19 labserver kernel: [ 1.737834] pci_root PNP0A08:00: host bridge window [mem 0xfed0e000-0xfed0ffff] May 19 16:01:19 labserver kernel: [ 1.737922] pci_root PNP0A08:00: host bridge window [mem 0x80000000-0xfbffffff] May 19 16:01:19 labserver kernel: [ 1.740791] pci 0000:00:01.0: PCI bridge to [bus 01-01] May 19 16:01:19 labserver kernel: [ 1.745575] pci 0000:00:01.1: PCI bridge to [bus 02-03] May 19 16:01:19 labserver kernel: [ 1.745700] pci 0000:00:02.0: PCI bridge to [bus 04-04] May 19 16:01:19 labserver kernel: [ 1.745816] pci 0000:00:03.0: PCI bridge to [bus 05-05] May 19 16:01:19 labserver kernel: [ 1.745933] pci 0000:00:03.2: PCI bridge to [bus 06-06] May 19 16:01:19 labserver kernel: [ 1.746285] pci 0000:00:11.0: PCI bridge to [bus 07-07] May 19 16:01:19 labserver kernel: [ 1.746541] pci 0000:00:1e.0: PCI bridge to [bus 08-08] (subtractive decode) May 19 16:01:19 labserver kernel: [ 1.747170] pci0000:00: Requesting ACPI _OSC control (0x1d) May 19 16:01:19 labserver kernel: [ 1.747465] pci0000:00: ACPI _OSC control (0x15) granted May 19 16:01:19 labserver kernel: [ 1.756901] ACPI: 
PCI Root Bridge [UNC0] (domain 0000 [bus ff]) May 19 16:01:19 labserver kernel: [ 1.758443] pci0000:ff: Requesting ACPI _OSC control (0x1d) May 19 16:01:19 labserver kernel: [ 1.758528] pci0000:ff: ACPI _OSC control (0x1d) granted May 19 16:01:19 labserver kernel: [ 1.759439] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12 14 15) May 19 16:01:19 labserver kernel: [ 1.760105] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *10 11 12 14 15) May 19 16:01:19 labserver kernel: [ 1.760768] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 6 10 11 12 14 15) May 19 16:01:19 labserver kernel: [ 1.761383] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 10 *11 12 14 15) May 19 16:01:19 labserver kernel: [ 1.762006] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0 May 19 16:01:19 labserver kernel: [ 1.762729] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0 May 19 16:01:19 labserver kernel: [ 1.763450] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0 May 19 16:01:19 labserver kernel: [ 1.764170] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 *7 10 11 12 14 15)
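You will need to provide more information, especially the log entries from just before the system restarted. From what I can see, though, the log probably will not tell you much more. Check the other logs as well, such as syslog.
In my experience, a sudden reboot with no indication of what went wrong is most commonly hardware-related. Otherwise the kernel would have had a chance to write something to the log that gives a clue.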
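As a starting point for reading the log, the following is a minimal Python sketch that pulls out the last few lines written before each boot and flags lines containing a few common trouble keywords. The function names are hypothetical, the boot marker is simply the "imklog ... started" line seen in the question's log (other rsyslog versions may write something different), and the keyword list is illustrative rather than exhaustive.

```python
import re

# Boot marker taken from the log in the question (rsyslog 5.8.11 writes this
# line once when it starts). This is an assumption about this setup only.
BOOT_MARKER = re.compile(r"kernel: imklog .* started")

# Illustrative keywords only; an empty result does not prove the system is healthy.
KEYWORDS = re.compile(
    r"panic|oops|machine check|hardware error|out of memory|"
    r"temperature above threshold",
    re.IGNORECASE,
)


def last_lines_before_boot(lines, context=5):
    """For each boot marker, return the `context` lines that precede it."""
    results = []
    for i, line in enumerate(lines):
        if BOOT_MARKER.search(line):
            results.append(lines[max(0, i - context):i])
    return results


def find_suspicious(lines):
    """Return the lines that contain any of the illustrative keywords."""
    return [line for line in lines if KEYWORDS.search(line)]


if __name__ == "__main__":
    # Reading this file usually requires root or membership in the adm group.
    with open("/var/log/messages", errors="replace") as f:
        log = f.read().splitlines()
    for n, block in enumerate(last_lines_before_boot(log), start=1):
        print(f"--- last lines before boot #{n} ---")
        print("\n".join(block))
    print("--- lines matching keywords ---")
    print("\n".join(find_suspicious(log)))
```

In the question's log, the last entry before the 16:01:19 boot is a routine rsyslogd HUP at 07:35:01, so nothing was logged in the hours before the reboot. A gap like that is what you would expect if the machine lost power or reset without the kernel getting a chance to write anything.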
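Some common causes of sudden reboots:
- Overheating, probably the primary cause. Find out the temperature and try to log it. Does the server have a display that can show the temperature? Is the room properly cooled? Perhaps replace the thermal paste under the heat sink covering the CPU.
- Bad hardware or a bad driver. Get a list of the hardware, for example with "lspci". A bad DIMM can cause the system to suddenly hang and/or reboot (reseat the DIMMs, CPUs and cards). I recall a server that restarted occasionally because of a problem with an Intel Ethernet card. Sometimes a bad disk will also cause this kind of problem, though usually it only causes the system to hang rather than reboot.
- A bad UPS. I recall a battery-backed UPS that was slowly going bad, and one of the signs was that the server connected to it was power-cycled on a regular weekly basis. You might also simply have a misconfigured power-cycle schedule.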
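To follow the suggestion of logging the temperature, here is a minimal Python sketch that periodically appends readings to a file. It assumes the kernel exposes thermal zones under /sys/class/thermal with values in millidegrees Celsius. That may not be the case on this server, in which case a tool such as lm-sensors would be the usual alternative. The function names, log file name and interval are hypothetical.

```python
import glob
import os
import time


def read_temperatures(base="/sys/class/thermal"):
    """Return {zone_name: degrees_celsius} for every readable thermal zone."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(base, "thermal_zone*", "temp"))):
        zone = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as f:
                # sysfs reports millidegrees Celsius, e.g. 45000 for 45.0 C
                temps[zone] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # skip zones that cannot be read or parsed
    return temps


def log_forever(logfile="temperature.log", interval=60):
    """Append one timestamped reading per `interval` seconds until interrupted."""
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        reading = " ".join(
            f"{zone}={temp:.1f}C" for zone, temp in read_temperatures().items()
        )
        with open(logfile, "a") as out:
            out.write(f"{stamp} {reading}\n")
            out.flush()
            # Force the line to disk so the last reading is more likely to
            # survive a sudden reboot.
            os.fsync(out.fileno())
        time.sleep(interval)


if __name__ == "__main__":
    log_forever()
```

If the last readings before each reboot climb steadily under load, overheating becomes a more likely suspect. Flat, normal readings point instead toward the other causes listed above.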