CentOS
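Unable to start kdump
My system keeps crashing, so I decided to enable kdump to investigate, because I cannot find any likely errors in the log files.
I followed the steps from the site here to set up kdump. My server runs CentOS 5.8 with 16 GB of RAM. These are the steps I took to configure kdump: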
1. Install kexec-tools with `yum install kexec-tools` and follow the installation steps.
2. Edit /boot/grub/grub.conf to configure the memory reserved for kdump.
3. Edit /etc/kdump.conf to set the target path to /var/crash/ and configure core_collector.
4. Enable kdump with `chkconfig kdump on`.
5. Reboot the server.
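When I run `service kdump status`, it says "Kdump is not operational". What should I do to get kdump running? Am I missing something in the configuration? I have included the contents of /boot/grub/grub.conf and /etc/kdump.conf below.
Here is the content of /boot/grub/grub.conf: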
    # grub.conf generated by anaconda
    #
    # Note that you do not have to rerun grub after making changes to this file
    # NOTICE: You have a /boot partition. This means that
    #         all kernel and initrd paths are relative to /boot/, eg.
    #         root (hd0,0)
    #         kernel /vmlinuz-version ro root=/dev/sda3
    #         initrd /initrd-version.img
    #boot=/dev/sda
    default=0
    timeout=5
    splashimage=(hd0,0)/grub/splash.xpm.gz
    hiddenmenu
    title CentOS (2.6.18-308.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-308.el5 ro root=LABEL=/ crashkernel=128M
        initrd /initrd-2.6.18-308.el5.img
Here is the content of /etc/kdump.conf:
    # Configures where to put the kdump /proc/vmcore files
    #
    # This file contains a series of commands to perform (in order) when a
    # kernel crash has happened and the kdump kernel has been loaded. Directives in
    # this file are only applicable to the kdump initramfs, and have no effect if
    # the root filesystem is mounted and the normal init scripts are processed
    #
    # Currently only one dump target and path may be configured at once
    # if the configured dump target fails, the default action will be preformed
    # the default action may be configured with the default directive below. If the
    # configured dump target succedes
    #
    # For filesystem based dump, it's recommended to use UUID and LABEL
    # instead of device name in dump target.
    #
    # See the kdump.conf(5) man page for details of configuration directives

    #raw /dev/sda5
    #ext3 /dev/sda3
    #ext3 LABEL=/boot
    #ext3 UUID=03138356-5e61-4ab3-b58e-27507ac41937
    #net my.server.com:/export/tmp
    #net user@my.server.com
    path /var/crash
    core_collector makedumpfile -c --message-level 1
    #core_collector cp --sparse=always
    #link_delay 60
    #kdump_post /var/crash/scripts/kdump-post.sh
    #extra_bins /usr/bin/lftp
    #disk_timeout 30
    #extra_modules gfs2
    #options modulename options
    #default shell
    #sshkey /root/.ssh/kdump_id_rsa
I also noticed that my /boot/grub/grub.conf differs from the sample grub.conf in the tutorial. They differ on two lines:
    From tutorial
    kernel /vmlinuz-2.6.32-220.el6.x86_64 ro root=/dev/sda3
    initrd /initramfs-2.6.32-220.el6.x86_64.img

    From own conf
    kernel /vmlinuz-2.6.18-308.el5 ro root=LABEL=/
    initrd /initrd-2.6.18-308.el5.img
Could these lines be what prevents kdump from starting?
EDIT 1
Contents of /var/log/messages:
    Feb 25 02:18:28 61540 kernel: Command line: ro root=LABEL=/ crashkernel=128M
    Feb 25 02:18:28 61540 kernel: BIOS-provided physical RAM map:
    Feb 25 02:18:28 61540 kernel: BIOS-e820: 0000000000010000 - 000000000009a000 (usable)
    Feb 25 02:18:28 61540 kernel: BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
    Feb 25 02:18:28 61540 kernel: BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
    Feb 25 02:18:28 61540 kernel: BIOS-e820: 0000000000100000 - 00000000cfda0000 (usable)
    Feb 25 02:18:28 61540 kernel: BIOS-e820: 00000000cfda0000 - 00000000cfdd1000 (ACPI NVS)
    Feb 25 02:18:28 61540 kernel: BIOS-e820: 00000000cfdd1000 - 00000000cfe00000 (ACPI data)
    Feb 25 02:18:28 61540 kernel: BIOS-e820: 00000000cfe00000 - 00000000cff00000 (reserved)
    Feb 25 02:18:28 61540 kernel: BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
    Feb 25 02:18:28 61540 kernel: BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
    Feb 25 02:18:28 61540 kernel: BIOS-e820: 0000000100000000 - 000000042f000000 (usable)
    Feb 25 02:18:28 61540 kernel: DMI 2.4 present.
    Feb 25 02:18:28 61540 kernel: No NUMA configuration found
    Feb 25 02:18:28 61540 kernel: Faking a node at 0000000000000000-000000042f000000
    Feb 25 02:18:28 61540 kernel: Bootmem setup node 0 0000000000000000-000000042f000000
    Feb 25 02:18:28 61540 kernel: Memory for crash kernel (0x0 to 0x0) notwithin permissible range
    Feb 25 02:18:28 61540 kernel: disabling kdump
    Feb 25 02:44:39 61540 kdump: No crashkernel parameter was specified or crashkernel memory reservation failed
    Feb 25 02:44:39 61540 kdump: failed to start up
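The lines that matter in a log like this are easy to miss among the BIOS memory map entries. As a minimal sketch, something like the following could pull them out of a log file. The function name `find_kdump_errors` is made up for illustration, and the patterns are simply the message texts from the excerpt above, so other kernel or kexec-tools versions may word them differently.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a syslog file that suggest the
# crash kernel memory reservation failed or kdump did not start.
# The patterns are the message texts quoted in the log excerpt above.
find_kdump_errors() {
    logfile="$1"
    # grep exits non-zero when no line matches, so the function's exit
    # status also tells the caller whether anything suspicious was found.
    grep -E 'Memory for crash kernel|disabling kdump|kdump: .*(failed|No crashkernel)' "$logfile"
}
```

It could be called as `find_kdump_errors /var/log/messages`. For the log above it would print the last four lines.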
EDIT 2
Or should I change `ro root=LABEL=/` to `ro root=/dev/sda3`?
    title CentOS (2.6.18-308.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-308.el5 ro root=LABEL=/ crashkernel=128M
        initrd /initrd-2.6.18-308.el5.img
It looks like you put the `crashkernel` parameter on a new line. That is the reason for the "Kdump is not operational" message. All kernel parameters must be on the same line as `kernel`:

    title CentOS (2.6.18-308.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-308.el5 ro root=LABEL=/ crashkernel=128M
        initrd /initrd-2.6.18-308.el5.img
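To check this without reading the file by eye, a rough sketch along these lines might help. The function name `kernel_lines_have_crashkernel` is hypothetical, and it assumes a GRUB legacy config in which each boot entry has a single `kernel` line.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every non-comment "kernel" line in a
# GRUB legacy config carries a crashkernel= parameter on that same line.
kernel_lines_have_crashkernel() {
    conf="$1"
    # Collect kernel lines; commented-out ones start with '#' and are skipped.
    # If there is no kernel line at all, report failure.
    lines=$(grep -E '^[[:space:]]*kernel[[:space:]]' "$conf") || return 1
    # grep -v selects kernel lines WITHOUT crashkernel=; finding one is a failure.
    if printf '%s\n' "$lines" | grep -qv 'crashkernel='; then
        return 1
    fi
    return 0
}
```

For example, `kernel_lines_have_crashkernel /boot/grub/grub.conf && echo "crashkernel is on the kernel line"`.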
After rebooting, look at /var/log/messages and you should see something like this:

    localhost kdump: kexec: loaded kdump kernel
    localhost kdump: started up
and:
    # /etc/init.d/kdump start
    Starting kdump:                                            [  OK  ]
    # /etc/init.d/kdump status
    Kdump is operational
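Besides the init script status, it may also be worth checking whether the kernel actually reserved the memory. The sketch below assumes that a successful reservation shows up as a "Crash kernel" entry in /proc/iomem, which is how it normally appears on Linux but is not confirmed here for this particular kernel. The function name `crash_kernel_reserved` is made up for illustration, and the path is a parameter so it can be tried on a saved copy.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given iomem listing contains a
# "Crash kernel" region, printing the matching line(s).
# Assumption: the running kernel exposes the reservation in /proc/iomem.
crash_kernel_reserved() {
    iomem="${1:-/proc/iomem}"
    grep 'Crash kernel' "$iomem"
}
```

Run as root with `crash_kernel_reserved`. No output and a non-zero exit status would suggest the reservation did not happen.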
    kdump: No crashkernel parameter was specified or crashkernel memory reservation failed
    kdump: failed to start up
According to this document, try this:
crashkernel=128M@16M
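The part after `@` is the offset in the kernel's `crashkernel=size@offset` syntax, so this asks for 128 MB reserved starting at the 16 MB mark of physical memory. As a cautious sketch of making that change, the following prints a modified copy of the config rather than editing it in place. The function name `add_crashkernel_offset` is hypothetical, and it assumes the parameter already sits on the `kernel` line without an offset.

```shell
#!/bin/sh
# Hypothetical helper: print a GRUB legacy config with "@16M" appended to any
# offset-less crashkernel=<size> found on a kernel line.
# Output goes to stdout only; the input file is left untouched.
add_crashkernel_offset() {
    conf="$1"
    # Only lines starting with "kernel" are edited. A value that already has
    # an offset (e.g. 128M@16M) does not match and is left as it is.
    sed -E '/^[[:space:]]*kernel[[:space:]]/ s/(crashkernel=[0-9]+[KMG])([[:space:]]|$)/\1@16M\2/' "$conf"
}
```

One way to use it: `add_crashkernel_offset /boot/grub/grub.conf > /tmp/grub.conf.new`, compare the two files with `diff`, and copy the new one into place only if the change looks right. A reboot is needed for the new parameter to take effect.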