Fedora Core 6 Kdump Kexec HowTo
Introduction
Kexec and kdump are new features in the 2.6 mainline kernel. Major portions of these features are available in Fedora Core 5 and later releases. Their purpose is to provide faster reboots and reliable kernel vmcores for diagnostic purposes. More information on the kexec/kdump project can be found at http://lse.sourceforge.net/kdump/.
Overview
Kexec
Kexec is a fastboot mechanism that allows booting a Linux kernel from the context of an already running kernel without going through the BIOS. BIOS initialization can be very time-consuming, especially on big servers with numerous peripherals, so kexec can save a lot of time for developers who reboot a machine many times.
Kdump
Kdump is a new kernel crash dumping mechanism designed for reliability. The crash dump is captured from the context of a freshly booted kernel, not from the context of the crashed kernel. Kdump uses kexec to boot into a second kernel whenever the system crashes. This second kernel, often called a capture kernel, boots with very little memory and captures the dump image.
The first kernel reserves a section of memory that the second kernel uses to boot. Because kexec boots the capture kernel without going through the BIOS, the contents of the first kernel's memory are preserved, and that preserved memory is essentially the kernel crash dump.
Currently, the standard kernel and capture kernel (kernel-kdump) are two different entities, but work is underway to make the standard kernel relocatable (within memory), and thus usable as a capture kernel, eliminating the need for a separate kdump kernel. This feature is currently targeted for delivery as part of Fedora Core 6's General Availability (GA) release.
How to configure kexec
1. Install kexec-tools:
yum install kexec-tools
1. Load a kernel with kexec:
kver=`uname -r`
kexec -l /boot/vmlinuz-$kver --initrd=/boot/initrd-$kver.img \
    --command-line="`cat /proc/cmdline`"
1. Reboot the system, taking note that it should bypass the BIOS:
reboot
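The load step above can be wrapped in a small sanity check so that a missing kernel or initrd is caught before kexec is invoked. The following is a minimal sketch, not part of kexec-tools: the helper names kexec_load_cmd and kexec_files_present are hypothetical, and the file layout under /boot is assumed to match the one used in this article.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with kexec-tools).

# Print the kexec load command for a kernel version and boot command line.
# Nothing is executed; the output is meant to be reviewed first.
kexec_load_cmd() {
    kver=$1      # kernel version, e.g. the output of `uname -r`
    cmdline=$2   # boot parameters, e.g. the contents of /proc/cmdline
    printf 'kexec -l /boot/vmlinuz-%s --initrd=/boot/initrd-%s.img --command-line="%s"\n' \
        "$kver" "$kver" "$cmdline"
}

# Succeed only if both the kernel image and its initrd are readable.
# The second argument (boot directory) defaults to /boot.
kexec_files_present() {
    kver=$1
    bootdir=${2:-/boot}
    [ -r "$bootdir/vmlinuz-$kver" ] && [ -r "$bootdir/initrd-$kver.img" ]
}
```

For example, running kexec_files_present "`uname -r`" && kexec_load_cmd "`uname -r`" "`cat /proc/cmdline`" prints the command for the running kernel only when both files exist; the printed command still has to be run as root.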
How to configure kdump
1. Make sure kernel-kdump, kexec-tools, and crash are installed:
yum install kernel-kdump kexec-tools crash
1. To do a postmortem debug analysis, install the kernel-debuginfo package:
yum --enablerepo=\*debuginfo install kernel-debuginfo
1. Modify some boot parameters to reserve a chunk of memory for the capture kernel. For i386 and x86_64, edit /etc/grub.conf, and append crashkernel=128M@16M to the end of the kernel line.
This is an example of /etc/grub.conf with the kdump options added:
#
#boot=/dev/hda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Fedora Core (2.6.17-1.2570.fc6)
        root (hd0,0)
        kernel /vmlinuz-2.6.17-1.2570.fc6 ro root=/dev/VolGroup00/root crashkernel=128M@16M
        initrd /initrd-2.6.17-1.2570.fc6.img
This is an example of /etc/yaboot.conf (for ppc64 machines only) with the kdump options added:
boot=/dev/sda1
init-message=Welcome to Fedora Core!\nHit <TAB> for boot options
partition=2
timeout=80
install=/usr/lib/yaboot/yaboot
delay=5
enablecdboot
enableofboot
enablenetboot
nonvram
fstype=raw

image=/vmlinuz-2.6.17-1.2570.fc6
        label=linux
        read-only
        initrd=/initrd-2.6.17-1.2570.fc6.img
        append="root=LABEL=/ crashkernel=128M@32M"
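For the grub.conf case, the edit can also be scripted rather than done by hand. The sketch below is one possible approach, assuming a grub.conf in which each boot entry has a single line beginning with "kernel"; the helper name add_crashkernel is hypothetical. It reads a configuration on standard input and writes the modified copy to standard output, leaving the original file untouched.

```shell
#!/bin/sh
# Hypothetical helper: append a crashkernel= parameter to every "kernel"
# line that does not already have one. Reads stdin, writes stdout.
add_crashkernel() {
    param=${1:-crashkernel=128M@16M}   # default taken from this article
    awk -v p="$param" '
        # Match kernel lines (optionally indented) without crashkernel=.
        /^[ \t]*kernel[ \t]/ && !/crashkernel=/ {
            sub(/[ \t]+$/, "")         # drop trailing whitespace
            $0 = $0 " " p              # append the parameter
        }
        { print }                      # pass every line through
    '
}
```

A cautious way to use it is to write the result to a scratch file, for example add_crashkernel < /etc/grub.conf > /tmp/grub.conf.new, and compare the two files with diff before replacing anything. Entries whose "kernel" line is not the Linux kernel (as in some hypervisor setups) would need to be handled by hand.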
1. After making the above changes, reboot the system. The 128M of memory (starting at the 16M offset on i386 and x86_64, or the 32M offset on ppc64) is left untouched by the normal system and reserved for the capture kernel. Note that the output of free -m shows 128M less memory than without this parameter, which is expected.
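Besides comparing free -m output, it may help to confirm that the running kernel was actually booted with the crashkernel parameter. The following sketch extracts the value from a command-line string; the helper name crashkernel_param is hypothetical, and the string is expected to come from /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper: print the value of crashkernel= found in a kernel
# command line string, or return non-zero if the parameter is absent.
crashkernel_param() {
    # $1 is deliberately unquoted so the string splits into words.
    for word in $1; do
        case $word in
            crashkernel=*)
                printf '%s\n' "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

On a system configured as described, crashkernel_param "`cat /proc/cmdline`" would be expected to print 128M@16M (or 128M@32M on ppc64). The size and offset can then be separated with the shell expansions ${value%@*} and ${value#*@}.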
1. Now that the reserved memory region is set up, turn on the kdump init script:
chkconfig kdump on
1. Start up kdump:
service kdump start
1. This should load the kernel-kdump image via kexec, leaving the system ready to capture a vmcore upon crashing. To test this, force-crash the system using sysrq:
echo "c" > /proc/sysrq-trigger
This causes the kernel to panic, followed by the system restarting into the kdump kernel. When the boot process gets to the point where it starts the kdump service, the vmcore should be automatically copied from /proc/vmcore out to disk (by default, to /var/crash/<YYYY-MM-DD-HH:MM>/vmcore). The system then reboots back into the normal kernel.
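After the system is back in the normal kernel, the newest dump can be located by its timestamped directory. This is a minimal sketch assuming the default /var/crash/<YYYY-MM-DD-HH:MM>/vmcore layout described above; the helper name latest_vmcore is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the most recent vmcore under a
# crash directory (default /var/crash), or return non-zero if none exists.
latest_vmcore() {
    crashdir=${1:-/var/crash}
    latest=
    # Directory names are timestamps (YYYY-MM-DD-HH:MM), and the shell
    # expands the glob in sorted order, so the last match is the newest.
    for f in "$crashdir"/*/vmcore; do
        if [ -f "$f" ]; then
            latest=$f
        fi
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

Calling latest_vmcore with no argument searches /var/crash; a different dump location can be passed as the first argument.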
1. In the normal kernel, the previously installed crash utility can be used in conjunction with the previously installed kernel-debuginfo to perform postmortem analysis:
crash /usr/lib/debug/lib/modules/2.6.17-1.2570.fc6/vmlinux /var/crash/2006-08-23-15:34/vmcore
crash> bt
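The crash invocation follows a fixed pattern, so it can be assembled from the version of the kernel that crashed and the path to the dump. The sketch below only prints the command; the helper name crash_cmd is hypothetical, and the debuginfo path is assumed to follow the layout shown in this article.

```shell
#!/bin/sh
# Hypothetical helper: print the crash command for a kernel version and
# a vmcore path. The version must be that of the kernel that crashed,
# since crash needs the matching debuginfo vmlinux.
crash_cmd() {
    kver=$1     # e.g. 2.6.17-1.2570.fc6
    vmcore=$2   # e.g. /var/crash/2006-08-23-15:34/vmcore
    printf 'crash /usr/lib/debug/lib/modules/%s/vmlinux %s\n' "$kver" "$vmcore"
}
```

For example, crash_cmd 2.6.17-1.2570.fc6 /var/crash/2006-08-23-15:34/vmcore prints the same command line used in the step above.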
Details of using crash for postmortem analysis are outside the scope of this article.