From Fedora Project Wiki

Fedora Core 6 Kdump Kexec HowTo

Introduction

Kexec and kdump are new features in the mainline 2.6 kernel. Major portions of these features are included in Fedora Core 5 and later releases. Their purpose is to provide faster reboots and to produce reliable kernel vmcores for diagnostic purposes. More information on the kexec/kdump project can be found at http://lse.sourceforge.net/kdump/.

Overview

Kexec

Kexec is a fastboot mechanism that boots a Linux kernel from the context of an already running kernel without going through the BIOS. BIOS initialization can be very time consuming, especially on big servers with numerous peripherals, so skipping it saves a lot of time for developers who reboot a machine many times.

Kdump

Kdump is a new kexec-based kernel crash dumping mechanism. It is reliable because the crash dump is captured from the context of a freshly booted kernel and not from the context of the crashed kernel. Kdump uses kexec to boot into a second kernel whenever the system crashes. This second kernel, often called a capture kernel, boots with very little memory and captures the dump image.

The first kernel reserves a section of memory that the second kernel uses to boot. Because kexec boots the capture kernel without going through the BIOS, the contents of the first kernel's memory are preserved. That preserved memory is essentially the kernel crash dump.

Currently, the standard kernel and capture kernel (kernel-kdump) are two different entities, but work is underway to make the standard kernel relocatable (within memory), and thus usable as a capture kernel, eliminating the need for a separate kdump kernel. This feature is currently targeted for delivery as part of Fedora Core 6's General Availability (GA) release.


How to configure kexec

1. Install kexec-tools:

yum install kexec-tools

2. Load a kernel with kexec:

kver=$(uname -r)
kexec -l /boot/vmlinuz-$kver --initrd=/boot/initrd-$kver.img \
--command-line="$(cat /proc/cmdline)"
The above command loads the currently running kernel, so the next reboot goes back into it. To load a different kernel, substitute its version string for the output of uname -r.
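
The load step can be wrapped so that it fails early when the kernel image or initrd is missing. The following is only a sketch: kexec_load_cmd is a hypothetical helper name, the optional third argument (boot directory) is an assumption added for experimenting outside /boot, and the helper prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: print the kexec load command for a given kernel version.
# kexec_load_cmd is a hypothetical helper, not part of kexec-tools.
kexec_load_cmd() {
    kver=$1                 # kernel version string, e.g. the output of uname -r
    cmdline=$2              # kernel command line to pass to the new kernel
    bootdir=${3:-/boot}     # assumption: overridable boot directory, defaults to /boot
    kernel="$bootdir/vmlinuz-$kver"
    initrd="$bootdir/initrd-$kver.img"

    # Refuse to build a command for files that are not readable.
    [ -r "$kernel" ] || { echo "missing kernel image: $kernel" >&2; return 1; }
    [ -r "$initrd" ] || { echo "missing initrd: $initrd" >&2; return 1; }

    printf 'kexec -l %s --initrd=%s --command-line="%s"\n' \
        "$kernel" "$initrd" "$cmdline"
}

# Example (prints the command; run the printed line as root to load the kernel):
#   kexec_load_cmd "$(uname -r)" "$(cat /proc/cmdline)"
```

Printing instead of executing keeps the helper safe to try on any machine; the printed line is the same kexec -l invocation shown above.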

3. Reboot the system, taking note that it should bypass the BIOS:

reboot
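
Before rebooting, it can help to confirm that a kernel image is actually staged. The sketch below assumes the running kernel exposes the /sys/kernel/kexec_loaded flag (kernels of this era may not); kexec_state is a hypothetical helper name, and the path is a parameter so that another flag file can be checked the same way.

```shell
#!/bin/sh
# Sketch: report whether a sysfs flag file says a kexec image is loaded.
# kexec_state is a hypothetical helper; the default path is an assumption
# about the running kernel and may not exist on older kernels.
kexec_state() {
    flag=${1:-/sys/kernel/kexec_loaded}
    if [ ! -r "$flag" ]; then
        echo "unknown"      # flag file not available on this kernel
        return 2
    fi
    if [ "$(cat "$flag")" = "1" ]; then
        echo "loaded"
    else
        echo "not loaded"
        return 1
    fi
}

# Example:
#   kexec_state && reboot
```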

How to configure kdump

1. Make sure kernel-kdump, kexec-tools, and crash are installed:

yum install kernel-kdump kexec-tools crash

2. To do a postmortem debug analysis, install the kernel-debuginfo package:

yum --enablerepo=\*debuginfo install kernel-debuginfo

3. Modify the boot parameters to reserve a chunk of memory for the capture kernel. For i386 and x86_64, edit /etc/grub.conf and append crashkernel=128M@16M to the end of the kernel line.

This is an example of /etc/grub.conf with the kdump options added:

#
#boot=/dev/hda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Fedora Core (2.6.17-1.2570.fc6)
        root (hd0,0)
        kernel /vmlinuz-2.6.17-1.2570.fc6 ro root=/dev/VolGroup00/root crashkernel=128M@16M
        initrd /initrd-2.6.17-1.2570.fc6.img

This is an example of /etc/yaboot.conf (for ppc64 machines only) with the kdump options added:


boot=/dev/sda1
init-message=Welcome to Fedora Core!\nHit <TAB> for boot options

partition=2
timeout=80
install=/usr/lib/yaboot/yaboot
delay=5
enablecdboot
enableofboot
enablenetboot
nonvram
fstype=raw

image=/vmlinuz-2.6.17-1.2570.fc6
        label=linux
        read-only
        initrd=/initrd-2.6.17-1.2570.fc6.img
        append="root=LABEL=/ crashkernel=128M@32M"

4. After making the above changes, reboot the system. The 128M of memory (starting at an offset of 16M in the i386/x86_64 example or 32M in the ppc64 example) is left untouched by the normal system and reserved for the capture kernel. Note that free -m now reports 128M less memory than it would without this parameter, which is expected.

It may be possible to use less than 128M, but testing with only 64M has proven unreliable.
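
After the reboot, a quick way to confirm that the reservation request reached the kernel is to look for the crashkernel argument on the running command line. The following is a minimal sketch; crashkernel_param is a hypothetical helper name, and it only understands the SIZE@OFFSET form used in this article.

```shell
#!/bin/sh
# Sketch: print "SIZE OFFSET" from a crashkernel=SIZE@OFFSET argument.
# crashkernel_param is a hypothetical helper; it takes the command line
# as a string so it can be tried on any text, not only /proc/cmdline.
# The body runs in a subshell so that disabling globbing stays local.
crashkernel_param() (
    set -f                          # do not expand wildcards while splitting
    for arg in $1; do               # intentional word splitting on whitespace
        case $arg in
            crashkernel=*@*)
                value=${arg#crashkernel=}            # e.g. 128M@16M
                echo "${value%%@*} ${value#*@}"      # size, then offset
                return 0
                ;;
        esac
    done
    return 1                        # no crashkernel=SIZE@OFFSET argument found
)

# Example:
#   crashkernel_param "$(cat /proc/cmdline)" || echo "no crashkernel reservation requested"
```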

5. Now that the reserved memory region is set up, turn on the kdump init script:

chkconfig kdump on

6. Start up kdump:

service kdump start

7. Starting the service should load the kernel-kdump image via kexec, leaving the system ready to capture a vmcore upon crashing. To test this, force a crash using sysrq:

echo "c" > /proc/sysrq-trigger

This causes the kernel to panic, followed by the system restarting into the kdump kernel. When the boot process gets to the point where it starts the kdump service, the vmcore should be automatically copied, from /proc/vmcore, out to disk (by default, to /var/crash/<YYYY-MM-DD-HH:MM>/vmcore). The system then reboots back into the normal kernel.
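
Once the system is back in the normal kernel, the most recent dump can be located by directory name, since the default directories are named by timestamp. This is a sketch under that assumption; latest_vmcore is a hypothetical helper name, and the crash directory is a parameter that defaults to /var/crash.

```shell
#!/bin/sh
# Sketch: print the path of the newest vmcore under a crash directory.
# latest_vmcore is a hypothetical helper. It assumes subdirectories are
# named YYYY-MM-DD-HH:MM, so sorted glob order is chronological order.
latest_vmcore() {
    crashdir=${1:-/var/crash}
    newest=""
    for dir in "$crashdir"/*/; do
        # Keep the last directory in sorted order that contains a vmcore.
        if [ -f "${dir}vmcore" ]; then
            newest="${dir}vmcore"
        fi
    done
    [ -n "$newest" ] || return 1    # no dump found
    echo "$newest"
}

# Example:
#   latest_vmcore || echo "no vmcore found under /var/crash"
```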

8. In the normal kernel, the previously installed crash utility can be used in conjunction with the previously installed kernel-debuginfo to perform postmortem analysis:

crash /usr/lib/debug/lib/modules/2.6.17-1.2570.fc6/vmlinux /var/crash/2006-08-23-15:34/vmcore

crash> bt

Details of using crash for postmortem analysis are outside the scope of this article.