Debugging the Linux kernel with qemu and gdb

Recently I wanted to run the Linux kernel under a debugger to understand the finer points of the networking code. It's easy to do this using qemu's gdb support, but the details you need are scattered in various places. This post pulls them together.

You can debug the kernel in the context of a full VM image. But qemu provides a more convenient alternative: You can give the guest kernel access to the host filesystem (this uses the 9P remote filesystem, running over the virtio transport rather than a network). That way, we can make use of binaries we have lying around on the host system.

First, we have to build the kernel. Of course, in order to use binaries from the host system, the architecture should match. And to be able to explore the running kernel, gdb needs debug information, so your .config should have:

CONFIG_DEBUG_INFO=y

For filesystem access, you'll need virtio and 9P support:

CONFIG_VIRTIO=y
CONFIG_VIRTIO_PCI=y
CONFIG_NET_9P=y
CONFIG_NET_9P_VIRTIO=y
CONFIG_9P_FS=y

Other than that, the kernel configuration can be bare-bones. You don't need most device drivers. You won't need kernel module support. You won't need normal filesystems (just procfs and sysfs). So you can start from the default kernel config and turn a lot of things off. My .config for 3.17-rc5 on x86-64 is here.
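Before building, it can be worth confirming that the .config really carries the options listed above. Here is a minimal sketch of such a check; check_kernel_config is a hypothetical helper name of mine, not something shipped with the kernel tree, and it only accepts options that are built in (=y), on the assumption that a kernel without module support cannot load them later.

```shell
#!/bin/sh
# Report which of the options named in this post are missing from a .config.
# check_kernel_config is a hypothetical helper, not part of the kernel tree.
check_kernel_config() {
    config_file=$1
    missing=0
    for opt in DEBUG_INFO VIRTIO VIRTIO_PCI NET_9P NET_9P_VIRTIO 9P_FS; do
        # A built-in option appears as an exact "CONFIG_<name>=y" line.
        if ! grep -qx "CONFIG_${opt}=y" "$config_file"; then
            echo "missing: CONFIG_${opt}=y"
            missing=1
        fi
    done
    # Exit status is 0 when every option is present, 1 otherwise.
    return "$missing"
}
```

From the top of the kernel tree, you might run it as: check_kernel_config .config && echo "config looks usable"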

Since we are using the host filesystem, we are now ready to launch the kernel under qemu and gdb. I'm using qemu-1.6.2 and gdb-7.7.1 from Fedora 20. Start qemu in one terminal window (as an ordinary user; you don't need root for this) with:

$ qemu-system-x86_64 -s -nographic \
        -kernel <kernel tree path>/arch/x86/boot/bzImage \
        -fsdev local,id=root,path=/,readonly,security_model=none \
        -device virtio-9p-pci,fsdev=root,mount_tag=/dev/root \
        -append 'root=/dev/root ro rootfstype=9p rootflags=trans=virtio console=ttyS0 init=/bin/sh'

Here:

  • The -s option enables gdb target support.
  • The -kernel option boots the specified kernel directly, rather than going through the normal emulated boot process.
  • -fsdev ...,path=/,readonly,security_model=none tells qemu to give read-only access to the host filesystem (see this follow-up for read-write access).
  • The -append option adds kernel command line parameters to tell the kernel to use the 9P filesystem as the root filesystem, to use a serial console (i.e. the terminal where you ran qemu), and to boot directly into a shell rather than into /sbin/init.
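If you relaunch qemu often, the invocation can be wrapped in a small shell function. The following is only a sketch: run_debug_kernel and the QEMU override variable are hypothetical names of mine, while the qemu options are exactly the ones described above. The function takes the kernel tree path as its argument and refuses to start if no bzImage has been built there.

```shell
#!/bin/sh
# Launch a freshly built kernel under qemu with the gdb stub enabled.
# run_debug_kernel and the QEMU variable are hypothetical names for this
# sketch; the qemu options themselves are the ones used in this post.
run_debug_kernel() {
    kernel_tree=$1
    bzimage="$kernel_tree/arch/x86/boot/bzImage"
    if [ ! -f "$bzimage" ]; then
        echo "no kernel image at $bzimage" >&2
        return 1
    fi
    # -s opens the gdb stub on TCP port 1234; the host's / is exported
    # read-only to the guest over 9P and mounted as its root filesystem.
    "${QEMU:-qemu-system-x86_64}" -s -nographic \
        -kernel "$bzimage" \
        -fsdev local,id=root,path=/,readonly,security_model=none \
        -device virtio-9p-pci,fsdev=root,mount_tag=/dev/root \
        -append 'root=/dev/root ro rootfstype=9p rootflags=trans=virtio console=ttyS0 init=/bin/sh'
}
```

Setting QEMU lets you point the function at a different qemu binary without editing it.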

You should see the kernel boot messages appear, ending with a shell prompt. The qemu console obeys some key sequences beginning with control-A: Most importantly, C-a h for help and C-a x to terminate qemu.

Then in another terminal run gdb with:

$ gdb <kernel tree path>/vmlinux
GNU gdb (GDB) Fedora 7.7.1-18.fc20
[...]
Reading symbols from vmlinux...done.
(gdb) target remote :1234
Remote debugging using :1234
atomic_read (v=<optimized out>) at ./arch/x86/include/asm/atomic.h:27
27              return (*(volatile int *)&(v)->counter);

The guest kernel is stopped at this point, so you can set breakpoints etc. before resuming it with continue.
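The attach, break, and continue steps can also be scripted with gdb's -ex option, which runs a command after the symbol file has been loaded. A minimal sketch follows; attach_gdb and the GDB override variable are hypothetical names of mine, and the breakpoint location is whatever you pass in (ip_rcv, the kernel's IPv4 receive function, is one possible choice for poking at the networking code, not something this setup requires).

```shell
#!/bin/sh
# Attach gdb to the qemu gdb stub, set one breakpoint, and resume the guest.
# attach_gdb and the GDB variable are hypothetical names for this sketch.
attach_gdb() {
    kernel_tree=$1
    breakpoint=$2
    # The -ex commands run in order: connect to qemu's stub on port 1234,
    # set the breakpoint, then let the stopped guest continue.
    "${GDB:-gdb}" \
        -ex 'target remote :1234' \
        -ex "break $breakpoint" \
        -ex 'continue' \
        "$kernel_tree/vmlinux"
}
```

With qemu already running in the other terminal, attach_gdb followed by the kernel tree path and ip_rcv would leave you at a gdb prompt the first time the guest receives an IPv4 packet.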

A few caveats:

Because we passed init=/bin/sh on the kernel command line, there was no init system to set up various things that are normally present on a Linux system. For instance, the proc and sys filesystems are missing, and the loopback network interface has not been started. You can fix those issues with the following commands:

sh-4.2# export PATH=$PATH:/sbin:/usr/sbin
sh-4.2# mount -t proc none /proc
sh-4.2# mount -t sysfs none /sys
sh-4.2# ip addr add 127.0.0.1 dev lo
sh-4.2# ip link set dev lo up

Another consequence of starting bash directly from the kernel is this warning:

sh: cannot set terminal process group (-1): Inappropriate ioctl for device
sh: no job control in this shell

Due to this lack of job control, you won't be able to interrupt commands with control-C. So be careful that you don't lose your shell to a command that runs forever!

qemu has a -S option which doesn't start the guest until you connect with gdb and tell it to continue, so you can use gdb to debug the boot process. But I've found that doing that with x86_64 kernels tends to trigger a recent bug in qemu's gdb support. (That bug only affects x86_64 guests, so you can avoid it by building the emulated kernel for i386 or another arch. But then you can't share the filesystem from an x86_64 host.)