Fuzzing Yocto Kernel Modules with Syzkaller

This is a sequel to the similarly named blog post, Black-Box Fuzzing Kernel Modules in Yocto. In that post, I briefly went through what fuzzing is and presented an easy but fairly naive approach to it. This time I will present a more refined approach using Syzkaller.

Syzkaller

Syzkaller is an unsupervised coverage-guided kernel fuzzer from Google. Simply put, it generates programs that perform kernel syscalls in varying order with evolving parameters, and then analyzes the coverage to see which programs reach new code. Those programs are added to the corpus, which is then used to create even more programs with even larger coverage. Sounds simple? Well, the architecture looks like this:

Could be worse. There are two core components in the system: syz-manager and syz-executor. syz-manager runs on the “main fuzzing server”, and syz-executor runs on the fuzzing target. The target can be either an actual hardware device or a QEMU emulator system that syz-manager controls. syz-manager is responsible for the fuzzing work: generating the test programs and storing the corpus and crashes. The executor receives the test programs, runs them, and reports back the results. This communication happens over a network interface.

Smoke Test

A good way to smoke test the setup is to run Syzkaller with the default built-in syscall definitions. This ensures that your image is suitable for fuzzing without having to worry that your own additions are breaking things. I’m using Yocto to build an image for x86_64 QEMU. You can use pretty much anything to build the image, as long as you end up with a kernel, the kernel source and object files, and a disk image for the rootfs. In addition to x86_64, there are plenty of other supported architectures; the setup guide lists arm, arm64, and riscv64, for example.

Building the Image

First, the disk image requirements. The image should allow SSH root login with either key authentication or no password, which naturally means it needs networking support. A DHCP client for getting an IP address is optional but strongly recommended, because the QEMU port forwarding that syz-manager relies on needs the guest to have an address. Yocto’s default QEMU core-image-base image needs no special changes, as it already has network capabilities, a passwordless root, and a DHCP client.

The kernel should be built with instrumentation and a few other debugging options enabled. The exact options depend on the target architecture; the x86_64 setup guide lists the following as required “at the very least”:

CONFIG_KCOV=y
CONFIG_DEBUG_INFO_DWARF4=y
CONFIG_KASAN=y
CONFIG_KASAN_INLINE=y
CONFIG_CONFIGFS_FS=y
CONFIG_SECURITYFS=y
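
Before booting anything, it can be worth confirming that the built kernel really has these options enabled, since a config fragment that silently fails to apply is easy to miss. The following is a minimal sketch, not part of the original setup: the function name check_kconfig is made up for this example, and it assumes you pass it the path to the generated .config (usually found in the kernel build directory).

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): verify that a kernel
# .config enables the minimum options listed above.
# Returns 0 if every option is set to "y", 1 otherwise.
check_kconfig() {
    config_file="$1"
    missing=0
    for opt in CONFIG_KCOV CONFIG_DEBUG_INFO_DWARF4 CONFIG_KASAN \
               CONFIG_KASAN_INLINE CONFIG_CONFIGFS_FS CONFIG_SECURITYFS; do
        # Match the exact line "OPTION=y"; a line such as
        # "# OPTION is not set" does not match and counts as missing.
        if ! grep -q "^${opt}=y\$" "$config_file"; then
            echo "missing: ${opt}"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_kconfig against the .config prints one "missing:" line per absent option and returns non-zero if any are absent, so it can gate a build script.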

The full documentation of the kernel configuration can be found here, and it lists plenty of additional configuration flags. I added both the minimum and the maximum configuration to my meta-fuzzing layer. I’ll be using the minimum configuration.

Writing Syzkaller Configuration

Next, we need to write the configuration for Syzkaller. Let’s pick the example QEMU configuration and modify it a bit. This configuration defines the web interface address, the location of the required files, the syscalls to call, and the fuzzing target. We are going to use emulated QEMU targets in this test:

{
	"target": "linux/amd64",
	"http": "<YOUR-IP-HERE>:56741",
	"workdir": "./workdir",
	"kernel_obj": "/poky/build/tmp/work/qemux86_64-poky-linux/linux-yocto/6.6.50+git/linux-qemux86_64-standard-build/",
	"kernel_src": "/poky/build/tmp/work-shared/qemux86-64/kernel-source",
	"image": "/poky/build/tmp/deploy/images/qemux86-64/core-image-base-qemux86-64.rootfs.ext4",
	"syzkaller": ".",
	"disable_syscalls": ["keyctl", "add_key", "request_key"],
	"procs": 4,
	"type": "qemu",
	"vm": {
		"count": 2,
		"cpu": 2,
		"mem": 2048,
		"kernel": "/poky/build/tmp/deploy/images/qemux86-64/bzImage",
		"cmdline": "ip=dhcp"
	}
}

The individual configuration items are quite well documented in the manager configuration and QEMU VM configuration files. The number of virtual machines and processes here is quite low because I have a poor PC for running these tests, so you may want to increase them.

Quite literally what happens when I run Syzkaller with 4 QEMU instances.

This configuration assumes you have the Poky folder directly under the root folder; adjust the paths if necessary. It also assumes that syz-manager is launched from the root of the Syzkaller repo. The ip=dhcp cmdline option is also worth noting. It is passed to the kernel, and it should ensure that the Ethernet interface uses DHCP to get an IP address from QEMU. I think you can also hardcode the IP address if you know what it should be. tcpdump can be used to check the incoming ARP requests to see what the IP address is expected to be. There may be an easier way, but that’s what I did when poking around.
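
Since the configuration is mostly absolute paths, a wrong path is an easy mistake to make when adapting it. The sketch below checks that the path-valued settings point at something that exists. Both function names are made up for this example, and cfg_value is not a real JSON parser: it assumes the file is laid out like the configuration above, with one "key": "value" pair per line.

```shell
#!/bin/sh
# Hypothetical helper: print the string value of KEY ($2) from a config
# file ($1) laid out with one "key": "value" pair per line.
cfg_value() {
    sed -n "s/^[[:space:]]*\"$2\":[[:space:]]*\"\(.*\)\",\{0,1\}[[:space:]]*\$/\1/p" "$1" | head -n 1
}

# Report every path-valued setting that is unset or missing on disk.
# Returns 0 if all four paths exist, 1 otherwise.
check_cfg_paths() {
    cfg="$1"
    status=0
    for key in kernel_obj kernel_src image kernel; do
        path=$(cfg_value "$cfg" "$key")
        if [ -z "$path" ] || [ ! -e "$path" ]; then
            echo "bad path for ${key}: ${path:-<unset>}"
            status=1
        fi
    done
    return "$status"
}
```

One way to use it is as a guard in front of the manager, for example check_cfg_paths my.cfg && ./bin/syz-manager -config my.cfg, where my.cfg stands for whatever you named the configuration file.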

There’s one weird hack I had to do though. In my experience (and I may be wrong here), it seems that Syzkaller expects a certain format from the kernel source tree, and that expected format is not the actual structure of the kernel source. There seems to be an expectation that under the path defined in kernel_src there is a usr/src/kernel folder that points to the source, otherwise the coverage information generation will fail. However, if I move the kernel source to a usr/src/kernel folder, the report generation will fail because the scripts folder cannot be found anymore. To create a suitable directory structure, I used the following script:

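cd <KERNEL_SOURCE>
# Create usr/src/ inside the kernel source tree
mkdir -p usr/src/
cd usr/src/
# Link usr/src/kernel back to the source root, two levels up
ln -s ../../. kernel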

Again, I’m not sure if this is necessary, it may be that I misconfigured something, but I had to do this to get the syz-manager running, reporting coverage, and formatting error reports.
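
If the kernel source gets re-fetched or the step is scripted, a re-runnable variant of the same workaround may be handy. This is only a sketch under the same assumptions as the commands above: the function name link_kernel_src is made up, and the final check relies on the scripts folder mentioned earlier being present in the source root.

```shell
#!/bin/sh
# Hypothetical helper: create <kernel_src>/usr/src/kernel as a symlink back
# to the source root, skipping the step if the link already exists.
link_kernel_src() {
    src="$1"
    mkdir -p "$src/usr/src" || return 1
    if [ ! -L "$src/usr/src/kernel" ]; then
        # Relative target: usr/src/kernel -> ../.. (the source root)
        ln -s ../.. "$src/usr/src/kernel" || return 1
    fi
    # Sanity check: the scripts folder must be reachable through the link.
    [ -d "$src/usr/src/kernel/scripts" ]
}
```

Calling it twice on the same tree is harmless, and a non-zero return means either the link could not be created or the scripts folder is not visible through it.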

Running Syzkaller

Once the configuration is ready, we can start to wonder what to do with it. The first thing is fetching the Syzkaller source from GitHub. Then we can compile Syzkaller using syz-env, a Docker-based wrapper script that ensures the build environment is the expected one:

./tools/syz-env make

This of course requires Docker. You can also use make directly, but then you need to take care of the build dependencies yourself. If you’re doing cross-arch testing, you need to define the target variables, so check out the setup guide for those. Once the build completes, copy the configuration created in the previous chapter to the root of the Syzkaller repo, and run:

./bin/syz-manager -config <CONFIG_FILE>

syz-manager should start up, and after a moment the QEMU machines should boot up as well. If you navigate your web browser to http://<YOUR-IP-HERE>:56741, you should see the web interface that shows the status of fuzzing, collected coverage and crashes:

If not, you can try adding the -debug option to the syz-manager command to see what’s going wrong. Also, for some reason my web interface doesn’t load if there are no crashes; there’s just an error message about a missing crashes folder. So I had to manually create the ./workdir/crashes folder, and then it seemed to work just fine.

Once you get to the web interface, you can leave the fuzzer running for a while to see how it explores new code paths while growing the corpus. It’s quite unlikely it’ll find a crash, as it’s fuzzing against a mainline kernel, but if it does, you have potentially found a kernel bug!

Adding Custom Syscalls

However, it’s not very interesting to fuzz the mainline kernel. It’s been done already (and is constantly being done). What’s more interesting is fuzzing our own custom kernel module and seeing whether Syzkaller can find a poorly hidden error in it.

I asked ChatGPT to write a small driver that has an IOCTL interface, and then I added a bug to it. The single command in the interface takes a string as an input. It tokenizes the string, and if the string contains five commas, an invalid free will be performed. There are some extra checks to guide the fuzzer towards the crash.
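
When looking at reproducers later, it can help to quickly tell whether a given input string has enough commas to reach the buggy branch. The helper below is a small sketch for that; the name count_commas is made up, and it only models the comma count described above, not the driver’s extra checks.

```shell
#!/bin/sh
# Hypothetical helper: print the number of commas in the given string.
count_commas() {
    # Delete every byte except commas, count what is left, and strip the
    # padding that some wc implementations add around the number.
    printf '%s' "$1" | tr -cd ',' | wc -c | tr -d ' '
}
```

For example, comparing the result against 5 with [ "$(count_commas "$input")" -eq 5 ] tells you whether $input contains exactly five commas.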

In-tree vs. Out-of-tree Module Build

While writing this text, I had to consider in-tree vs. out-of-tree kernel module building. The difference is that an in-tree module is built as a part of the kernel build, and an out-of-tree module is built against the kernel headers after the kernel has been built. The out-of-tree method allows compiling modules for pre-built kernels, assuming the headers are available. For example, if you’ve ever built a Hello World module for Ubuntu, you’ve most likely run apt-get install linux-headers-`uname -r`. This means you’ve pulled the development headers for building the out-of-tree module, and you’re not compiling an entire kernel.

Since Yocto builds the kernel anyway, it is quite easy to build the modules in-tree. This has a few advantages. For example, the module can easily be built into the kernel as a built-in feature. From the fuzzing point of view, obtaining coverage for in-tree modules is a lot easier. Getting the correct line numbers for crash reports is also a lot simpler because there’s no need to decode offsets with objdump. So I’d recommend building the modules in-tree for fuzzing.

The example driver has both a module recipe for the out-of-tree build (because that’s what I tried first) and a linux-yocto append for the in-tree build. It’s a slightly silly way of supporting both methods, but at least it works (until it doesn’t). To add the IOCTL example module to the in-tree kernel build, add this line somewhere in your configuration:

IOCTL_STRING_PARSE_INTREE = "1"

Describing New Syscalls in Syzkaller

This is where the magic happens: describing our own syscalls to Syzkaller so that it knows about the non-default syscalls and can create fuzzing sequences that use them. To achieve that, we need to write the syscall definitions in Syzkaller’s syntax, which is kind of simple, but still a bit frustrating to get right. Fortunately, there are plenty of examples. After staring at those, and reading this blog post, here’s what I came up with:

include <linux/fcntl.h>
include <linux/ioctl_string_parse.h>

resource fd_vuln_ioctl[fd]
openat$ioctl_string_parse(fd const[AT_FDCWD], file ptr[in, string["/dev/ioctl_example"]], flags const[0x2], mode const[0x0]) fd_vuln_ioctl
ioctl$IOCTL_STRING_PARSE_CMD(fd fd_vuln_ioctl, cmd const[IOCTL_CMD_PARSE_STRING], arg ptr[in, string])

A file with this content should be added to <PATH_TO_SYZKALLER>/sys/linux. The file name can be arbitrarily chosen, but it should have the .txt suffix.

The includes here are from the kernel source tree. fcntl.h is for the AT_FDCWD macro, and our own ioctl_string_parse.h is for the IOCTL_CMD_PARSE_STRING that contains the IOCTL command to use when making IOCTL calls to our driver. resource defines the file descriptor resource that is shared between the two other calls. openat opens the ioctl_example device in read/write mode with no special mode flags. This call should be quite static as the goal isn’t to fuzz the openat command, so most values are constants.

The final definition is the IOCTL command on the opened device, using the IOCTL_CMD_PARSE_STRING defined in the header as the command, and a random string as an argument for the IOCTL call.

Once the definitions are done, it’s time to compile them into Syzkaller. We first need to compile a tool called syz-extract, then extract the constants used by our .txt file into a .const file, update the generated code, and recompile the binaries. The four commands below do exactly that (for 64-bit Linux systems):

./tools/syz-env make bin/syz-extract
./bin/syz-extract -os linux -arch amd64 -sourcedir /poky/linked-linux-src/usr/src/kernel -builddir /poky/build/tmp/work/qemux86_64-poky-linux/linux-yocto/6.6.50+git/linux-qemux86_64-standard-build <CUSTOM_DEFINITIONS>.txt
./tools/syz-env make generate
./tools/syz-env make

Note how you need to pass the build and source directories when extracting the .const file that will be used in the code generation. Again, update paths and file names if necessary.
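
Because these four steps get repeated every time the description file changes, they can be wrapped in a small script that stops at the first failing step. The sketch below does that with the same commands and flags as above. The function names run and syz_rebuild and the DRY_RUN variable are made up for this example, and it assumes it is run from the root of the Syzkaller repo.

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN=1 the command is printed instead of
# executed, which is useful for checking the paths first.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: kernel source dir, kernel build dir, description .txt file.
# The && chain stops at the first step that fails.
syz_rebuild() {
    src_dir="$1"; build_dir="$2"; desc_file="$3"
    run ./tools/syz-env make bin/syz-extract &&
    run ./bin/syz-extract -os linux -arch amd64 \
        -sourcedir "$src_dir" -builddir "$build_dir" "$desc_file" &&
    run ./tools/syz-env make generate &&
    run ./tools/syz-env make
}
```

Running DRY_RUN=1 syz_rebuild <SOURCE_DIR> <BUILD_DIR> <CUSTOM_DEFINITIONS>.txt first shows the exact commands before anything is built.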

After that, we should be almost good to go. The configuration file still needs some tweaking. We do want to focus only on our custom syscalls, and we do not want to mutate the openat command. The following configuration should work:

{
	"target": "linux/amd64",
	"http": "<YOUR-IP-HERE>:56741",
	"workdir": "./workdir",
	"kernel_obj": "/poky/build/tmp/work/qemux86_64-poky-linux/linux-yocto/6.6.50+git/linux-qemux86_64-standard-build/",
	"kernel_src": "/poky/build/tmp/work-shared/qemux86-64/kernel-source",
	"image": "/poky/build/tmp/deploy/images/qemux86-64/core-image-base-qemux86-64.rootfs.ext4",
	"enable_syscalls": ["openat$ioctl_string_parse", "ioctl$IOCTL_STRING_PARSE_CMD"],
	"no_mutate_syscalls": ["openat$ioctl_string_parse"],
	"syzkaller": ".",
	"procs": 8,
	"type": "qemu",
	"vm": {
		"count": 2,
		"cpu": 2,
		"mem": 2048,
		"kernel": "/poky/build/tmp/deploy/images/qemux86-64/bzImage",
		"cmdline": "ip=dhcp"
	}
}

After that, the fuzzer can be started with the same command as before:

./bin/syz-manager -config <CONFIG_FILE>

Now, after waiting a few minutes, there should be some crashes visible:

Sometimes the report may get corrupted. For such crashes C repro code cannot be generated, but the logs may still yield some useful information.

Interestingly enough, if we are using the maximum debug kernel configuration, these get reported as “potential deadlocks”. I guess enabling 40+ debug flags has some side effects. (As a side note, after enabling all the configuration items, the basic runqemu machine boot time slows down from 10 seconds to 5 minutes.) Regardless of which configuration we’re using, if we check out the report we can see the expected root cause for the crash:

It’s quite fascinating to watch the coverage information and see how the fuzzer approaches the problematic line when it attempts to increase the coverage. Fascinating in the same sense it’s exciting to watch paint dry:

The number on the left shows how many items in the corpus reach the line. It can be seen that over time new items can get further in the while-loop. In the perfect example two programs would reach line 208 and not just one, but it’s difficult to get perfection with randomness.

Extra Bonus Issue

Can you see what the issue is with this code?

input_buffer = kmalloc(input_len, GFP_KERNEL);
switch (cmd) {
    case IOCTL_CMD_PARSE_STRING:
        ret = copy_from_user(input_buffer, (char *)arg, input_len);
        if (ret != 0) {
            printk(KERN_ALERT "Failed to copy string from user space\n");
            kfree(input_buffer);
            return -EFAULT;
        }

        // Some processing happens here

        kfree(input_buffer);
        break;

    default:
        printk(KERN_ALERT "Invalid IOCTL command\n");
        kfree(input_buffer);
        return -EINVAL;
}

I didn’t, and neither did ChatGPT when it suggested this for the first version of the driver. However, when Syzkaller calls this kind of IOCTL function many times in rapid succession, it results in some unexpected invalid frees. I guess constantly performing allocations and frees in an IOCTL command isn’t the best idea. The second implementation of the driver uses a memory pool to avoid having to allocate memory after initialization, and that seems to work. But yeah, another point for fuzzing for finding a bug in seemingly functional code.

But Wait, There’s More!

In addition to this, I also created a test program with ChatGPT for debugging purposes. Later I realized that this could also be used for black-box fuzzing the example kernel module. So, if you want to, you can run the following script to fuzz the module with Radamsa (assuming you have installed Radamsa, check the black-box fuzzing blog text for more info):

while true; do 
    test-program-ioctl "$(echo ,,, | radamsa)";
done

Mandatory Final Chapter

That should cover everything for now. With these instructions, you can hopefully fuzz your kernel module with Syzkaller. However, this only performs fuzzing on virtual QEMU targets. Sometimes it’d be better to fuzz on the actual hardware, especially when specialized hardware is involved. I’ll cover that in a follow-up text. If you want to get notified when that text goes out, consider joining my mailing list. Thanks for reading, and happy bug-hunting.
