Implementing a Linux Syscall
Sometimes, issues can only be fixed in a layer below the abstractions you usually rely on. Understanding as much of the stack as possible is a great advantage when a bug pops up. For this reason, I decided to implement a simple Linux syscall to become familiar with the Linux kernel’s code organization.
That said, I don’t aspire to become a kernel hacker. The new syscall, printk, simply writes a given null-terminated message to the kernel log. All code changes referenced in this post can be found in this Pull Request.
Clone the kernel code
Note that the --branch option of git clone accepts tags, too.
Preparing the development environment
We will test our custom kernel in a VM to protect our host machine’s stability. In my case, the host runs Ubuntu 20.04. The modified kernel will be tested in a VM running Ubuntu 18.04.
Install the packages needed to compile the kernel. More information about the packages required for compilation can be found on kernel.org.
Install virtualization software for the VM:
Fetch an Ubuntu cloud image, a compact, pre-installed disk image. With it, we don’t have to run the OS installer:
Create the VM:
Install ccache to greatly reduce build times after the first build. Iterating on changes and playing around with the code is a lot faster with ccache.
You can check the number of cache misses and hits with “ccache -s”.
Implementing the syscall
This is the code of the syscall itself (kernel/sys_printk.c):
The syscall is defined with the SYSCALL_DEFINE1 macro because it takes one argument. The macro’s first argument is the syscall name. The second and third macro arguments describe the type and name of the syscall’s first argument.
Dereferencing a user space pointer directly in the kernel is unsafe: the pointer may be invalid, or it may point to memory the caller is not allowed to access. Therefore, we copy the message passed as an argument to the syscall into kernel space with “strncpy_from_user”.
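To make the copy step more concrete, here is a small user space model of the return convention of “strncpy_from_user”: it returns the string length (without the terminator) on success, and the given count if no terminator was found within count bytes. This is only a sketch: bounded_copy, copy_message and MSG_MAX are hypothetical names, the buffer size is an assumption, and the real kernel function additionally returns -EFAULT when the user pointer cannot be read, which this model cannot reproduce.

```c
#include <stddef.h>

#define MSG_MAX 256 /* assumed buffer size, not taken from the post */

/*
 * User space stand-in (hypothetical) that mirrors the return convention
 * of the kernel's strncpy_from_user():
 *   - string length without the terminator, if one is found in time
 *   - count, if no terminator was found within count bytes
 */
long bounded_copy(char *dst, const char *src, long count)
{
    long i;

    for (i = 0; i < count; i++) {
        dst[i] = src[i];
        if (src[i] == '\0')
            return i; /* terminator copied, i is the string length */
    }
    return count; /* buffer exhausted, dst is NOT NUL-terminated */
}

/* Returns 0 if msg fits into buf including its terminator, -1 otherwise. */
int copy_message(char buf[MSG_MAX], const char *msg)
{
    long len = bounded_copy(buf, msg, MSG_MAX);

    if (len == MSG_MAX)
        return -1; /* message too long: reject instead of printing garbage */
    return 0;
}
```

The point of the model is the check on the return value: a result equal to the buffer size means the copy was cut off and the buffer has no terminator, so it must not be treated as a string.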
Next, we hook the new file into the build system in kernel/Makefile:
Extend the system call table
Add an entry for the printk syscall to arch/x86/entry/syscalls/syscall_64.tbl:
Extend the syscall header file
Add the following line to the syscall header file include/linux/syscalls.h:
Build the kernel
Create the kernel configuration with the following command. It lets you pick drivers and algorithms and customize the kernel with a huge number of options.
Next, we build the kernel and create a Debian package containing it, which we can easily install in the VM.
The -j or --jobs option specifies how many makefile recipes can be executed in parallel. A sensible value reduces the compile time a lot. I set the number of jobs to $(nproc), the number of CPU cores in my laptop (including hyper-threads). In my case, that is 12.
The CC variable stands for “C compiler”. We use ccache because it caches compilation artifacts, which speeds up subsequent builds.
Installing the kernel in the VM
Move the Debian packages into the VM:
Install the Debian packages in the VM:
Reboot the VM:
Check that the new kernel is running with:
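One way to do this check from a program is the uname(2) call, which reports the same release string that “uname -r” prints. A minimal sketch, where kernel_release is a hypothetical helper name:

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Writes the release string of the running kernel into buf.
 * Returns 0 on success and -1 if uname() fails.
 */
int kernel_release(char *buf, size_t len)
{
    struct utsname info;

    if (uname(&info) != 0)
        return -1;
    snprintf(buf, len, "%s", info.release);
    return 0;
}
```

Comparing the reported release with the version of the installed package shows whether the VM actually booted into the custom kernel.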
Testing the syscall
For testing purposes, we will create a small test program that uses our new syscall. There is no libc wrapper function for the new syscall printk. To call it, we use the “syscall” function. Its arguments are the syscall number specified in the syscall table and the arguments for the syscall.
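A call through “syscall” could look roughly like the sketch below. The number 440 is the one mentioned in this post and only holds for this kernel tree; newer mainline kernels assign that number to a different syscall, so treat it as an assumption. SYS_PRINTK_NR and call_printk are hypothetical names.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Syscall number from this post's syscall table entry (assumption). */
#define SYS_PRINTK_NR 440

/*
 * Invokes syscall number nr with a single string argument.
 * On failure, syscall() returns -1 and sets errno, for example to
 * ENOSYS when the running kernel has no syscall with that number.
 */
long call_printk(long nr, const char *msg)
{
    errno = 0;
    return syscall(nr, msg);
}
```

A test program would then call call_printk(SYS_PRINTK_NR, "hello from user space") and print errno if the result is -1.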
Compile the test program “syscall-printk-test” with “make”. We run the program on our development host to verify that it fails, since the host kernel doesn’t have the new syscall.
Let’s check the meaning of errno 38.
As expected, the test program fails because the host kernel doesn’t have a syscall with number 440.
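The same lookup can be done from C with strerror. On Linux, errno 38 is ENOSYS. The helper name format_errno below is hypothetical, and the exact message text depends on the C library:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Formats "errno <number>: <description>" into buf. */
void format_errno(int err, char *buf, size_t len)
{
    snprintf(buf, len, "errno %d: %s", err, strerror(err));
}
```

With glibc, format_errno(ENOSYS, buf, sizeof buf) yields “errno 38: Function not implemented”, which is exactly what a kernel without the new syscall table entry reports.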
Next, we run the test program in our VM with the custom kernel.
The message was written to the kernel log successfully.
Conclusion
As usual, taking a look under the hood of a technology demystifies it - there is no magic involved. Building and installing the Linux kernel was a lot simpler than I expected. Modifying the kernel was slightly trickier because of all the historical cruft. Surely, having a basic understanding of the kernel’s source code organization will come in handy in the future.