Exploiting null-dereferences in the Linux kernel

We discussed this vulnerability during Episode 182 on 24 January 2023

The last time we covered a “how to exploit a null-deref in the modern era” post we were… disappointed (and potentially attacked by North Korea, but that’s another story). This one is legit. Rather than treating the null-deref as the core memory corruption, though, it abuses how the kernel handles the null-dereference: the resulting kernel oops has side effects that can be used to overflow a reference count.

Effectively, the insight here is that a kernel oops is the kernel’s way of attempting to recover from an issue without crashing. When an oops happens, the task that caused it is killed off, and important cleanup code that would usually run is never executed. For example, locks aren’t unlocked, reference counts that were taken aren’t returned, and memory can remain allocated.

So, this exploit took a null-dereference in show_smaps_rollup (triggered by reading /proc/[pid]/smaps_rollup on a task with no virtual memory areas). The corresponding kernel oops would potentially fail to return three reference counts. The one of interest, mm_struct’s mm_users reference count, would be leaked and, as it was a 32-bit counter, could reasonably be overflowed. A reference count overflow can then be used to trigger a use-after-free.
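To make the overflow-to-use-after-free step concrete, here is a toy user-space model in Python. It is only a sketch: ToyRefcount, get, put and simulate are made-up names for illustration, not kernel code, and the real exploit has more moving parts. It assumes a counter that wraps modulo 2^32 and an object that is freed when a put drops the count to zero.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit counter that wraps around


class ToyRefcount:
    """Hypothetical stand-in for an object guarded by a 32-bit refcount."""

    def __init__(self):
        self.count = 1      # one legitimate holder
        self.freed = False

    def get(self, n=1):
        # Take n references; the counter silently wraps modulo 2^32.
        self.count = (self.count + n) & MASK32

    def put(self):
        # Drop one reference; the object is freed when the count hits zero.
        self.count = (self.count - 1) & MASK32
        if self.count == 0:
            self.freed = True


def simulate():
    obj = ToyRefcount()
    # Each oops leaks one reference: a get with no matching put.
    # 2^32 - 1 leaks on top of the initial count of 1 wrap the counter to 0.
    obj.get(2**32 - 1)
    wrapped = obj.count
    # An ordinary, balanced get/put pair now drops the count to zero and
    # frees the object while the original holder still has a pointer to it.
    obj.get()
    obj.put()
    return wrapped, obj.freed


print(simulate())
```

The leaks are applied in one arithmetic step rather than in a loop of 2^32 iterations, which is fine for the model because only the final counter value matters.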

The next step was simply to trigger the kernel oops 2^32 times to overflow the reference count, without leaking too much memory each time. You might be wondering how long that takes.

On server setups that print kernel logging to a serial console, generating 2^32 kernel oopses takes over 2 years. However, on a vanilla Kali Linux box using a graphical interface, a demonstrative proof-of-concept takes only about 8 days to complete!
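As a back-of-the-envelope check on those figures, the sketch below works out what they imply per oops. It assumes the oopses happen at a uniform rate and a 365-day year; the helper name seconds_per_oops is ours, and since the serial-console figure is "over 2 years", that result is only a lower bound.

```python
OOPSES_NEEDED = 2**32          # oopses needed to wrap a 32-bit counter
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_oops(total_days):
    """Average time per oops if 2^32 oopses take total_days in total."""
    return total_days * SECONDS_PER_DAY / OOPSES_NEEDED


for label, days in [("graphical Kali box, ~8 days", 8),
                    ("serial console, 2+ years", 2 * 365)]:
    per_oops = seconds_per_oops(days)
    print(f"{label}: {per_oops * 1e3:.3f} ms per oops, "
          f"about {1 / per_oops:.0f} oopses per second")
```

That comes out to roughly 0.16 ms per oops (around 6,200 per second) for the 8-day case, and at least about 14.7 ms per oops (under 70 per second) when every oops is written out to a serial console.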