126 - Dirty Pipe and Analyzing Memory Tagging
Background
The root cause of Dirty Pipe was a Linux kernel bug introduced in the pipe subsystem by a 2016 commit. Later changes to the kernel turned this bug into a critical security issue. Some background is needed on how pipes, particularly anonymous pipes, work on the kernel side. Pipes are implemented using a ring buffer of pipe_buffer objects. Each object holds flags along with other fields, such as a reference to the memory page backing the stored data. Normally, when you write to a pipe, a page is allocated and the data is written there. If you then write to the pipe again, the kernel will, for performance reasons, try to append the data to the existing page if there’s room before allocating a new one. However, a pipe_buffer can also reference a page it doesn’t actually own, which is what happens with the splice() system call.
The splice() system call creates a pipe_buffer entry that references a page in the page cache for the file being spliced in. Because the pipe doesn’t own this page, it cannot allow the user to append data to it. The kernel needs to account for this and track which buffers are “mergeable” and which aren’t. Until a commit in 2020, this trait was tracked with a field called can_merge. In 2020, it was changed to use the pipe_buffer’s flags field, and the PIPE_BUF_FLAG_CAN_MERGE flag was introduced.
The bug
A commit in 2016 added two functions to the pipe subsystem, push_pipe() and copy_page_to_iter_pipe(). These functions are used by the splice() system call to allocate pipe_buffer entries for the backing file data. The problem is that they never initialized the flags field. At the time, this was not a security issue: the stale flags could be read uninitialized, but they weren’t used in any critical context. When the 2020 commit landed and made use of that flags field, it became possible for splice()-allocated pipe buffers to inherit the “mergeable” trait from whatever had previously occupied the same ring slot. An attacker can intentionally poison the ring so that PIPE_BUF_FLAG_CAN_MERGE is set on pipe buffers used by splice(), which then allows them to write data into a page cache page that the pipe doesn’t own.
This makes it possible to overwrite file data in the page cache even when the file is opened read-only, giving a privilege escalation primitive.
The vulnerability here is a straightforward case of reading a size from the attacker and using it in a memcpy into a fixed-size destination buffer on the stack.
What was a little more interesting in this case was the exploitation strategy. While it is nothing groundbreaking, the only primitive we’ve normally covered from a stack-based overflow is hijacking the stored return address between stack frames. This exploit used the overflow to gain a couple of other primitives first, by corrupting the locals on the stack.
- First, they were able to brute force a client_sock value. When the file descriptor used here was invalid the function just returned, so it was simple to try a value and keep incrementing until it worked.
- With client_sock known, the prefix_ptr and prefix_size values provided an arbitrary read primitive, as prefix_size bytes would be read from prefix_ptr and sent out over the socket, after which prefix_ptr was freed. The free was an important constraint on where prefix_ptr could point, but otherwise it was arbitrary. They pointed it at the Global Offset Table (which for some reason worked despite the free call) to leak libc function pointers. With that they could calculate the address of system.
- The next step, with system leaked, was to get data they control at a consistent location. This turned out to be fairly easy, just requiring an allocation large enough to get mmap’d, which could be triggered simply by sending a large enough HTTP request in the first place.
- And finally they had all the pieces necessary to take a more traditional route with a ret2libc-style attack.
While nothing groundbreaking, this is worth highlighting, especially for those new to the field: I often see stack-based overflows treated as just meaning “overwrite ret”, when they really can be much more powerful than a control flow hijack alone.