Use-After-Free in Python 2.7+

We discussed this vulnerability during Episode 146 on 17 May 2022.

Taking an unexpected reference to a memoryview object results in a use-after-free when the parent of said object is destroyed. This is a rather low-impact bug because it requires control over the code being executed, in which case one could just write an os.system(...) call or something similar. The post is still a great writeup on the process of exploiting a use-after-free, and it has some notes about exploit stability, as the author targeted having their exploit run across all versions of Python 3.

By providing a custom File class that extends io.RawIOBase, a user can provide a malicious implementation of the readinto method. This is the method that a BufferedReader calls to actually read the underlying resource. It passes in a memoryview, which is the buffer into which the contents should be read. If the readinto method takes a reference to this buffer, for example by placing it into a global variable, then after the BufferedReader is destroyed any use of that global variable results in a use-after-free. The root issue just seems to be a missing reference count on the buffer; however, this bug is not slated to be fixed too soon. Exploiting it requires someone able to inject arbitrary Python, so in most cases you could simply use os.system(...) to gain code execution.
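As a rough illustration of the pattern described above, the sketch below shows a readinto method stashing the buffer it is handed. The class name LeakyFile and the global captured are hypothetical names chosen for this example. It deliberately stops short of the bug: the reader is kept alive and the captured view is never read or written.

```python
import io

captured = None  # hypothetical global that outlives the readinto call


class LeakyFile(io.RawIOBase):  # hypothetical name, not from the writeup
    def readable(self):
        return True

    def readinto(self, buf):
        global captured
        captured = buf  # keep the memoryview handed in by the BufferedReader
        return 0        # report EOF so read() returns immediately


reader = io.BufferedReader(LeakyFile())
data = reader.read(1)                    # triggers one readinto call
captured_type = type(captured).__name__  # inspect the type only, not the contents
```

On an affected interpreter, the dangerous step would be destroying reader and then touching captured; that step is intentionally left out here.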

Exploitation - Primitives

The first step of the exploitation was to reclaim the recently freed memory with a controlled object type. They did this by allocating a List of the same size as the buffer, which consistently reclaimed the improperly freed memory behind the memoryview. With this they have two ways of accessing the same memory: through the originally grabbed memoryview reference, which gives read/write access to all the bytes, and through the list, where Python thinks it is an array of PyObject pointers.

This List is what is used to craft the arbitrary read/write primitive.

First, to know where any Python object is in memory, the id() function returns its address (in CPython), so ASLR for those objects is a non-issue. Then, to get content they control directly in memory, they used a bytes object (like b"string"). The payload of the string sits 32 bytes from the address of the object itself.
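The relationship between id() and the bytes payload can be checked harmlessly with ctypes. This is a minimal sketch that assumes CPython, where id() is the object's address. The offset is derived from bytes.__basicsize__ rather than hard-coded, which works out to 32 on a typical 64-bit build.

```python
import ctypes

payload = b"controlled bytes"

addr = id(payload)                # CPython-specific: id() is the object's address
offset = bytes.__basicsize__ - 1  # header size, minus the one-byte trailing NUL slot

# Read the object's own memory back at header offset; this only reads, never writes.
raw = ctypes.string_at(addr + offset, len(payload))
```

Here raw holds the same bytes as payload, which shows that controlled data sits at a predictable distance from a known address.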

Using this they could craft any fake Python object, place the pointer to the bytes payload into the memoryview, and then access the object through the List object that reclaimed the memoryview’s memory. Accessing it through the list makes Python parse the crafted object from the bytes payload. The payload used for the arbitrary memory read/write primitive was a fake bytearray object. A bytearray contains a pointer to a backing buffer, which can be accessed like any other array. So by crafting a fake bytearray pointing at the desired memory location, one could read/write any memory.
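To make the idea of a crafted bytearray more concrete, the sketch below packs a header shaped like a bytearray object using struct. The field order (refcount, type pointer, size, allocation size, two buffer pointers, export count) is an assumption about 64-bit CPython 3 builds, and the function name and target address are hypothetical. The code only builds a bytes value; nothing is dereferenced.

```python
import struct

FAKE_LAYOUT = "<QQqqQQq"  # assumed 64-bit layout: seven 8-byte fields


def fake_bytearray_header(type_addr, target_addr, length):
    """Pack a bytearray-shaped header (hypothetical helper, layout assumed)."""
    return struct.pack(
        FAKE_LAYOUT,
        1,            # ob_refcnt
        type_addr,    # ob_type: address of the real bytearray type object
        length,       # ob_size: number of bytes exposed
        length,       # ob_alloc
        target_addr,  # ob_bytes: backing buffer pointer
        target_addr,  # ob_start: logical start of the data
        0,            # ob_exports
    )


# 0x1000 is a placeholder target address used purely for illustration.
header = fake_bytearray_header(id(bytearray), 0x1000, 8)
fields = struct.unpack(FAKE_LAYOUT, header)
```

Only the two buffer pointers and the size change between reads, which is why a single fake object is enough for a general read/write primitive.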

Exploitation - Chain

Now with the arbitrary read/write the rest was fairly straightforward, but they did take some extra steps for stability. To start, they walked memory looking for the ELF header to calculate the CPython base address. They then parsed the headers to find the PLT entries and resolve the address of system from libc. Finally, they faked a PyObject with a fake type object, which gives them control over various object method/function pointers, replaced one of those pointers with the address of system, and called it for a shell.
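The memory walk for the ELF header can be sketched as a page-by-page backward scan. This is not the author's code. The read_mem callback stands in for the arbitrary-read primitive and is simulated here with a bytearray, and the names find_elf_base, BASE and image are hypothetical.

```python
PAGE = 0x1000
ELF_MAGIC = b"\x7fELF"


def find_elf_base(read_mem, start_addr, max_pages=4096):
    """Scan backwards one page at a time until the ELF magic is found."""
    addr = start_addr & ~(PAGE - 1)  # align down to a page boundary
    for _ in range(max_pages):
        if read_mem(addr, 4) == ELF_MAGIC:
            return addr
        addr -= PAGE
    return None


# Simulated address space: a three-page "image" mapped at a made-up base.
BASE = 0x400000
image = bytearray(3 * PAGE)
image[:4] = ELF_MAGIC


def read_mem(addr, n):
    """Stand-in for the read primitive; returns b"" outside the fake image."""
    off = addr - BASE
    if off < 0 or off + n > len(image):
        return b""
    return bytes(image[off:off + n])


found = find_elf_base(read_mem, BASE + 2 * PAGE + 0x123)
```

Starting from any known pointer inside the binary, the scan lands on the image base, from which the header parsing described above can proceed.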

Great writeup; even if the impact isn’t huge, it was a good read.