Show Notes

80 - Escaping the Bhyve, WhatsApp & BrakTooth

The title pretty accurately describes this issue: there is little to no security implemented in Honda and Acura keys and remotes. An attacker can simply capture a signal sent by the remote and replay it to the vehicle at a later time. This includes lock/unlock commands, opening the trunk or windows, or even starting the vehicle, depending on the abilities of the remote.

As far as attacks go, the author admits this is not unique. It is, however, rather surprising. This is not some obscure attack; vehicle manufacturers have been using rolling codes for precisely this reason. Heck, even many garage door openers use a rolling code system to prevent this sort of simple replay attack.
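
A minimal sketch of why a rolling code defeats a simple replay, assuming an HMAC-over-counter scheme. Real key fobs use vendor-specific algorithms; the class names, the shared key, and the look-ahead window below are all hypothetical and only illustrate the idea.

```python
import hashlib
import hmac


def make_code(key: bytes, counter: int) -> bytes:
    """Derive a one-time code from a shared key and a counter (illustrative scheme)."""
    return hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()


class FixedCodeReceiver:
    """Models the vulnerable behaviour: the same code stays valid forever."""

    def __init__(self, code: bytes):
        self.code = code

    def accept(self, code: bytes) -> bool:
        return hmac.compare_digest(code, self.code)


class RollingCodeReceiver:
    """Accepts each counter value at most once, within a small look-ahead window."""

    def __init__(self, key: bytes, window: int = 16):
        self.key = key
        self.next_counter = 0
        self.window = window

    def accept(self, code: bytes) -> bool:
        for c in range(self.next_counter, self.next_counter + self.window):
            if hmac.compare_digest(code, make_code(self.key, c)):
                self.next_counter = c + 1  # every older code is now invalid
                return True
        return False


key = b"hypothetical-shared-secret"
captured = make_code(key, 0)  # what an attacker records off the air

fixed = FixedCodeReceiver(captured)
rolling = RollingCodeReceiver(key)

# The legitimate press followed by the attacker's replay of the same bytes.
fixed_results = [fixed.accept(captured), fixed.accept(captured)]
rolling_results = [rolling.accept(captured), rolling.accept(captured)]
```

With a fixed code both transmissions are accepted; with the rolling scheme the replayed copy is rejected because the receiver has already advanced its counter.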

This post covers a heap overflow in the InnoDB memcached plugin for MySQL. The “get” command implementation first tokenizes the key-value pairs, then fetches them. If one of the keys specified in the “get” command is of the format “@@containers.name”, the table name gets copied into the row buffer at the buffer's current cursor via memcpy(). While there is an assert for bounds checking, asserts are only compiled into debug builds, meaning production builds effectively have no bounds checking. This allows an out-of-bounds write past the end of the row buffer.

Patch: The assert was removed and proper bounds checking code was added above the memcpy(). If the table name length added to the cursor exceeds REC_BUF_SLOT_SIZE, it will limit the record size to 16MiB, and the cursor will get reset to prevent overflow.
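
The difference between an assert and a real bounds check can be sketched as follows. This is not the MySQL code: the function name is hypothetical, the REC_BUF_SLOT_SIZE value is a small stand-in chosen for illustration, and the sketch rejects the copy by raising an exception rather than resetting the cursor as the patch is described as doing. Python happens to have the same pitfall, since `assert` statements are removed when running with `python -O`.

```python
REC_BUF_SLOT_SIZE = 32  # hypothetical small value, for illustration only


def append_name(row_buf: bytearray, cursor: int, name: bytes) -> int:
    """Copy `name` into `row_buf` at `cursor` and return the new cursor.

    The check is an ordinary `if`, so it runs in every build. An `assert`
    in its place would vanish in optimized runs, much like a C assert
    compiled with NDEBUG defined.
    """
    if cursor + len(name) > REC_BUF_SLOT_SIZE:
        raise ValueError("table name does not fit in the row buffer")
    row_buf[cursor:cursor + len(name)] = name
    return cursor + len(name)


buf = bytearray(REC_BUF_SLOT_SIZE)
cursor = append_name(buf, 0, b"containers")  # 10 bytes, fits

try:
    append_name(buf, cursor, b"x" * 64)  # would run past the 32-byte slot
    overflow_rejected = False
except ValueError:
    overflow_rejected = True
```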

WhatsApp has the ability for users to apply filters on images. The way these filters work is they take a “source” image, apply transformations on the underlying pixel data, then save the new image. After fuzzing, the authors discovered a crash when switching between filters on crafted GIF files. After some root causing and reversing, they determined the vulnerability was ultimately in applyFilterIntoBuffer() from the WhatsApp library.

An out-of-bounds access can occur when iterating through the source image pixels to apply transformations on them. This is because the WhatsApp developers assumed the source and destination image both have the same pixel format (or “stride”) as well as the same dimensions, but these assumptions are never verified. By sending a malicious source image with a stride of 1 byte per pixel instead of the expected 4 bytes per pixel, the function will attempt to copy 4 times more data from the source image than it should.
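
The size mismatch can be worked through with a little arithmetic. The image dimensions below are hypothetical, and the sketch follows the notes in treating the per-pixel byte count as the quantity that differs between the crafted image and what the filter assumes.

```python
def bytes_accessed(width: int, height: int, bytes_per_pixel: int) -> int:
    """Bytes touched when walking every pixel of an image once."""
    return width * height * bytes_per_pixel


width, height = 100, 100  # hypothetical dimensions

actual_source = bytes_accessed(width, height, 1)   # crafted image: 1 byte per pixel
assumed_source = bytes_accessed(width, height, 4)  # what the filter assumes

# Bytes read beyond the end of the real source buffer.
overread = assumed_source - actual_source
```

For a 100x100 image the filter would walk 40,000 bytes of a 10,000-byte buffer, so 30,000 bytes are read out of bounds.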

This attack would be fairly complex to pull off against a victim, as you would have to send the victim a malicious source image, have them apply a filter to it, then get them to send it back to you. Nonetheless, WhatsApp took it seriously and responded well to the report.

Patch: The patch consisted of two steps. First, it now ensures the stride is the expected 4 bytes. Second, it validates the image size using the stride to ensure the image has exactly 4 bytes per pixel before parsing.

Bhyve is FreeBSD’s type-2 hypervisor. The author of this GitHub security advisory discovered 6 bugs in various drivers that can lead to a VM escape, and all of them are essentially the same issue in different places. Various drivers call vq_getchain() to fill an iovec object with memory ranges the guest had previously set up for virtio queues. This function can fail, though, if the guest never set up any virtio queues to use. Many functions that call vq_getchain() do not check its return value (or check it improperly), and end up using an uninitialized iovec object for writing.

The following functions all contain this bug:

  • pci_vtrnd_notify()
  • pci_vt9p_notify()
  • pci_vtcon_sock_rx()
  • pci_vtscsi_controlq_notify()
  • pci_vtscsi_requestq_notify()
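
The shared pattern in these functions can be sketched as below. This is a toy model, not bhyve code: `get_chain()` is a hypothetical stand-in for a vq_getchain()-style call, and a pre-filled Python list models the stale contents of an uninitialized iovec.

```python
def get_chain(queue: list, iov: list) -> int:
    """Stand-in for a vq_getchain()-style call: fills `iov`, returns a count or -1."""
    if not queue:
        return -1  # failure: `iov` is left untouched
    iov[:] = queue.pop(0)
    return len(iov)


def notify_unchecked(queue: list, iov: list) -> list:
    get_chain(queue, iov)  # return value ignored
    return list(iov)       # uses whatever `iov` happened to contain


def notify_checked(queue: list, iov: list):
    n = get_chain(queue, iov)
    if n <= 0:
        return None        # bail out instead of touching `iov`
    return list(iov[:n])


stale = ["stale-pointer"]  # models leftover memory in an uninitialized iovec

# The guest never set up a queue, so the queue is empty in both calls.
unchecked_result = notify_unchecked([], list(stale))
checked_result = notify_checked([], list(stale))
```

The unchecked variant goes on to use the stale entry as if it were a guest-provided memory range, while the checked variant returns early.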

The pci_vtcon_notify_tx() function contains a very similar bug, where it tries to check the return value but does so incorrectly. It stores the return value as a uint16, which is problematic because the function returns a signed integer. When the function returns -1 on failure, the stored value is read as a large positive uint16 (65535) instead of the expected error value.
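
The effect of storing a signed error value in an unsigned 16-bit variable can be reproduced with ctypes. The exact comparison used in the driver is not shown in these notes, so the "less than zero" test below is only a hypothetical example of a check that can never fire.

```python
import ctypes

ret = -1                             # failure value returned by the callee
stored = ctypes.c_uint16(ret).value  # what a uint16 variable would hold

# A sign-based error check on the unsigned copy never triggers.
unsigned_check_catches_error = stored < 0

# Keeping the value in a signed integer preserves the error.
signed_check_catches_error = ctypes.c_int(ret).value < 0
```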

Synacktiv ended up investigating the Western Digital Pro PR4100 when looking at the target list for Pwn2Own Tokyo 2020. When looking at this device, they took particular interest in the webserver, and reversed the cgi-bin that implemented it. One of the functions they looked at was wd_login(), which takes a user-provided username and a base64-encoded password (which later gets compared against the /etc/shadow file). This function ended up containing 2 bugs which could be chained together to gain code execution.

Bug 1 - Base64 decoding across two buffers in wd_login(): What was strange was the fact that wd_login() would take a 256 byte base64 password, which allows up to 192 bytes of decoded password. This was a red flag, because the buffer for the decoded password was only 64 bytes on the stack. This in itself was a bug, because it led to the decode routine writing into the next adjacent buffer, which was the base64 password buffer itself. This bug was not immediately useful, but was helpful for exploiting another, more critical bug.
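
The size mismatch follows directly from the base64 ratio of 4 encoded characters to 3 decoded bytes. A short check, using the buffer sizes quoted above (the variable names are illustrative):

```python
import base64

ENCODED_MAX = 256  # base64 characters accepted, per the write-up
DECODED_BUF = 64   # size of the destination buffer, per the write-up

decoded_max = ENCODED_MAX // 4 * 3  # 4 base64 characters decode to 3 bytes
spill = decoded_max - DECODED_BUF   # bytes written past the 64-byte buffer

# Cross-check with a real round trip of a maximum-length input.
payload = base64.b64encode(b"A" * decoded_max)
decoded = base64.b64decode(payload)
```

A maximum-length password therefore decodes to 192 bytes, 128 more than the destination buffer can hold.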

Bug 2 - Stack overflow in do_auth_with_shadow(): When copying the decoded password into a local 120 byte stack buffer, do_auth_with_shadow() would use strcpy(), and did no bounds checking on the length of the string. Because the first bug allowed the decoded string to flow across two buffers, it allowed for more than 120 contiguous bytes without a null terminator, leading to a stack overflow. Because there are no stack canaries built into this binary, this easily allows arbitrary code execution by hijacking the return address.
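
Why the two bugs chain can be sketched by modelling what strcpy() sees. The memory layout below is a simplification and an assumption: a 64-byte decoded buffer directly followed by the 256-byte base64 buffer, both holding non-null data, with a single terminator at the end. `c_strlen()` is a hypothetical helper that mimics C's strlen().

```python
DEST_SIZE = 120  # local buffer size, per the write-up


def c_strlen(memory: bytes, start: int) -> int:
    """Number of bytes before the first NUL at or after `start`, like C strlen()."""
    return memory.index(0, start) - start


# Simplified, assumed layout: two adjacent buffers with no NUL in between.
decoded_buf = b"\x41" * 64
base64_buf = b"\x42" * 256
memory = decoded_buf + base64_buf + b"\x00"

copied = c_strlen(memory, 0)  # bytes strcpy() would copy, excluding the NUL
overflow_by = copied - DEST_SIZE

# For contrast: a decoded password that is terminated inside its own buffer.
well_formed = c_strlen(b"\x41" * 63 + b"\x00", 0)
```

Without the first bug the string could never exceed 63 bytes, which fits in the 120-byte destination; with it, strcpy() keeps copying straight through the adjacent buffer.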

Exploitation: While triggering code execution was fairly straightforward, actually abusing it turned out to be tricky. While the login_mgr binary was not position independent and sat at addresses that fit in 32 bits, libraries like libc are position independent and mapped at full 64-bit addresses. They were also restricted by not being able to place any null bytes in the first 120 bytes of the payload, since that would fail to trigger the bug. The upper 32 bits of the RIP overwrite also needed to be null for the address to be valid. They needed a one-shot exploit using gadgets found in the login_mgr binary. Ultimately it came down to abusing functions that called system(), and getting control over the RDI register to pass arbitrary commands to it.

They discovered a range of bytes on the stack that could be completely user controlled, from $rsp+0x0110 to $rsp+0x0150. This range is after the return address overwrite, so the no-null-byte constraint did not apply. It could also be completely controlled by the attacker, including null bytes, because the data was written there via base64 decode. They used two stack pivots to exploit the vulnerability. The first one shifted the stack pointer to point into the completely controlled range. The second pivot was used to call the system() gadget.

WD’s Alleged Bad Faith: Synacktiv briefly notes at the end that this vulnerability was patched 2 days before the end of registration for Pwn2Own, hinting that they think Western Digital intentionally patched it so close to the deadline. They further note that Western Digital’s bad faith toward security researchers is “notorious”, and that some researchers don’t bother reporting to them anymore for that reason. Finally, they make some interesting observations about how similar the code in WD NAS devices and D-Link NAS devices is, stating that someone else found a very similar issue in a D-Link product. They suspect this is because D-Link offloaded their firmware code to Western Digital after they wound down their NAS product division.

This paper talks about a hybrid approach for fuzzing hypervisors or virtual CPUs, which they call “HyperFuzzer”. It combines dynamic symbolic execution with coverage-guided fuzzing to fuzz virtualization platforms like Hyper-V. The observation on which they base HyperFuzzer is that “A Virtual CPU’s execution is determined by the VM’s state, not by the hypervisor’s internal state”. To build on that observation, they use a full VM state as the fuzzing input for virtual CPUs. They mutate both the instructions going to the virtual CPU and the architectural state.

The key novelty that HyperFuzzer brings to the table is what they call “Nimble Symbolic Execution”, which uses Intel Processor Trace (PT) to record execution traces for use in symbolic execution. Their setup consists of greybox fuzzing via AFL and a custom-rolled whitebox fuzzing solution with SMT solving, which they integrate into AFL’s fuzz loop. Figure 6 on page 6 shows a pseudo-code snippet of this in action.
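
The general shape of a hybrid fuzz loop can be illustrated with a toy example. This is not HyperFuzzer's algorithm and uses neither Intel PT nor an SMT solver: the target, the `solve()` helper, and the way a failed comparison is reported as a "constraint" are all hypothetical stand-ins chosen to keep the sketch self-contained.

```python
import random

MAGIC = bytes([0xDE, 0xAD, 0xBE, 0xEF])


def target(data: bytes):
    """Toy target returning (coverage, failed_constraint).

    coverage is the number of nested checks passed. failed_constraint is
    (index, wanted_byte) for the first failed comparison, standing in for
    a path constraint recovered from an execution trace.
    """
    for i, wanted in enumerate(MAGIC):
        if i >= len(data) or data[i] != wanted:
            return i, (i, wanted)
    return len(MAGIC), None


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Greybox step: overwrite one random byte with a random value."""
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)


def solve(data: bytes, constraint) -> bytes:
    """Whitebox step: satisfy one equality constraint (stand-in for a solver)."""
    index, wanted = constraint
    out = bytearray(data)
    out[index] = wanted
    return bytes(out)


def fuzz(seed: bytes, rounds: int, hybrid: bool, rng: random.Random) -> int:
    best, best_cov = seed, target(seed)[0]
    for _ in range(rounds):
        candidates = [mutate(best, rng)]
        constraint = target(best)[1]
        if hybrid and constraint is not None:
            candidates.append(solve(best, constraint))
        for cand in candidates:
            cov = target(cand)[0]
            if cov > best_cov:  # keep inputs that reach new coverage
                best, best_cov = cand, cov
    return best_cov


greybox_cov = fuzz(b"\x00" * 4, rounds=8, hybrid=False, rng=random.Random(0))
hybrid_cov = fuzz(b"\x00" * 4, rounds=8, hybrid=True, rng=random.Random(0))
```

In this toy, the hybrid run passes all four nested checks because each round solves the comparison that blocked the previous best input, whereas random mutation alone is unlikely to guess even the first magic byte in eight rounds.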

Evaluations: They found their fuzzing solution performed about 3x faster than existing hardware emulation solutions like Bochs. Further, they found 11 previously unknown bugs in Hyper-V, 6 of which were critical as they could be used for VM escapes. Their coverage results also showed that the hybrid solution generally reached much deeper coverage than a greybox or whitebox solution in isolation, coming out on top for hypercalls, APIC emulation, and MSR emulation. Task switching seems to be the only area where the hybrid solution wasn’t on top, coming in slightly below greybox fuzzing.

Limitations: The biggest limitation of this setup is that it relies on Intel PT, which makes it impossible to use for fuzzing AMD-based virtualization code. It also doesn’t support multiple virtual CPUs for whitebox fuzzing, which makes it more difficult to find race conditions. Supporting them would require a new technique for tracing and symbolic execution, which they note has been left for future work.