iTLB multihit

We discussed this vulnerability during Episode 222 on 07 November 2023.

iTLB Multihit exploits a low-level issue in which an instruction fetch can hit multiple entries in the instruction Translation Lookaside Buffer (iTLB), each for a different page size. It’s suspected that the bug is caused by electrical corruption when data pins from different TLBs on the die are driven onto the same line. If one of those pins carries parity bits, the corruption can trigger a Machine Check exception (#MC). The trigger that seemed to work best was changing a mapping from a smaller page size to a larger page size. Doing so reduces the number of TLB entries needed for a given range and adds a new entry, which increases the chance that a translation is present in multiple TLBs simultaneously.
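To make the failure condition concrete, here is a toy Python model of an instruction TLB that can hold entries of different page sizes. It is only a sketch: the `ToyITLB` and `TLBEntry` names, and the assumption that a stale 4KB entry simply lingers after the mapping is promoted to 2MB, are illustrative and not a description of how real hardware is organized.

```python
from dataclasses import dataclass, field

PAGE_4K = 4 * 1024
PAGE_2M = 2 * 1024 * 1024


@dataclass(frozen=True)
class TLBEntry:
    base: int  # virtual base address, aligned to `size`
    size: int  # page size in bytes

    def covers(self, vaddr: int) -> bool:
        return self.base <= vaddr < self.base + self.size


@dataclass
class ToyITLB:
    # Hypothetical structure: one flat list holding entries of every page size.
    entries: list = field(default_factory=list)

    def fill(self, vaddr: int, size: int) -> None:
        # Cache a translation for the `size`-byte page that contains vaddr.
        entry = TLBEntry(vaddr - vaddr % size, size)
        if entry not in self.entries:
            self.entries.append(entry)

    def lookup(self, vaddr: int) -> list:
        # Return every cached entry that matches; more than one is a multihit.
        return [e for e in self.entries if e.covers(vaddr)]


itlb = ToyITLB()
addr = 0x40_1000

itlb.fill(addr, PAGE_4K)  # code is first fetched through a 4KB mapping
# The page tables are now changed to map the same range with a 2MB page,
# and (by assumption) the stale 4KB entry is not flushed.
itlb.fill(addr, PAGE_2M)  # the next fetch caches the 2MB translation

hits = itlb.lookup(addr)
multihit = len(hits) > 1
print(f"{len(hits)} matching entries, multihit={multihit}")
```

In this model the lookup returns both the 4KB and the 2MB entry for the same address, which is the condition the real hardware handles badly.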

What makes this notable is that it can be triggered from a guest VM: guests control their own page table entries, and thus TLB entries, so an attacker-controlled guest can crash the host machine. Second Level Address Translation (SLAT) does not generally prevent the issue on its own, because second-level page tables typically use large page sizes (>4KB), and the TLB stores the smaller of the guest and host page sizes. SLAT can be used to mitigate the issue, though: the hypervisor sets the No eXecute (NX) bit on large pages and handles the resulting NX exceptions by splitting the large page into 4KB pages, which eliminates the ability to trigger a multihit by moving to a larger page size.
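The mitigation can be sketched with a toy second-level page table in Python. Everything here is a simplified assumption for illustration: `ToySLAT`, `effective_tlb_page_size`, and the choice to split the whole large page eagerly on the first fault are hypothetical and do not mirror any particular hypervisor's implementation.

```python
PAGE_4K = 4 * 1024
PAGE_2M = 2 * 1024 * 1024


def effective_tlb_page_size(guest_size: int, host_size: int) -> int:
    # The TLB caches the smaller of the guest and host page sizes.
    return min(guest_size, host_size)


class ToySLAT:
    """Hypothetical second-level mappings: {base: (size, executable)}."""

    def __init__(self):
        self.mappings = {}

    def map_large(self, base: int) -> None:
        # Large pages are installed non-executable, so instruction fetches
        # can never be served from a large second-level mapping.
        if base % PAGE_2M != 0:
            raise ValueError("large pages must be 2MB aligned")
        self.mappings[base] = (PAGE_2M, False)

    def find(self, addr: int):
        for base, (size, executable) in self.mappings.items():
            if base <= addr < base + size:
                return base, size, executable
        return None

    def handle_nx_fault(self, base: int, size: int) -> None:
        # Split the large page into executable 4KB pages (done eagerly here
        # for simplicity; this is an assumption of the sketch).
        del self.mappings[base]
        for small in range(base, base + size, PAGE_4K):
            self.mappings[small] = (PAGE_4K, True)

    def fetch(self, addr: int) -> int:
        """Simulate an instruction fetch and return the host page size used."""
        found = self.find(addr)
        if found is None:
            raise LookupError("address is not mapped")
        base, size, executable = found
        if not executable:
            self.handle_nx_fault(base, size)
            base, size, executable = self.find(addr)
        return size


slat = ToySLAT()
slat.map_large(0)
host_size = slat.fetch(0x1234)  # first fetch faults and splits the page
# Even if the guest promotes its own mapping to 2MB, the iTLB only ever
# sees a 4KB translation for executable memory.
tlb_size = effective_tlb_page_size(PAGE_2M, host_size)
print(f"host page: {host_size} bytes, iTLB entry: {tlb_size} bytes")
```

Because executable memory is only ever backed by 4KB second-level pages in this model, a guest switching its own mappings to a larger page size no longer changes the page size cached for instruction fetches.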