Out-of-bounds read in cmark-gfm [CVE-2023-22485]

We discussed this vulnerability during Episode 186 on 07 February 2023

Out of bounds read in cmark-gfm due to a lack of bounds check in validate_protocol.

You’ve got the validate_protocol function, which is intended to confirm that the text immediately before a given point in the markdown string is the expected protocol. You’d pass in the expected protocol, the data, and an offset (called rewind) that says where the comparison starts. It iterates backwards from that offset, comparing each character of the protocol to the corresponding character in data. If the protocol string is longer than the number of characters in the data buffer before the : character that triggered the check, the code simply continues comparing characters out of bounds, before the start of the buffer.
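To make the missing check concrete, here is a minimal sketch of a backwards protocol comparison with the bounds check in place. It is not the cmark-gfm source: the name protocol_precedes, its signature, and the use of an absolute index pos for the : character (rather than a pointer plus rewind) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not cmark-gfm's validate_protocol.
 * Returns true if the bytes immediately before buf[pos] spell out
 * `protocol`. `pos` is assumed to be the index of the ':' character. */
static bool protocol_precedes(const char *protocol,
                              const unsigned char *buf, size_t pos) {
    assert(protocol != NULL && buf != NULL);

    size_t len = strlen(protocol);

    /* The bounds check: only `pos` bytes exist before the ':'.
     * Without this line, the loop below would index before buf[0]
     * whenever the protocol is longer than the available prefix. */
    if (len > pos) {
        return false;
    }

    /* Walk backwards from the ':' and from the end of the protocol. */
    for (size_t i = 1; i <= len; i++) {
        if (buf[pos - i] != (unsigned char)protocol[len - i]) {
            return false;
        }
    }
    return true;
}
```

With the input "to:a@b" and the protocol "mailto", only two bytes precede the : but six would be compared, so the early return is what stops the loop from walking off the front of the buffer.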

This gives an out-of-bounds read, although exploitation is fairly unlikely. It reads the heap metadata stored before the buffer, but because the function simply returns false unless that byte happens to match the expected character, it’s not leaking much useful information or doing anything that could be reasonably exploited.

It’s still a weird bug and some non-intuitive code.