[Zoom] Remote Code Execution with XMPP Stanza Smuggling

We discussed this vulnerability as part of our weekly podcast on 31 May 2022.

This is a cool trick: it uses a UTF-8 parser differential between the client-side XML parsing library (Gloox) and the server side (fast_xml) to smuggle in characters that end an XML tag prematurely, which lets the attacker inject new XML content.

The core issue comes down to how the \xEB byte was treated. In UTF-8 this byte marks the start of a three-byte sequence (I assume you can probably do the same attack with the lead bytes for longer sequences too). The client would kinda just ignore the UTF-8 aspect of this; to it, \xEB is just another byte that ends up in the tag name. So it would scan the bytes after a < looking for a > byte. The server, however, had a better understanding of UTF-8: it would see the \xEB and skip over the next two bytes, treating them as part of that character's sequence. So the client would end the tag when it sees a >, even if it's immediately after the \xEB, while the server would see that > as part of the tag name (even though it's technically an invalid UTF-8 sequence, the server was somewhat permissive).
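To make the desync concrete, here is a rough Python sketch of the two scanning strategies. This is a toy model only: the function names and the payload are made up for illustration, and neither function is actual Gloox or fast_xml code. It just assumes a byte-wise scanner on one side and a permissive, lead-byte-driven scanner on the other.

```python
# Toy model of the parser differential. Hypothetical helpers, not real
# Gloox or fast_xml code.

def utf8_seq_len(lead: int) -> int:
    """Sequence length implied by a lead byte (permissive: no validation)."""
    if lead >= 0xF0:
        return 4
    if lead >= 0xE0:  # 0xEB falls here: start of a three-byte sequence
        return 3
    if lead >= 0xC0:
        return 2
    return 1  # ASCII or a stray continuation byte


def bytewise_tag_end(data: bytes, start: int) -> int:
    """Client-like: the first '>' byte ends the tag, UTF-8 is ignored."""
    return data.index(b">", start)


def utf8_aware_tag_end(data: bytes, start: int) -> int:
    """Server-like: step over whole sequences, never checking that the
    skipped bytes are valid continuation bytes."""
    i = start
    while i < len(data):
        if data[i] == ord(">"):
            return i
        i += utf8_seq_len(data[i])
    raise ValueError("no closing '>' found")


# '<', 'a', 0xEB, '>', 'b', '>'
payload = b"<a\xeb>b>"

client_end = bytewise_tag_end(payload, 1)    # stops at the first '>'
server_end = utf8_aware_tag_end(payload, 1)  # first '>' is swallowed by 0xEB

print("client tag name:", payload[1:client_end])
print("server tag name:", payload[1:server_end])
```

In this model the two sides disagree about where the tag ends, so anything placed after the first > is tag name to one parser and fresh XML to the other.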

So with this desync and the ability to smuggle in tags, the attacker could inject a <?xml ?> tag, which would restart the entire XML context and allow any of the special server commands to be sent, including one that would cause the client to disconnect from the server and reconnect to an attacker-defined server. From there, an update could be issued for code execution on the client.