Very long post covering an old issue (2013), with tons of background on Java bytecode, App Engine, and the ASM library. For context, App Engine performed in-process sandboxing: it allowed users to write normal Java code, including code that used dangerous functionality like reflection and custom classloaders. These are dangerous because they can lead to fully arbitrary code execution. To keep that safe, the App Engine team used ASM to parse and rewrite the user's Java bytecode, injecting various security checks.
The vulnerability is about getting around those checks to call into dangerous functionality. The first part of the post covers a kinda cool attack on bytecode parsing that ultimately wasn't relevant for App Engine. The issue that was actually used was in how ASM processed strings. In Java class files, strings are stored as a two-byte length followed by that many bytes of data (not null-terminated). ASM, however, did not check the string length before writing it out. So if it thinks a string is 65536 (0x10000) bytes long, the length gets truncated to 16 bits and it writes out 0x0000. The JVM then sees a zero-length string and parses the string data that follows as class file structure (including new bytecode) rather than as string contents.
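To make the truncation concrete, here is a minimal Java sketch of a length-prefixed writer that skips the range check. The class and method names (`LengthPrefixDemo`, `writeUnchecked`, `readLengthPrefix`) are made up for illustration; this is not ASM's actual code, just the same kind of mistake under the assumption of a big-endian two-byte length field.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo class, not taken from ASM or the original post.
final class LengthPrefixDemo {

    // Writes a 2-byte big-endian length followed by the data.
    // Deliberately has no check that data.length fits in 16 bits.
    static byte[] writeUnchecked(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int len = data.length;
        out.write((len >>> 8) & 0xFF); // high byte of the low 16 bits
        out.write(len & 0xFF);         // low byte
        out.write(data, 0, data.length);
        return out.toByteArray();
    }

    // Reads the length prefix the way a class file parser would.
    static int readLengthPrefix(byte[] encoded) {
        return ((encoded[0] & 0xFF) << 8) | (encoded[1] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] encoded = writeUnchecked(new byte[0x10000]);
        // The reader believes the string is empty...
        System.out.println("declared length: " + readLengthPrefix(encoded)); // 0
        // ...but 65536 bytes of attacker-controlled data still follow.
        System.out.println("bytes after prefix: " + (encoded.length - 2));   // 65536
    }
}
```

Any reader that trusts the prefix will treat the 65536 payload bytes as whatever comes next in the file format, which is the confusion described above.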
The hard part was actually providing a workable class file containing such a large string. For this, the author took advantage of a mismatch between the strings the JVM expects and what ASM would accept. Notably, the JVM uses MUTF-8 (modified UTF-8), which encodes a null byte as two non-null bytes (0xC0 0x80). ASM would always write strings out as MUTF-8, but on input it would accept strings containing raw null bytes and re-encode them. This made it possible to craft a string whose length fit in the two-byte field on input but overflowed it once ASM rewrote it.
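The size growth can be sketched in a few lines of Java. The helper below computes how many bytes a string takes in modified UTF-8, following the encoding rules documented for `java.io.DataOutput.writeUTF`. The class name `Mutf8Growth` and the all-null test string are assumptions chosen for illustration, not the author's actual payload.

```java
// Hypothetical demo class, not taken from ASM or the original post.
final class Mutf8Growth {

    // Number of bytes needed to encode s in modified UTF-8.
    static int mutf8Length(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                n += 1; // plain ASCII except NUL
            } else if (c <= 0x07FF) {
                n += 2; // includes U+0000, encoded as 0xC0 0x80
            } else {
                n += 3; // everything else, surrogates encoded individually
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // 32768 NUL characters: comfortably below the 65535 limit as raw bytes.
        String nulls = new String(new char[32768]);
        int encoded = mutf8Length(nulls);
        System.out.println("raw length:        " + nulls.length());     // 32768
        System.out.println("MUTF-8 length:     " + encoded);            // 65536
        System.out.println("as 16-bit length:  " + (encoded & 0xFFFF)); // 0
    }
}
```

A writer that range-checks the encoded length (the JDK's `DataOutputStream.writeUTF` throws `UTFDataFormatException` above 65535 bytes) would reject this string instead of emitting a wrapped length.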