About Project Blog Resources

Resources
People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

How I Hacked Google App Engine: Anatomy of a Java Bytecode Exploit - 493

polybdenum - Posted 4 Years Ago

Java App Engine allowed arbitrary Java to be run within the context of a sandbox. To make this usable for developers but still secure, there is a substantial amount of static and dynamic analysis that eventually leads to bytecode rewrites.
Rewriting Java bytecode!? How could this possibly be done securely while protecting against every possible privilege escalation method?
When the bytecode is rewritten, it is passed through ASM, which parses it into an in-memory representation and then serializes it again. This means that a parsing difference between ASM and the JVM can allow crafted bytecode to slip past the security checks.
The first vulnerability comes from a difference between Java class file versions. In older versions of the ClassFile format, the leading fields of the Code attribute were 1, 1, and 2 bytes wide. In newer versions, they are 2, 2, and 4 bytes wide.
By carefully crafting a file, the same class file can parse as valid under both the older layout (with the smaller sizes) and the newer layout (with the larger sizes). This would have made it possible to change how the bytecode is interpreted, had App Engine not enforced a minimum class file version of 49.0.
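To make the layout difference concrete, here is a minimal sketch that reads the same eight bytes under both layouts. The field names `max_stack`, `max_locals`, and `code_length` are taken from the JVM specification rather than from the post, and `parse_code_header` is a hypothetical helper, not ASM or JVM code.

```python
import struct

def parse_code_header(data: bytes, legacy: bool):
    """Read the leading Code attribute fields under one of the two layouts.

    legacy=True  -> 1, 1, and 2 byte fields (older class files)
    legacy=False -> 2, 2, and 4 byte fields (newer class files)
    Returns (max_stack, max_locals, code_length, header_size).
    """
    fmt = ">BBH" if legacy else ">HHI"  # big-endian, no padding
    max_stack, max_locals, code_length = struct.unpack_from(fmt, data, 0)
    return max_stack, max_locals, code_length, struct.calcsize(fmt)

# The same bytes, read two ways.
header = bytes([0x00, 0x02, 0x00, 0x01, 0x00, 0x00, 0x00, 0x04])
print(parse_code_header(header, legacy=True))   # older layout
print(parse_code_header(header, legacy=False))  # newer layout
```

Under the two layouts, both the field values and the offset where the bytecode starts differ, which is why a parser and a JVM that disagree on the layout would see different code.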
When Java bytecode needs to reference a string, the string data is stored as a two-byte length field followed by that many bytes of string data. ASM does not check for integer overflow on this size, though. This means that a sufficiently large string would have its length wrap around, so the JVM would see it as having size 0 and the rest of the string would be interpreted as bytecode.
However, it is impossible to directly create a string large enough to cause this overflow, so another way to trigger the parsing error needs to be found.
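The length wrap-around can be sketched with a toy writer and reader. Both functions are hypothetical stand-ins written for this illustration (they are not ASM's or the JVM's code), and the assumption is that the writer keeps only the low 16 bits of the length.

```python
import struct

def write_string_unchecked(payload: bytes) -> bytes:
    # Hypothetical buggy writer: the length is silently truncated to 16 bits.
    return struct.pack(">H", len(payload) & 0xFFFF) + payload

def read_string(blob: bytes):
    # Reader that trusts the two-byte length field.
    (length,) = struct.unpack_from(">H", blob, 0)
    return blob[2:2 + length], blob[2 + length:]

blob = write_string_unchecked(b"A" * 65536)   # one byte more than the field can hold
string_seen, leftover = read_string(blob)
print(len(string_seen), len(leftover))
```

The reader sees an empty string, and all of the original string data is left over to be parsed as whatever comes next in the file.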
Strings in class files are stored in the MUTF-8 (modified UTF-8) encoding rather than standard UTF-8. The main difference is that NULL bytes are stored as a two-byte sequence within MUTF-8.
The vulnerability is that ASM accepts input in both the MUTF-8 form and the standard UTF-8 form. This slight difference between ASM and the JVM can be used to manipulate the length of a string and trigger the overflow bug above.
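As a rough illustration of how an encoding difference can change a string's length, the sketch below encodes text as MUTF-8 and compares it with standard UTF-8. `encode_mutf8` is a hypothetical helper that only handles characters in the Basic Multilingual Plane, and the scenario (a lenient reader that accepts raw NULL bytes, followed by a writer that emits the two-byte form) is an assumption made for this example rather than a description of ASM's exact behavior.

```python
def encode_mutf8(text: str) -> bytes:
    """Encode BMP-only text as modified UTF-8 (NULL becomes C0 80)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            raise ValueError("this sketch only handles BMP characters")
        if 0x01 <= cp <= 0x7F:
            out.append(cp)                       # one byte
        elif cp <= 0x7FF:                        # two bytes, including NULL
            out += bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
        else:                                    # three bytes
            out += bytes([0xE0 | (cp >> 12),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)])
    return bytes(out)

# Assumed scenario: raw NULL bytes are accepted on input, then re-encoded.
raw_input = b"\x00" * 32768              # fits in a two-byte length field
decoded = raw_input.decode("utf-8")      # standard UTF-8 accepts a raw 0x00
rewritten = encode_mutf8(decoded)        # each NULL now takes two bytes
wrapped_length = len(rewritten) & 0xFFFF
print(len(raw_input), len(rewritten), wrapped_length)
```

In this toy setup the input string is small enough to be legal, but the re-encoded string no longer fits in a two-byte length field.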
The exploit uses the integer overflow to smuggle in valid bytecode. Handwriting bytecode is, of course, the tedious and annoying part of this exploit. The rest of the article dives into how the bytecode was created in order to launch the full exploit.
To remediate this vulnerability, App Engine changed the flow to run ASM a second time in order to check for discrepancies. By doing this, an entire class of exploits was removed.
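The general idea of checking for discrepancies with a second pass can be sketched as a round-trip test: parse, serialize, parse again, and reject the input if the two parsed views differ. Everything here is a toy model written for this illustration (`parse`, `serialize`, and `round_trip_is_consistent` are hypothetical names), not App Engine's actual check.

```python
import struct
from typing import Tuple

Model = Tuple[bytes, bytes]  # (decoded string body, trailing data)

def parse(blob: bytes) -> Model:
    # Toy lenient parser: accepts both raw NULL and the two-byte C0 80 form.
    (length,) = struct.unpack_from(">H", blob, 0)
    body = blob[2:2 + length].replace(b"\xc0\x80", b"\x00")
    return body, blob[2 + length:]

def serialize(model: Model) -> bytes:
    # Toy writer: emits NULL as C0 80 and truncates the length to 16 bits.
    body, rest = model
    encoded = body.replace(b"\x00", b"\xc0\x80")
    return struct.pack(">H", len(encoded) & 0xFFFF) + encoded + rest

def round_trip_is_consistent(blob: bytes) -> bool:
    # Second pass: what we meant to write must match what a re-parse sees.
    model = parse(blob)
    return parse(serialize(model)) == model

benign = struct.pack(">H", 3) + b"abc"
crafted = struct.pack(">H", 32768) + b"\x00" * 32768
print(round_trip_is_consistent(benign), round_trip_is_consistent(crafted))
```

The benign input survives the round trip unchanged, while the crafted input parses differently the second time and can be rejected without knowing which specific bug it exploits.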
This is an insanely long post, but I appreciate its thoroughness. Differences between parsers will always be a security issue.

Maxwell Dulin
