A recent vulnerability in PHP seemed like a good test for variant analysis in other systems. The vulnerability is an integer truncation and sign conversion bug, in which a value is implicitly converted to a narrower, signed type. The Trail of Bits researchers found a few bugs this way, the most interesting of which was within SQLite.
The unsigned long (2*ZSTR_LEN(unquoted) + 3) is passed to sqlite3_snprintf as the first parameter, which expects a signed integer. The line of code in question is shown below:
sqlite3_snprintf(2*ZSTR_LEN(unquoted) + 3,
quoted,
"'%q'",
ZSTR_VAL(unquoted));
Why is this bad? The truncation and conversion have an interesting effect: the meaning of the number can change! For instance, a large unsigned number could turn into a negative signed number if not handled correctly. While writing up a quick proof of concept for this bug, they quickly ran into an ASAN crash within SQLite in an entirely different location.
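To make the truncation and sign conversion concrete, here is a minimal sketch in C. It assumes a 64-bit unsigned length and a 32-bit signed parameter, which is typical on 64-bit Linux but not guaranteed everywhere. The helper names (quoted_size, truncate_to_int32, fits_in_int) are hypothetical and only mirror the shape of the expression above. They are not taken from PHP or SQLite.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper mirroring the shape of 2*ZSTR_LEN(unquoted) + 3,
   computed in a 64-bit unsigned type. */
uint64_t quoted_size(uint64_t len) {
    return 2 * len + 3;
}

/* Models what a 32-bit signed parameter ends up holding on a typical
   two's-complement platform: keep the low 32 bits, then reinterpret
   them as signed. memcpy avoids relying on an implementation-defined
   unsigned-to-signed conversion. */
int32_t truncate_to_int32(uint64_t value) {
    uint32_t low = (uint32_t)value;   /* truncation: drop the high 32 bits */
    int32_t out;
    memcpy(&out, &low, sizeof out);   /* sign conversion: same bits, signed view */
    return out;
}

/* A guard the caller could apply before passing the size along. */
int fits_in_int(uint64_t value) {
    return value <= (uint64_t)INT_MAX;
}
```

With these definitions, a length of 0x40000000 yields a size of 2^31 + 3, which reads back as a large negative number, while a length of 0x7FFFFFFF yields 2^32 + 1, which truncates to just 1. In both cases fits_in_int would have rejected the value before the call.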
SQLite has custom format strings, such as %q. While trying to execute the code above, they ran into an implementation issue in one of the format string handlers. This vulnerability was a linear stack-based buffer overflow resulting from an integer overflow. The overflow comes from a really simple addition (+3) performed on a variable without checking whether the result could overflow.
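The pattern described here, an unchecked +3 feeding a buffer-size decision, can be sketched as follows. This is not SQLite's actual code: the function names and the 70-byte STACK_BUF_SIZE are illustrative assumptions, and the wraparound is modeled with unsigned arithmetic because real signed overflow is undefined behavior in C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BUF_SIZE 70  /* assumed size of a fixed stack buffer */

/* Models 32-bit two's-complement wraparound without invoking
   undefined behavior: add as unsigned, then reinterpret the bits. */
int32_t wrapping_add_i32(int32_t a, int32_t b) {
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    int32_t out;
    memcpy(&out, &sum, sizeof out);
    return out;
}

/* Naive logic: returns 1 if the small stack buffer would be chosen.
   When counted + 3 wraps to a negative number, the comparison passes
   and the stack buffer is selected for a huge input. */
int naive_uses_stack_buffer(int32_t counted) {
    int32_t n = wrapping_add_i32(counted, 3);
    return n <= STACK_BUF_SIZE;
}

/* Checked logic: refuses sizes that cannot be represented.
   Returns 1 and stores the size on success, 0 on overflow. */
int checked_needed_size(int32_t counted, int32_t *out) {
    if (counted < 0 || counted > INT32_MAX - 3) {
        return 0;
    }
    *out = counted + 3;
    return 1;
}
```

For a count of INT32_MAX - 1, the naive version wraps to a negative size and picks the stack buffer, which is the precondition for the linear overflow. The checked version rejects the same input.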
With this wild-copy type of crash, they wanted to find a way to write only the necessary bytes while still triggering the overflow. By adding a single quote (') and not finishing out the string, an inner loop could run that would consume a ton of input but only output a limited amount. This makes the overflowed size controllable AND the number of bytes being output controllable. Pretty neat trick! This allows overwriting the saved return address (RIP) on the stack, but sadly an information leak would be needed to leak the stack canary for this to work.
SQLite has 100% branch test coverage and has fuzzing set up on it. How was this missed!? There is an internal memory limit of 1GB for SQLite itself. However, external callers, such as programs using the C API directly, don't have this limitation. By providing astronomically large values, it was possible to hit this. Additionally, 100% branch coverage means every branch was exercised but NOT how hard or how many times. I really appreciated this insight from Trail of Bits here.
Overall, awesome and thorough post on the vulnerability and how they found it.