Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Breaking Down Multipart Parsers: File upload validation bypass- 1538

Andrea Menin - Posted 1 Year Ago
  • multipart/form-data is used for forms that include binary data; the body is broken into multiple parts. The parts are separated by a boundary string (declared in the request's Content-Type header), and each part carries its own headers. The Content-Disposition header of a part defines the parameter name and, for uploads, the filename. A part can also carry a Content-Type header to specify the media type of its content, like the normal request header does.
  • While creating a WAF in Lua, the author noticed it was very easy to bypass multipart parsers. In the end, no parser follows the RFC as well as it should. This matters because file upload validation relies on the parsed values for file extension checks, MIME type validation, size limitations and several other things. The rest of the article covers bypasses for tricking WAFs and servers.
  • In practice, application/x-www-form-urlencoded can be used as the content of a multipart/form-data request. Many WAFs do not support multipart/form-data and will effectively ignore it. Since the WAF can't handle it but the server can, URL-encoded data will be decoded on the backend but not by the WAF itself, creating a difference between check and use. This was true of HAProxy, AWS WAF and AWS Lambda.
  • The next bypass is the handling of duplicate parameters. Both name and filename can be duplicated; when this happens, one parser may use the first occurrence while another uses the second. The same is true when a full header is duplicated.
  • CRLF parsing seems inadequate in many cases too. Some parsers only delimit on \r\n\r\n while others will accept just \r. Single quotes on parameter names instead of double quotes cause a similar effect. In PHP, a body with a missing closing boundary string parses fine, while other parsers will not accept it.
  • The final bypass, and the author's favorite, has to do with an RFC update. With the update, filename* parameters allow special characters and the ability to specify an encoding. For instance, filename*=utf-8''filename.pdf is a valid parameter. In practice, this allows URL encoding the filename information, which most WAFs are not going to decode. The author gives an example of bypassing PHP file validation this way.
  • The second half of the article covers bypasses that the author found in various engines. When running the tests against OpenResty with Lua in front, literally all of the bypasses above worked. The Node.js library Busboy is super permissive. It prioritizes the filename parameters with and without the encoding differently than most servers do, creating an easy bypass.
  • In Flask, the trick for Busboy also works. Additionally, if the semicolon separator is left out of a header (leaving only a space as the delimiter), Flask won't parse it either. Finally, duplicate disposition headers will use the first instead of the expected second. This was also true on FortiWeb WAF.
  • The issue isn't a single implementation - the issue is that all of the different implementations do different things. By chaining these mismatched parsers, similar to HTTP smuggling, we can bypass security protections. There are going to be more and more bug classes like this in the future for formats that are complicated and have tens of parsers. Good write up!
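
To make the duplicate-parameter and filename* tricks concrete, here is a minimal Python sketch of two toy parsers reading the same Content-Disposition value. Everything in it is an assumption for illustration: the header value, the regex, and the helpers waf_view and backend_view are hypothetical and do not reproduce the article's payloads or the behavior of any real WAF or framework.

```python
import re
from urllib.parse import unquote

# Hypothetical Content-Disposition value combining two tricks from the notes:
# a duplicated filename parameter and an extended filename* parameter.
HEADER = (
    'form-data; name="file"; filename="report.pdf"; '
    'filename="shell.php"; filename*=utf-8\'\'shell%2Ephp'
)

# Matches ;key="quoted value" or ;key=bare-value (toy grammar, not RFC-complete).
PARAM_RE = re.compile(r';\s*([\w*]+)=("([^"]*)"|[^;]*)')


def parse_params(header):
    """Return every (name, value) pair in order, keeping duplicates."""
    pairs = []
    for match in PARAM_RE.finditer(header):
        name = match.group(1).lower()
        quoted = match.group(3)
        value = quoted if quoted is not None else match.group(2).strip()
        pairs.append((name, value))
    return pairs


def first_wins(pairs, key):
    """Value of the first occurrence of key, or None."""
    for name, value in pairs:
        if name == key:
            return value
    return None


def last_wins(pairs, key):
    """Value of the last occurrence of key, or None."""
    result = None
    for name, value in pairs:
        if name == key:
            result = value
    return result


def decode_ext_value(value):
    """Decode a charset'language'percent-encoded value such as utf-8''a%2Eb."""
    charset, _language, encoded = value.split("'", 2)
    return unquote(encoded, encoding=charset)


def waf_view(header):
    """Toy WAF: trusts the first plain filename and ignores filename*."""
    return first_wins(parse_params(header), "filename")


def backend_view(header):
    """Toy backend: prefers filename*, otherwise the last plain filename."""
    pairs = parse_params(header)
    extended = last_wins(pairs, "filename*")
    if extended is not None:
        return decode_ext_value(extended)
    return last_wins(pairs, "filename")


if __name__ == "__main__":
    print("WAF sees:    ", waf_view(HEADER))
    print("Backend sees:", backend_view(HEADER))
```

The toy WAF validates one filename while the toy backend stores another, which is the check-versus-use gap the write-up describes. Which occurrence a real parser picks, and whether it honors filename*, varies per implementation and has to be tested against the actual stack.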