About Project Blog Resources

Resources
People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Confusion Attacks: Exploiting Hidden Semantic Ambiguity in Apache HTTP Server! - 1467

Orange Tsai Posted 1 Year Ago

The Apache HTTP Server is built from modules: 136 are listed in the documentation, and about half of those are in normal use. To the author, there was a bad code smell: a giant request_rec structure is passed around to every module. If two modules understand a field in this structure differently, bad things can happen. This is what the research is about.
The structure contains a field called filename that represents the filesystem path. However, some modules treat it as a full URL, which can lead to security issues. One consequence is that a path can be truncated by placing a ? in it. For instance, mod_rewrite lets sysadmins easily rewrite a path pattern with the RewriteRule directive. By providing a question mark in the part of the path that the rule captures, everything after it in the rewritten path is cut off, so a file other than the intended one is accessed. Another example of the truncation being useful is with a RewriteRule on the path.
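As a rough illustration of the truncation, here is a minimal Python model. It is not Apache code: the rule, the paths, and the function names are all hypothetical, and the "treat filename as a URL" step is reduced to cutting the string at the first question mark.

```python
import re
from urllib.parse import unquote

# Hypothetical rule, loosely in the spirit of:
#   RewriteRule "^/user/(.+)$" "/var/user/$1/profile.yml"
PATTERN = r"^/user/(.+)$"
SUBSTITUTION = r"/var/user/\1/profile.yml"


def rewrite(url_path: str, pattern: str, substitution: str) -> str:
    """Apply a RewriteRule-like regex substitution to the decoded request path."""
    decoded = unquote(url_path)
    match = re.match(pattern, decoded)
    if match is None:
        return decoded
    return match.expand(substitution)


def as_url(filename: str) -> str:
    """Model a module that treats the filename as a URL: drop the query part."""
    return filename.split("?", 1)[0]


intended = as_url(rewrite("/user/orange", PATTERN, SUBSTITUTION))
truncated = as_url(rewrite("/user/orange/secret.yml%3F", PATTERN, SUBSTITUTION))
print(intended)   # /var/user/orange/profile.yml
print(truncated)  # /var/user/orange/secret.yml
```

The sysadmin meant to expose only profile.yml, but the encoded question mark turns the fixed suffix into a throwaway query string.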
The other interesting issue with the filename confusion is an ACL bypass. It's common to use the Files directive to add authentication to a specific file. Using the confusion on the file path with a URL-encoded question mark, we can get one path verified but another actually used. For instance, admin.php%3Fooo.php would be checked against the ooo.php at the end, but admin.php is what actually runs.
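The mismatch can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not how httpd is implemented: one function stands in for the access check, which looks at the decoded filename as a filesystem name, and another for the component that treats the same string as a URL. All names here are hypothetical.

```python
import posixpath
from urllib.parse import unquote

# Hypothetical: files guarded by a Files-style authentication rule.
PROTECTED = {"admin.php"}


def acl_requires_auth(request_path: str) -> bool:
    """Access check: compares the last path segment of the decoded filename."""
    filename = unquote(request_path)
    return posixpath.basename(filename) in PROTECTED


def script_executed(request_path: str) -> str:
    """Backend view: treats the filename as a URL and drops the query part."""
    filename = unquote(request_path)
    return posixpath.basename(filename.split("?", 1)[0])


print(acl_requires_auth("/admin.php"))             # True
print(acl_requires_auth("/admin.php%3Fooo.php"))   # False
print(script_executed("/admin.php%3Fooo.php"))     # admin.php
```

The check sees a file named "admin.php?ooo.php", which is not protected, while the backend still runs admin.php.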
The next bug is crazy. When httpd processes a request that matches a RewriteRule, it first looks for the rewritten path at that exact spot on the file system. Only then does it fall back to the same path under the configured document root. Most of the time, the path doesn't exist at the root of the file system, so this doesn't matter. But it means that if the prefix of a RewriteRule target is controllable, then the entire file system can be accessed!
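The lookup order can be modeled roughly as follows. This is a sketch under assumptions: the document root, the set of directories that exist at the file system root, and the check on only the first path segment are simplifications chosen for illustration, and resolve is a hypothetical name.

```python
# Hypothetical configuration values for the sketch.
DOCUMENT_ROOT = "/var/www/html"
EXISTING_TOP_LEVEL = {"/usr", "/etc", "/var"}


def resolve(rewritten: str, existing_top_level: set) -> str:
    """Try the rewritten path as an absolute filesystem path first,
    then fall back to the same path under the document root."""
    first_segment = "/" + rewritten.lstrip("/").split("/", 1)[0]
    if first_segment in existing_top_level:
        return rewritten  # treated as a real filesystem path
    return DOCUMENT_ROOT + rewritten


# Benign request: nothing called /about.html exists at the filesystem root.
print(resolve("/about.html", EXISTING_TOP_LEVEL))
# /var/www/html/about.html

# Attacker-controlled prefix: /usr exists, so the document root is skipped.
print(resolve("/usr/share/doc/example.html", EXISTING_TOP_LEVEL))
# /usr/share/doc/example.html
```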
Well, sorta. Because the rewrite rule has an ending attached to it (like .html), we can only access files that end with it. Additionally, Apache has built-in protection against access to some files. Using the first primitive lets us truncate the path, though, creating a super primitive. Using this bug, the author found they could disclose arbitrary source code.
Even though there are default restrictions on what can be accessed, we can use gadgets. The LibreOffice file at /usr/share/libreoffice/help/help.html contains an XSS. Some libraries, such as WordPress plugins, could be used for LFI via their tutorials. The author mentions a few other ways to exploit this, including abusing symbolic links.
In Apache, there are two directives that do the same thing: AddHandler and AddType. Under the hood, there is some magic from 1996 that lets both work: the content_type field is used as the module handler when the handler field is empty. The new primitive here is the ability to overwrite the handler.
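The fallback described above boils down to something like this Python sketch. The Request class and effective_handler are hypothetical stand-ins for the two request_rec fields and the selection logic, not the real implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    """Hypothetical stand-in for the two request_rec fields involved."""
    handler: Optional[str] = None
    content_type: Optional[str] = None


def effective_handler(r: Request) -> Optional[str]:
    """Use the handler if one is set, otherwise fall back to the content type."""
    return r.handler or r.content_type


# An explicitly configured handler wins.
print(effective_handler(Request(handler="cgi-script", content_type="text/html")))
# cgi-script

# With no handler, whoever controls content_type picks the handler.
print(effective_handler(Request(content_type="application/x-httpd-php")))
# application/x-httpd-php
```

Anything that can change content_type while handler is empty therefore changes which module ends up serving the request.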
The first instance of this being exploited was in ModSecurity. When an error occurred while processing a path, it wasn't handled correctly and the Content-Type was overwritten. As a result, the wrong handler was executed, and the PHP source code was returned instead of the output of running it. This technique could be used in conjunction with other content type changes as well.
Next, if an attacker can control the Content-Type header in the response, they can invoke ANY handler. Even though this processing happens after the response is received, server-side redirects make this exploitable and allow hitting any CGI implementation on the server. The author mentions an SSRF with controlled headers or CRLF injection as potential ways to do this.
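To make the flow concrete, here is a small Python sketch of how a CGI-style response could trigger a server-side redirect that carries an attacker-chosen Content-Type. The redirect condition (status 200 plus a Location starting with /) is my simplified reading of the local redirect behavior for CGI responses, and the function names are hypothetical.

```python
from typing import Optional, Tuple


def parse_cgi_headers(raw: str) -> dict:
    """Parse the header block of a CGI-style response into a lowercase dict."""
    headers = {}
    for line in raw.splitlines():
        if not line.strip():
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def local_redirect(raw_response: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (redirect target, handler taken from Content-Type) if the
    response looks like a local redirect, else None."""
    headers = parse_cgi_headers(raw_response)
    status = headers.get("status", "200").split()[0]
    location = headers.get("location", "")
    if status == "200" and location.startswith("/"):
        # The request is re-processed internally; with no handler set,
        # the content type is what selects the handler.
        return location, headers.get("content-type")
    return None


injected = "Status: 200\r\nLocation: /ooo\r\nContent-Type: server-status\r\n\r\n"
print(local_redirect(injected))  # ('/ooo', 'server-status')
```

In this model, injecting headers into a backend response (for example through CRLF injection) is enough to choose the handler that runs after the redirect.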
How does this become exploitable? Getting an image file processed as a PHP script can quickly lead to RCE. mod_proxy leads to a full SSRF or direct access to Unix sockets. Finally, the author found that the PEAR.php included with Docker can even be used to get RCE through PHP.
At the end of the article, the author says this area is promising for more research. They only focused on issues in a few impactful fields, but other fields may cause just as much havoc. The more complex a code base is, the more likely it is that unique vulnerabilities are lurking there. Amazing research, as always, by Orange Tsai :)

Maxwell Dulin
