Orange Tsai (of course) found a vulnerability in PHP. In particular, they found an issue that gives RCE on XAMPP (a popular way for admins to deploy PHP apps).
The original post did not have many details about the bug itself, so the author of this post started to dig into the issue. They noticed that it only affects PHP's CGI mode. In this mode, the web server parses the HTTP request and passes it to a PHP script for processing. Conceptually, http://host/cgi.php?foo=bar turns into php.exe cgi.php foo=bar.
Naturally, you would think this is an obvious command injection vector for calling php.exe. In fact, CVE-2012-1823 was exactly this bug! In the original bug, a query string that lacked an = character was handed to PHP as command line arguments, and the data wasn't being properly encoded.
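To make the old bug concrete, here is a minimal sketch of the CGI rule behind it, assuming the behavior described in RFC 3875: a query string with no unencoded = is split on + and each word is decoded into a command line argument. The function name `cgi_query_to_argv` is made up for illustration and this is a simplified model, not PHP's actual code.

```python
from urllib.parse import unquote


def cgi_query_to_argv(query: str) -> list[str]:
    """Toy model of the 'no = means arguments' CGI rule (hypothetical helper)."""
    if "=" in query:
        # An unencoded "=" marks a normal form-style query: no arguments are built.
        return []
    # Otherwise split on "+" and percent-decode each word into one argument.
    return [unquote(word) for word in query.split("+") if word]


# A query made only of option-like words becomes php-cgi arguments.
args = cgi_query_to_argv("-d+allow_url_include%3d1")
```

Note that the = in the example is percent-encoded as %3d, so the raw query still contains no literal = and takes the argument path.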
What's the new bug? Of course, it's Unicode! When the input parameters are converted between encodings on Windows, the characters go through best-fit mapping. This is crazy, as mapping Unicode to ASCII feels impossible. The CGI code that prevents command injection escapes hyphens so that extra parameters cannot be specified. However, a soft hyphen (0xAD) does not get escaped, and the best-fit conversion later turns it into a regular hyphen! Hence, we can add in our own parameters.
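The ordering problem can be sketched in a few lines: the guard runs on the raw bytes, and the conversion that produces a real hyphen runs afterwards. Everything here is a stand-in. `BEST_FIT` is a hypothetical one-entry table (the real Windows code page tables are much larger), and `looks_like_option` is a simplified guard, not PHP's actual check.

```python
# Hypothetical one-entry best-fit table; real code pages define many more mappings.
BEST_FIT = {0xAD: "-"}


def looks_like_option(raw: bytes) -> bool:
    """Naive guard that only recognizes the ASCII hyphen (0x2D)."""
    return raw.startswith(b"-")


def best_fit_decode(raw: bytes) -> str:
    """Toy conversion: map bytes through BEST_FIT, otherwise treat them as Latin-1."""
    return "".join(BEST_FIT.get(byte, chr(byte)) for byte in raw)


raw = b"\xadd"                              # what %ADd decodes to
passes_guard = not looks_like_option(raw)   # the guard sees 0xAD, not "-"
converted = best_fit_decode(raw)            # the later conversion yields "-d"
```

The guard and the conversion each behave reasonably on their own; the bug comes from checking before converting.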
The actual exploit is hilariously simple. Make an HTTP request with %AD (soft hyphen) to smuggle in a dash. Now, we can control arbitrary arguments to PHP. The -d flag sets PHP configuration directives. Setting auto_prepend_file=php://input, alongside the allow_url_include flag to enable PHP URLs, makes PHP run the request body as code, which gets us code execution on the server.
The conversion code from Unicode to ASCII was weird to me. I've read reports about this for years but have never seen anything actually do it. Apparently, Unicode has a standard for normalization, and there is a Python implementation as well.
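For reference, the Python implementation lives in the standard `unicodedata` module. The short sketch below shows NFKC folding a fullwidth hyphen-minus (U+FF0D) into an ASCII hyphen. It also shows that NFKC leaves the soft hyphen (U+00AD) alone, which suggests the Windows best-fit behavior is a separate mechanism from standard Unicode normalization rather than an instance of it.

```python
import unicodedata

fullwidth_hyphen = "\uff0d"  # FULLWIDTH HYPHEN-MINUS
soft_hyphen = "\u00ad"       # SOFT HYPHEN

# Compatibility normalization folds the fullwidth form to plain ASCII "-".
normalized_fullwidth = unicodedata.normalize("NFKC", fullwidth_hyphen)

# The soft hyphen has no decomposition, so NFKC returns it unchanged.
normalized_soft = unicodedata.normalize("NFKC", soft_hyphen)
```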
The bug is exploitable on a few different locales, which is fascinating. To me, there are two main takeaways. First, old bugs are good to know; many of their attack vectors are still there. As security techniques progress, more bugs in these areas may fall out. Second, Unicode-to-ASCII conversions really do exist in the wild. Overall, great bug!