Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

SSRF and RCE Through Remote Class Loading in Batik - 1000

Piotr Bazydlo - Zero Day Initiative (ZDI) - Posted 3 Years Ago
  • Apache Batik is a library for parsing Scalable Vector Graphics (SVG) files and transforming them into other formats. Even crazier, the documentation mentions executing JavaScript, loading and executing Java classes, and many other things. To the author, this looked like an SSRF goldmine, and several previous vulnerabilities pointed the same way.
  • A common use case is accepting an image, or a URL to an image, and transforming it with Apache Batik. The tool has many built-in protections for scripting and other features. ScriptSecurity, for example, has several settings, ranging from no scripts at all to allowing scripts to be loaded remotely.
  • The security controls include the concept of an origin within a URI. In particular, a local SVG file can load local scripts but not remote ones. If we can bypass this control, we can do some horrible things!
  • The parsing that enforces this had a bug in it, though. First, the host is extracted from both the script URL and the document URL. Next, there is a check to see if the two hosts are the same. The getHost call is the standard Java function, which is known to behave strangely with non-HTTP protocols.
  • The host for a local file (file:///some_file.txt) will always be NULL. An external file or an HTTP URL properly returns its host, so the check correctly rejects them. However, jar (Java Archive) URLs also return NULL! Since both hosts are now NULL, they compare as equal and the security protection no longer works as intended.
  • The obvious attack vector is SSRF, but we can do more. Apache Batik supports remote class loading through its Java bindings. By referencing a remote JAR file from our SVG, we can get Batik to load it and run our Java code. From there, executing arbitrary code on the system is pretty trivial.
  • If remote JAR loading is not allowed but scripts are, another option is abusing the ECMAScript engine. In particular, accessing the Java runtime from ECMAScript gives trivial code execution, by design. The official security guidance for the ECMAScript engine is to secure the application with a Java Security Manager, which is probably never used.
  • Overall, parsing differentials are absolutely fascinating! It's super interesting seeing how this default and unexpected mechanism in Java caused such a big problem. However, the capabilities of these Apache products are just too powerful.
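The host comparison described above can be sketched in a few lines of Java. This is a minimal illustration, not Batik's actual code: the class name HostCheckSketch, the method sameHost, and the attacker.example URLs are all made up for the example, and it assumes java.net.URI as the parser, which may differ from the URL wrapper Batik really uses. It only shows how a null-equals-null comparison lets a jar URL pass a check that an HTTP URL fails.

```java
import java.net.URI;
import java.util.Objects;

// Hypothetical stand-in for the origin check; not Batik's real implementation.
class HostCheckSketch {

    // Returns true when the script and the document appear to share a host.
    // Assumption: hosts are extracted with java.net.URI.getHost().
    static boolean sameHost(String scriptUrl, String documentUrl) {
        String scriptHost = URI.create(scriptUrl).getHost();
        String documentHost = URI.create(documentUrl).getHost();
        // Objects.equals(null, null) is true, which is the root of the problem.
        return Objects.equals(scriptHost, documentHost);
    }

    public static void main(String[] args) {
        String localDocument = "file:///tmp/picture.svg"; // host is null

        // Remote HTTP script: host is "attacker.example", so the check rejects it.
        System.out.println(sameHost("http://attacker.example/evil.js", localDocument));

        // Remote jar URL: the URI is opaque, so the host is null and the check passes.
        System.out.println(sameHost("jar:http://attacker.example/evil.jar!/Evil.class", localDocument));
    }
}
```

Run as written, this prints false for the HTTP script and true for the jar URL. A stricter check would presumably treat a missing host as a mismatch, or compare schemes as well, rather than letting two null values count as the same origin.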