Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Slice: SAST + LLM Interprocedural Context Extractor

Posted by Caleb Gross, 5 months ago
  • In mid-2025, Sean Heelan published a post describing how he used an LLM to find a use-after-free vulnerability in the Linux kernel that was similar to an existing bug. In the post, Sean explains how the LLM found the bug and the current limitations of LLMs for security reviews.
  • The author's goal was to rediscover this vulnerability consistently using LLM tools, without prior knowledge of the codebase and without needing to build or compile anything. Luckily for them, CodeQL had just shipped support for analyzing code without building it. Additionally, they didn't want to use an agentic framework.
  • Initially, they used a simple cross-function UAF detector in CodeQL that generated roughly 1,700 possible UAFs. They then used Tree-sitter, as a plugin to CodeQL, to enrich those findings with more context. This also let them enforce a maximum call depth, which cut the findings down to 217. They pointed the tooling at the entire /fs/smb/server/ implementation in the Linux kernel, which is 26.4K lines of code, for context.
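The call-depth filter is easy to picture in code. The sketch below is purely illustrative (the finding structure, field names, and example functions are all invented, not Slice's actual output format), but it shows the idea: discard free/use pairs whose connecting call chain is deeper than some limit.

```python
# Hypothetical sketch of a call-depth filter over SAST findings.
# Each finding pairs a free() site with a potential later use, plus
# the chain of calls connecting them; the structure is invented here.
from dataclasses import dataclass

@dataclass
class Finding:
    free_site: str        # function containing the free()
    use_site: str         # function containing the potential use
    call_path: list[str]  # chain of functions connecting free to use

def within_depth(finding: Finding, max_depth: int) -> bool:
    # Call depth = number of call edges between the free and the use.
    return len(finding.call_path) - 1 <= max_depth

findings = [
    Finding("release_work", "handle_request",
            ["release_work", "handle_request"]),
    Finding("release_conn", "deep_use",
            ["release_conn", "a", "b", "c", "deep_use"]),
]

# Keep only findings whose free-to-use chain is at most 2 calls deep.
shallow = [f for f in findings if within_depth(f, max_depth=2)]
print(len(shallow))  # → 1: only the first finding survives
```

The same idea, applied at a much larger scale, is what trimmed 1,700 candidates down to 217.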
  • With these 217 findings, they came up with a two-step plan. First, triage with a smaller model: confirm that the "use" actually follows the downstream free(), so the larger model doesn't burn tokens on obvious non-issues. Second, analyze: ask the larger model deeper questions about how the freed memory might be accessed, its exploitability, and potential fixes for the code. To do this, they created a tool called Slice that takes SAST output and feeds it to an LLM.
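A rough picture of that triage-then-analyze flow, with the LLM calls stubbed out. Everything here is an illustrative assumption on my part — the prompts, model names, and JSON shape are not Slice's real interface — but it shows why a cheap first pass saves money on the expensive second one.

```python
# Illustrative two-stage pipeline: cheap triage, then deep analysis.
# call_model() is a stand-in for a real LLM API call; the fake logic
# inside it exists only so this sketch runs end to end.
import json

def call_model(model: str, prompt: str) -> str:
    if model == "small":
        # Fake triage verdict: does a use really follow a free()?
        return json.dumps({"genuine_uaf": "free(" in prompt})
    # Fake deep analysis from the larger model.
    return json.dumps({"exploitability": "needs manual review",
                       "suggested_fix": "null the pointer after free"})

def triage(finding: dict) -> bool:
    # Step 1: smaller model confirms the use-after-free pattern.
    return json.loads(call_model("small", finding["snippet"]))["genuine_uaf"]

def analyze(finding: dict) -> dict:
    # Step 2: larger model only sees findings that survived triage.
    return json.loads(call_model("large", finding["snippet"]))

findings = [
    {"id": 1, "snippet": "free(sess); ... sess->state = X;"},
    {"id": 2, "snippet": "schedule_cleanup(sess);"},  # no direct free()
]

survivors = [f for f in findings if triage(f)]
reports = {f["id"]: analyze(f) for f in survivors}
print(sorted(reports))  # → [1]: only finding 1 reaches the analysis stage
```

The design choice worth noting is the funnel shape: the expensive model's token budget is spent only on findings the cheap model couldn't dismiss.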
  • Across both run types, it cost $1.75 for triage and $1.64 for analysis, with a total runtime of about 7 minutes. Running on GPT-5, it found the vulnerability 10 out of 10 times! They found the bug they were looking for, but they don't mention how many false positives came along with it. From reading the JSON output, it may have been zero!
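As a quick sanity check on the quoted figures, the two stages together come to about $3.39 per full run:

```python
# Total per-run cost from the triage and analysis figures above.
triage_cost = 1.75
analysis_cost = 1.64
total = triage_cost + analysis_cost
print(f"${total:.2f} per full run")  # → $3.39 per full run
```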
  • The use of LLMs is going to revolutionize many things. By providing the LLM with the proper data and a specific task (finding UAFs in the Linux kernel's SMB server code), it performed quite well. In the context of bug bounty, where you want to find a bug and don't need to find every bug, this seems very useful to me. The tool looks fantastic, and it's something I'll likely use in the future for my own needs.