Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Comprehension Debt: The Ticking Time Bomb of LLM-Generated Code - 1745

codemanship    Reference → Posted 5 Months Ago
  • Before making a change to legacy code, you must understand the code. This often requires understanding why it does the things it does, which may not be obvious 10+ years after the code was written. Even for the code behind this website, written by a single individual (me) 7 years ago, this can take a long time.
  • With LLMs, teams are writing code at an unprecedented rate. Some of them will review the code and make changes to what the LLM did, offsetting some of the downstream effort.
  • Others have gone with a different approach, though: checking in code that nobody has read and that has barely been tested. This means that teams are producing code faster than they can understand it - the author coins this "comprehension debt". If the software gets used, the odds are high that at some point the generated code will need to change.
  • Comprehension debt is the extra time it's going to take us to understand the code later. If you're working in somebody else's code, you always had to pay this cost; for your own code, it's a new tax that will slow you down. Not even LLMs can save you from the ever-growing mountain of "comprehension debt" in many companies.

SP1 and zkVMs: A Security Auditor's Guide - 1744

Kirk Baird - Sigma Prime    Reference → Posted 5 Months Ago
  • SP1 is a zero-knowledge virtual machine (zkVM) that enables developers to prove the execution of arbitrary programs compiled to RISC-V. In practice, most code that targets it is written in Rust. The ZK circuits enable devs to write standard Rust code to generate their cryptographic proofs, instead of using domain-specific languages. The goal of this post is to prime security auditors to review code that uses SP1.
  • The SP1 architecture is as follows:
    1. Compile the code into a RISC-V ELF binary.
    2. Execute the program in a zkVM. This will generate the STARK proof to be used later.
    3. Optimize and verify the proof. This is the mathematical verification that the code ran as intended.
  • The system consists of two components: the prover and the verifier. The prover executes the guest program and generates the ZK proof. The verifier takes in a proof and validates its cryptographic claims. The proof should come from the prover, but a malicious actor can submit whatever they want. If the verification succeeds, the claimed computation has occurred.
  • The system is separated into Host and Guest environments. The Host is a standard machine that executes code, such as the machine you're using to view this website. The Guest program runs inside a VM that is completely isolated from everything else: no Internet access, no databases, no nothing. In source code, host and guest logic are somewhat intertwined, which makes the distinction important to keep in mind while reading.
  • The first security note is that all input data is untrusted. If input is coming from the HOST to the GUEST, then the inputs must be validated. Range checks, length checks, business logic constraints, etc. should all be done. On this note, only GUEST Code is proven - not code running on the HOST. So, if there's a check in the HOST that's not in the GUEST, you probably have a bug.
  • SP1 uses 32-bit RISC-V. Code coming from or written for 64-bit systems can cause issues: integer truncation and overflow should always be checked when dealing with usize values. On top of this integration issue, many dependencies pulled into SP1-compiled code were never meant to run there. This can lead to similar integer issues, operating system calls, unsafe code, and many other weird quirks.
  • When using SP1, data can be committed to become a public output. Naturally, if we're doing zero-knowledge proofs, the public information should be carefully audited - disclosing someone's age, for instance, would be inappropriate. Another issue that is weird to me is verification key management. In SP1, each program generates two keys: one for the prover and another for the verifier. Each guest program must have a unique verification key derived from its binary, and older key versions must not be accepted.
  • There are cases where information cannot be computed within the proof and must instead be validated externally. For instance, a merkle proof can be generated whose validity depends on the block hash associated with it, so the block hash must be validated separately from the program. In SP1, you would commit it as a public output for external validation.
  • The most common vulnerability is around "Underconstrained circuits". This is simply the insufficient validation of state transitions in a program. This is basic logic validation like most other things. According to the post, practical knowledge of STARKs/SNARKs isn't necessary for auditing SP1 programs, unlike other cryptographic primitives.
  • A solid introduction to reviewing SP1 programs. I feel like this demystified a lot of terminology as well, which I really appreciated.
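
The 32-bit point above is easy to reproduce outside SP1. A minimal Python sketch of the usize mismatch (the helper names are mine, not SP1's):

```python
# Toy model of the usize mismatch: on the host, usize is 64-bit; inside
# the SP1 guest it is 32-bit. An unchecked cast silently keeps the low bits.
U32_MASK = 0xFFFF_FFFF

def as_guest_usize(host_value: int) -> int:
    """Model an unchecked 64-bit -> 32-bit cast (wraps like `as u32`)."""
    return host_value & U32_MASK

def checked_guest_usize(host_value: int) -> int:
    """Model a checked conversion: reject values that don't fit in 32 bits."""
    if host_value > U32_MASK:
        raise OverflowError(f"{host_value} does not fit in a 32-bit usize")
    return host_value

# A host-supplied length of 2**32 + 16 silently truncates to 16 -- a classic
# way for an attacker-controlled value to sail past a guest-side bounds check.
truncated = as_guest_usize(2**32 + 16)
```

Hence the advice to validate all host inputs inside the guest: the checked variant raises instead of wrapping.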

Hacking with AI SASTs: An overview of 'AI Security Engineers' / 'LLM Security Scanners' for Penetration Testers and Security Teams - 1743

Joshua Rogers    Reference → Posted 5 Months Ago
  • The author of this post was curious about the various AI-native security scanners. They wanted to find a product on the market that could identify vulnerabilities in code during a code review today. So, they tried numerous products, learned how they worked, and came up with this blog post. Surprisingly, AI security auditors are advertised everywhere but can actually be found nowhere.
  • All of the products tested had a very similar set of offerings: full-code, branch, and PR/R scans. ZeroPath has a SOC2 report generator. Some of them have hooks for things like GitHub Actions, bot guidance to developers, responses to PRs, and IDE plugins, naturally. Finally, they all support auto-fix/remediation guidance as well.
  • The first step is to ingest all the code and index it appropriately. Once the code is uploaded, the tool builds the context the LLM needs to scan and understand it. Extra context about the types of issues to find can be added for scans as well.
  • The next part is more of the "secret sauce". Asking an LLM to find all vulnerabilities won't be very helpful, so how does it find the particular code to focus on? The tool could ask for function-by-function or file-by-file analysis. Some use permissive CodeQL queries, opengrep, or other AST traversal of the application. Once it has a candidate vuln, it performs more detailed analysis to determine whether it's real.
  • The final stage involves reporting vulnerabilities, which includes detecting false positives and de-duplication. According to the author, the tools didn't report as many false positives as traditional SAST tools. Some were better or worse at specific languages; some were better at particular vulnerability classes.
  • Gecko and Amplify were very bad with no real bugs found. Almanax was very inconsistent - it would sometimes find basic bugs and other times it wouldn't. It was very good at very deliberate backdoored code though. Corgea found about 80% of purposely vulnerable code that was scanned. It had about a 50% false positive rate which isn't really that bad though. The language made a huge difference on the quality for this tool.
  • ZeroPath, according to the author, found 100% of the vulnerabilities in the Corpora. Additionally, it identified legitimate bugs in real-world codebases, including curl and sudo. Most of the real-world bugs weren't security issues, but bugs nonetheless. This was the best tool of the bunch.
  • Some takeaways:
    • The biggest benefit is around surfacing inconsistencies between developer intent and the actual implementation.
    • The tools were good at finding business logic issues.
    • They may replace pentesters in the long term, or at least supplement them. For things without millions of dollars on the line, they are already a good fit.
  • I really like the tone of the article and the perspective of seeing the AI as a helper. For instance, mentioning that while the AI does miss bugs, so do humans. The comparisons are realistic, which I appreciate. Good article!
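
The pipeline described above (ingest, candidate selection via AST tooling, then LLM triage) can be caricatured in a few lines. A toy Python sketch - all names and heuristics are invented here; real products use CodeQL/opengrep plus an LLM for the triage stage:

```python
import ast

def find_candidates(source: str) -> list[tuple[int, str]]:
    """Stage-2 caricature: AST traversal flags call sites worth a closer look."""
    SUSPICIOUS = {"eval", "exec", "system", "popen"}
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            # Handle both bare calls (eval(x)) and attribute calls (os.system(x)).
            name = getattr(node.func, "id", getattr(node.func, "attr", ""))
            if name in SUSPICIOUS:
                hits.append((node.lineno, name))
    return hits

def triage(candidate: tuple[int, str]) -> str:
    """Stage-3 stand-in: a real tool would hand the snippet plus surrounding
    context to an LLM and ask whether the finding is actually exploitable."""
    return f"line {candidate[0]}: review call to {candidate[1]}()"

SAMPLE = "import os\nos.system(user_input)\nprint('done')\n"
reports = [triage(c) for c in find_candidates(SAMPLE)]
```

The naive matcher stands in for the "find the particular code to focus on" step; the interesting engineering in the real products is everything around it.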

This House is Haunted: a decade old RCE in the AION client - 1742

himazawa    Reference → Posted 5 Months Ago
  • Massively Multiplayer Online video games are still huge. One of them, made in South Korea, is AION, the focus of this post. In the game, a player could purchase and customize a house. The Butler, who managed your house, allowed users to write custom scripts to play sounds and automate actions. Neat!
  • The scripting engine under the hood is some version of Lua. It runs in a sandbox with many functions stripped out. After some debugging, they were able to enumerate all of the available functions defined in _G.
  • After reviewing the list, they found several that were useful for code execution. load() and loadstring() are two easy ones. Using these functions, it's possible to load Lua bytecode that bypasses the bytecode verifier and causes memory corruption. Luckily enough, io wasn't disabled, which makes it trivial to spawn arbitrary processes: io.popen("calc.exe") is enough, for instance.
  • There are several mechanisms to make this "no-click" beyond the victim simply entering the house: OnInit(), for instance, runs whenever somebody enters. Interestingly enough, this gives you code execution on the user's client, not the game server. Still pretty neat!
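
The enumeration step translates to any embedded scripting sandbox. A Python analog of the _G trick (the sandbox table here is invented for the sketch; in the post it was Lua's stripped-down global table):

```python
# Toy sandbox: the "host" exposes a limited globals table to user scripts,
# analogous to AION stripping functions out of Lua's _G.
SANDBOX_GLOBALS = {
    "print": print,
    "len": len,
    "load": compile,  # oops: a code-loading primitive was left reachable
}

def enumerate_sandbox(env: dict) -> list[str]:
    """What the researcher did with _G: list everything still callable."""
    return sorted(name for name, value in env.items() if callable(value))

def find_escape_primitives(env: dict) -> list[str]:
    """Flag leftovers that load code or touch the OS (load/io in the post)."""
    dangerous = {"load", "loadstring", "io", "os", "popen", "compile"}
    return [name for name in enumerate_sandbox(env) if name in dangerous]
```

The lesson generalizes: audit the sandbox by enumerating what is reachable, not by trusting the list of what was supposedly removed.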

Taming 2,500 compiler warnings with CodeQL, an OpenVPN2 case study - 1741

Trail of Bits    Reference → Posted 5 Months Ago
  • The authors of this post were reviewing OpenVPN2 when faced with a difficult challenge: it had over 2.5K compiler warnings. Could some of these be security issues? Their goal was to narrow the warnings down to only the ones that matter, so they decided to tackle a single class of issue: numerical conversions.
  • C's relaxed type system allows for implicit numerical conversions. Not all conversions are security issues but some of them can be. Signedness, truncation and overflows are all issues that can arise from this. With this problem defined, they decided to build a CodeQL query to identify potentially problematic areas.
  • After performing all of this analysis, they determined that none of the conversions led to real issues. It's interesting to see the usage of more niche CodeQL queries to perform useful flow analysis. Good blog post!
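
The conversion classes above (signedness, truncation, overflow) are easy to demonstrate. A short Python sketch that mimics C's implicit conversion rules with ctypes (the scenarios are mine, not from the post):

```python
import ctypes

def to_unsigned(value: int) -> int:
    """Mimic C implicitly converting a signed int to a 32-bit unsigned
    type (e.g. passing an int where a uint32_t parameter is expected)."""
    return ctypes.c_uint32(value).value

def to_short(value: int) -> int:
    """Mimic the truncation that happens when an int is assigned to a short."""
    return ctypes.c_int16(value).value

# A guard like `if (len < limit)` performed on the signed side is useless
# once the value converts: -1 passes the check, then becomes ~4 GiB.
huge = to_unsigned(-1)
# And 65536 assigned to a 16-bit short silently becomes 0.
zero = to_short(65536)
```

This is exactly the kind of conversion a CodeQL query can flag at scale: find every implicit conversion, then filter to the ones where attacker-influenced data flows in.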

Introducing V12 Vulnerability Hunting Engine - 1740

Zellic    Reference → Posted 5 Months Ago
  • This blog post delves into the results of an autonomous Solidity auditor called "V12". It has a UI and is easy to interact with via a website. According to them, it performs at or above the level of junior auditors at some firms. It can find many basic programming mistakes, some even missed by various companies. It integrates with C4, Zellic/Zenith audits, a standalone application, and a GitHub Action.
  • The mission is sane - security is a continuous battle, not a commit hash in time, and products/services should reflect this. Naturally, this doesn't replace an auditing company, but it can help the service team in the long term. Finding even simple issues, like access control vulnerabilities, improves security as a whole.
  • I appreciate that they include an Evaluation section for bugs the tool has found. They show several vulnerabilities from previous hacks, such as the 1Inch bug, the MonoX hack, and a couple of others. The 1Inch bug is slightly deceptive - it was caused more by a scoping issue and had actually been found by auditors.
  • The tool has competed in several live Cantina/HackenProof auditing contests. I find these most impressive, since there was no "taint" potential on the model. These are unique vulnerabilities that others found in a contest.
  • They also list several historical contests, which could potentially be tainted in the data set. For proper evaluation, the training and test sets must be completely distinct. For the other contests they list, they claim V12 found enough bugs that it would have placed well in the competition - 2 out of 2 highs and 4 out of 6 issues are highlights from this section. I'm slightly skeptical about this; was there some tainting between the training and testing data sets? If the results are genuine, why didn't it perform as well in the live contests they posted?
  • They also use this on their live audits. Many of the bugs are fairly simple, such as access control issues, reentrancy, and bad error handling. They even mention this themselves, which is an interesting bit of self-awareness. All of these would work great in a CI setting and as an assistant to a security researcher. As LLMs get better, I think the remaining vulnerabilities will become harder and harder to find, but also more valuable.
  • Their perspective on who should use the tool is wise. V12 can enhance the capabilities of a great researcher but should only be used at the end. It's more of an additional layer of assurance and a source of inspiration than anything else. To inexperienced researchers, it's mostly a crutch. I'm curious to see how this plays out.

Wrong Offset: Bypassing Signature Verification in Relay - 1739

Felix Wilhelm    Reference → Posted 5 Months Ago
  • Relay is a cross-chain bridge on Solana. The original design had simple relayers, but the newer version introduces more smart contracts for managing funds. The idea is to transfer funds on one chain and receive funds on another, with orders fulfilled by LPs.
  • To initiate a transfer, users must create a transfer on the source chain. On the destination chain, a TransferRequest is signed by a privileged off-chain entity known as the allocator, which releases the funds to the user.
  • To perform signature validation, the native ed25519 program is used together with instruction introspection. The program first reads the index of the current instruction, then fetches the previous instruction to validate the signature. The native program's instruction data contains offsets describing exactly what data is being checked. When validating the instruction itself, the bridge checks that the program ID is correct and that the signature count is one.
  • The arbitrary offsets and indexes are a powerful feature of the Solana Ed25519 program. The offsets for validation are hardcoded into the relay bridge program, though. In practice, this means that we can specify the proper public key at the hardcoded offset, but then perform the validation at a different offset! By doing this, data can be signed with a different key but still be viewed as valid.
  • The bridge didn't have much in the way of funds at risk. Additionally, since this is a solver protocol and not an actual bridge, only in-flight funds were in the bridge at the time. Another great find by Felix in a major footgun area of the Solana ecosystem.
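
The offset confusion can be modeled with a toy in Python. Everything here is invented for illustration - the layout, helper names, and the hash-based "signature" stand-in; the real offsets live in the Solana ed25519 program's instruction data:

```python
import hashlib

ALLOCATOR_KEY = b"ALLOCATOR_PUBKEY_0123456789abcd!"  # 32 bytes, made up
HARDCODED_PUBKEY_OFFSET = 6  # where the bridge *assumes* the key lives

def fake_sign(key: bytes, msg: bytes) -> bytes:
    """Stand-in for ed25519 signing, just for the sketch."""
    return hashlib.sha256(key + msg).digest()

def ed25519_program_verify(data: bytes):
    """Model of the native program: header offsets say where the pubkey,
    message, and signature actually are; returns the message or None."""
    key_off, msg_off, sig_off = data[0], data[1], data[2]
    key = data[key_off:key_off + 32]
    sig = data[sig_off:sig_off + 32]
    msg = data[msg_off:sig_off] if msg_off < sig_off else b""
    return msg if fake_sign(key, msg) == sig else None

def bridge_accepts(data: bytes) -> bool:
    """Model of the bug: the bridge checks the key at a HARDCODED offset,
    while verification honours the attacker-controlled header offsets."""
    hardcoded = data[HARDCODED_PUBKEY_OFFSET:HARDCODED_PUBKEY_OFFSET + 32]
    return hardcoded == ALLOCATOR_KEY and ed25519_program_verify(data) is not None

# Attack: the allocator's key sits at the hardcoded offset, but the header
# points verification at the attacker's own key and signature instead.
attacker_key = b"attacker_key_0123456789abcdef01!"  # 32 bytes, made up
msg = b"pay me :)"
header = bytes([38, 70, 79])  # key/msg/sig offsets chosen by the attacker
forged = header + bytes(3) + ALLOCATOR_KEY + attacker_key + msg + fake_sign(attacker_key, msg)
```

The fix is to check the key at the same offsets the verification actually uses, not at a separately hardcoded position.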

FortMajeure: Authentication Bypass in FortiWeb (CVE-2025-52970) - 1738

https://pwner.gg    Reference → Posted 5 Months Ago
  • The session cookie in FortiWeb contains three parts:
    • Era: Type of session.
    • Payload: Encrypted data with session information, such as the username.
    • AuthHash: an HMAC-SHA1 over the ciphertext of the Payload above, using the same secret key that encrypts the Payload.
  • The C server selects the shared key from an array, indexed by the Era value. It decrypts the Payload using the key, then verifies the AuthHash using the key and the ciphertext. Pretty simple!
  • The vulnerability lies in the Era value, which should only be 0 or 1. There is no check on the value, however, which leads to an out-of-bounds access. Since the value directly indexes the key array, this is a significant issue: when Era is 2-9, the server reads uninitialized memory as the key! This removes nearly all entropy from the key space, allowing us to encrypt and sign the data ourselves.
  • To run this attack, the target user must have an active session running. Still, it's a pretty sick bug! It's not very often that a memory corruption bug leads to a cryptographic bypass. I believe that as binary exploitation prevention methods become more sophisticated, application-level attack methods will become more prevalent.
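
The cookie check and the missing bounds check can be sketched in Python. Key sizes, the table layout, and all names are assumptions for the sketch; FortiWeb's real implementation is native code:

```python
import hashlib
import hmac

# The server holds one key per "era"; only slots 0 and 1 are real.
KEY_TABLE = [b"era0-secret-key!", b"era1-secret-key!"]
# Stand-in for whatever memory happens to sit past the array -- in the
# real bug, low-entropy bytes the attacker can predict.
ADJACENT_MEMORY = [b"\x00" * 16] * 8

def key_for_era(era: int) -> bytes:
    """Model of the bug: no bounds check before indexing the key table."""
    flat_memory = KEY_TABLE + ADJACENT_MEMORY
    return flat_memory[era]

def auth_hash(era: int, payload_ciphertext: bytes) -> bytes:
    return hmac.new(key_for_era(era), payload_ciphertext, hashlib.sha1).digest()

def verify_cookie(era: int, payload_ciphertext: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(auth_hash(era, payload_ciphertext), tag)

# With Era in 2-9, the "secret" is predictable out-of-bounds memory, so
# an attacker can compute a valid AuthHash for any payload they like.
forged_tag = auth_hash(5, b"attacker-chosen-ciphertext")
```

A single missing `era <= 1` check turns the keyed MAC into one whose key the attacker effectively knows.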

Our plan for a more secure npm supply chain - 1737

Xavier René-Corail - GitHub    Reference → Posted 5 Months Ago
  • Recent attacks on the NPM ecosystem have scared the security industry: a single compromised package is enough to infiltrate all Fortune 500 companies at once. Several high-profile hacks of NPM maintainer accounts have led to the addition of post-install scripts that steal secrets. GitHub removed the compromised packages and now blocks the addition of new packages containing the attacks' IoCs.
  • So, what's the next step? We cannot let this keep happening. GitHub has decided to enforce 2FA for local publishing of NPM packages. Additionally, many aspects of token handling will change: classic tokens will be deprecated, TOTP 2FA will be deprecated, granular tokens will carry scoped permissions and short expirations, and the 2FA option for local package publishing will no longer be bypassable.
  • The next step is to make Trusted Publishing a more prominent feature, as described here. Instead of using tokens within the build pipeline, short-lived and tightly scoped OIDC identity tokens will be utilized. PyPI pioneered this approach, which has since been adopted by many other ecosystems.
  • Trusted Publishers ensures that a package is coming from a specific CI system, workflow and build pipeline, limiting the ability to publish to the package manager arbitrarily. This allows package repositories to function even with systems that have decentralized build pipelines. There is also the prevention of "Star Jacking" attacks, which can confuse users about the trustworthiness of a project.
  • A good callout from GitHub on how to secure these ecosystems. I love this article and it shows their proactive nature.

Threat Contained: marginfi Flash Loan Vulnerability - 1736

Felix Wilhelm    Reference → Posted 5 Months Ago
  • Flash loans, the borrowing of a large amount of money within a single transaction, work because they must be repaid with interest by the end of that transaction. In the EVM, this is possible because a smart contract can make an external call into a callback on a user-controlled contract.
  • In Solana, Cross-Program Invocations (CPIs) make this pattern hard to implement. Instead, instruction introspection is used: by examining the instructions scheduled to execute later in the transaction, the program can ensure the flash loan will indeed be repaid. The instruction sysvar account makes this possible.
  • MarginFi is a lending platform on Solana that adds some extra flexibility to this. The MarginfiAccount account is used to track users' assets and liabilities. It must remain healthy at all times, except when a flash loan has been created for it; in that case, the health check is skipped on the assumption that everything will be resolved by the end of the transaction.
  • When creating a flashloan, the lending_account_start_flashloan function will ensure that there's a following call to lending_account_end_flashloan. Using this call, a health check is performed to ensure that the funds from the flash loan have been returned.
  • Recently, a new instruction called transfer_to_new_account was created. This is for migrating the original MarginfiAccount to a fresh account and empties the original one. This call fails to ensure that we're not in the middle of a flash loan though!
  • This leads to the following attack:
    1. Create an account A with a flash loan.
    2. Borrow the maximum amount of funds for the flash loan.
    3. Call transfer_to_new_account to move the outstanding liabilities from A to a new account B.
    4. End the loan on account A. Since A no longer carries the liabilities, the health check in lending_account_end_flashloan passes.
    5. End the transaction and never repay the funds. Account B is now left holding the liabilities of a flash loan that was never paid back.
  • A solid find from Felix! A good lesson for me is that even small changes in functionality can have horrible consequences. The more functionality you have, the more these can interact and cause havoc with each other.
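
The five-step attack can be modeled as a toy ledger in Python. This is my sketch of the broken invariant, not marginfi's actual code:

```python
class Account:
    """Toy MarginfiAccount: liabilities plus an in-flashloan flag."""
    def __init__(self):
        self.liabilities = 0
        self.in_flashloan = False

    def health_check(self):
        # Health is only enforced outside of an active flash loan.
        if not self.in_flashloan and self.liabilities > 0:
            raise RuntimeError("unhealthy account")

def start_flashloan(acct: Account, amount: int):
    acct.in_flashloan = True
    acct.liabilities += amount   # borrowed funds must be repaid by the end

def end_flashloan(acct: Account):
    acct.in_flashloan = False
    acct.health_check()          # passes only if the debt was cleared

def transfer_to_new_account(old: Account, new: Account):
    # The buggy instruction: it migrates balances but never checks
    # old.in_flashloan, so the debt can be parked on a fresh account.
    new.liabilities, old.liabilities = old.liabilities, 0

# The attack: borrow on A, shunt the debt to B, then close the loan on A.
a, b = Account(), Account()
start_flashloan(a, 1_000_000)
transfer_to_new_account(a, b)
end_flashloan(a)                 # health check passes: A owes nothing
```

The fix is a one-line guard in `transfer_to_new_account` rejecting accounts with an active flash loan, which is exactly the kind of cross-feature interaction the closing bullet warns about.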