Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Claude Skill for Solidity Smart Contract Vulnerabilities - 1895

kadenzipfel    Reference → Posted 1 Month Ago
  • The repository contains a set of Claude Skills for Solidity smart contract vulnerabilities, ranging from tx.origin-based authorization to more nuanced, context-dependent issues like access control checks. Many of these are already caught by Slither, but it's the new ones that are interesting to me. Cheatsheet
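The tx.origin pitfall those skills cover is easy to sketch as a toy model. This Python mock-up is purely illustrative (the real checks live in Solidity): tx.origin names the EOA that started the whole call chain, while msg.sender is the immediate caller, so a `tx.origin == owner` check passes even when a malicious contract sits in the middle.

```python
# Toy model (Python, for illustration only) of the classic tx.origin pitfall.
OWNER = "alice_eoa"

def wallet_withdraw(tx_origin: str, msg_sender: str) -> str:
    # Vulnerable check: trusts whoever ORIGINATED the transaction,
    # not the immediate caller.
    if tx_origin != OWNER:
        raise PermissionError("not owner")
    return "funds released"

# Alice is phished into calling a malicious contract, which then calls the
# wallet. msg.sender is the attacker contract, but tx.origin is still Alice,
# so the vulnerable check passes.
print(wallet_withdraw(tx_origin="alice_eoa", msg_sender="evil_contract"))
# → funds released
```

Checking `msg_sender` instead would block the middleman, which is exactly the fix the skill recommends flagging.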

Improving UserOperation Execution Safety in EntryPoint v0.9 - 1894

ERC4337    Reference → Posted 1 Month Ago
  • The ERC4337 (Account Abstraction) implementation assumes that a UserOperation will execute exactly as the intended user submitted it, in particular, sent directly to the contract on the blockchain from an EOA. In reality, the transaction does NOT have to run in an isolated context.
  • Reentrancy guards and flash loans are great examples of this: the state of an executing contract can be modified prior to execution of the UserOperation. In both cases, it would be possible to force the transaction to fail, for instance by triggering the reentrancy guard, griefing users for the gas they spent.
  • These UserOperations can be observed in the public transaction mempool or in the off-chain, gossip-based ERC4337-specific mempool; both are perfectly viable ways to front-run these calls.
  • Simple transfers via UserOperations are not affected, but more complex contracts, such as flash loan protocols and those with reentrancy guards, would have been. The discoverers of the vulnerability, from TrustSecurity, received a $50K bounty, the top of the high-severity category in the program. It was a unique issue identified through a deep understanding of the ERC's context. Good report!
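The griefing vector can be sketched in a few lines. This is a Python toy model, not EntryPoint or account code; the `Target` class and its guard flag are illustrative stand-ins for a contract whose shared state a front-runner can flip before the UserOperation lands.

```python
# Toy model of reentrancy-guard griefing: an attacker drives the contract
# into its locked state before the user's UserOperation executes, so the
# user's call reverts and they eat the gas cost.

class Target:
    def __init__(self):
        self.locked = False

    def guarded_action(self):
        if self.locked:
            raise RuntimeError("ReentrancyGuard: reentrant call")
        self.locked = True
        # ... real work would happen here ...
        self.locked = False

target = Target()

# Front-runner's move: put the contract into its guarded state first
# (in practice, via a call that holds the guard while the op executes).
target.locked = True

try:
    target.guarded_action()    # the user's UserOperation
except RuntimeError as e:
    print(f"UserOperation reverted: {e}")  # user still paid for gas
```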

Starknet Incident Report – January 5, 2026 - 1893

Starknet    Reference → Posted 1 Month Ago
  • Starknet is an L2 that utilizes a ZK prover. The Blockifier is the component that creates the blocks and proofs. I imagine that they have a centralized sequencer, but I'm not sure. Recently, they experienced an outage during which new blocks could not be built.
  • The Blockifier had an issue with a complicated set of contract calls:
    1. F1 calls F2 and F2 calls F3, where F2 and F3 are the same contract.
    2. F3 changes a variable in storage.
    3. After F3 has finished, F2 changes the same variable from step 2.
    4. F2 panics. F1 catches the revert and continues execution.
  • In this case, the variable should have reverted to its original value from before F2 was called at all. In reality, the value from F3 was used! Since this bug was confined to block production, the ZK prover still got it right, so no stale writes after reverts made it into the execution layer. Even though this led to an outage, it's cool to see the prover do its job.
  • To ensure this doesn't happen again, they have some new internal initiatives. They are re-architecting prover-compliant execution to run directly after transaction execution; if the two don't match, an auto-halt will occur. Although a crash is bad, it's better than a deep reorg. They will also add better fuzzing, though in reality I doubt fuzzing would have found this bug.
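The revert-semantics bug boils down to rolling back to the wrong snapshot. This Python toy (my own model, not Blockifier code) contrasts the correct rollback point, taken before F2 was called, with the buggy one, taken after F3's inner write:

```python
# Toy model of the Blockifier revert bug: after F2 panics, storage should
# roll back to the value it held before F2 was ever called, not to the
# value written by the inner call F3.

class Storage:
    def __init__(self):
        self.slots = {"x": 1}      # original value, before F2 runs

    def snapshot(self):
        return dict(self.slots)

    def restore(self, snap):
        self.slots = snap

def run(buggy: bool) -> int:
    s = Storage()
    pre_f2 = s.snapshot()          # taken when F1 calls F2
    s.slots["x"] = 3               # F3 writes (inner call completes)
    post_f3 = s.snapshot()         # the WRONG rollback point
    s.slots["x"] = 5               # F2 writes the same slot...
    # ...then F2 panics; F1 catches the revert and continues.
    s.restore(post_f3 if buggy else pre_f2)
    return s.slots["x"]

print(run(buggy=False))  # 1 — correct: original value restored
print(run(buggy=True))   # 3 — bug: F3's write survives the revert
```

The prover, computing the correct semantics independently, refused to prove blocks built with the value 3, hence the halt rather than a bad state commitment.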

Ticket Tricking OpenSSL.org with Google Groups - 1892

Space Raccoon    Reference → Posted 1 Month Ago
  • Ticket Tricking is a technique to get OTPs or verification emails sent to a public forum so that you can "prove" you have access to a domain when you really don't. Google Groups have this risk and are the focus of this post.
  • The author of the post found a tool for scraping Google Groups. Unfortunately, it was somewhat outdated and only looked for a single hard-coded group. So, they wrote a vibe-coded application to find Google Group URLs, filter them, and check for public read access. After scanning 32K raw URLs, they were left with 150+ publicly readable groups.
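The filtering step might look something like the sketch below. The marker strings and heuristic are my assumptions for illustration; the post doesn't publish the scanner's internals.

```python
# Hypothetical sketch of the public-readability check: a group page that
# renders without bouncing to a login wall (and doesn't demand membership)
# is likely world-readable. Marker strings are illustrative assumptions.
import urllib.request

LOGIN_MARKERS = ("accounts.google.com/ServiceLogin", "You must be a member")

def looks_publicly_readable(html: str) -> bool:
    return not any(marker in html for marker in LOGIN_MARKERS)

def check_group(url: str) -> bool:
    with urllib.request.urlopen(url, timeout=10) as resp:
        return looks_publicly_readable(resp.read().decode("utf-8", "replace"))
```

Run across the candidate URL list, a check like this is what whittles 32K raw URLs down to the 150+ readable groups.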
  • One of the vulnerable instances was the OpenSSL.org Slack group. The author logged in to the group using the OTP leaked on the forum. The implications are serious: many applications have patched vulnerable-by-default mechanisms (Slack evidently excepted), but GitHub email verification, auto-join SaaS tenants, and many other things are still vulnerable. Good post!

Architecting Security for Agentic Capabilities in Chrome - 1891

Nathan Parker - Chrome Security    Reference → Posted 1 Month Ago
  • Agentic browsing appears to be the future of Chrome and other web browsers. Unlike other types of attacks, prompt injection is not something that can be fully "solved" in the traditional sense. This article details how Chrome is attempting to prevent indirect prompt injection from hijacking the user's browser. After reviewing Gemini's built-in protections and other agent security principles, they introduce a new feature called the user alignment critic, plus better origin isolation.
  • The main planning model in Gemini uses page content in Chrome to determine the next action. Naturally, this is a great place for prompt injection because it may contain attacker-controlled content. They use spotlighting and train Gemini against attacks, but this still isn't enough.
  • The user alignment critic is a separate model that evaluates the output of each action, checking that it serves the user's end goal. So, if the user is trying to view a store's address and the planning model attempts to initiate a bank transfer, that will obviously be rejected. The critic model is only allowed to see metadata about the result, never any unfiltered content, which in practice makes it immune to prompt injection. This helps prevent both goal hijacking and data exfiltration.
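The critic pattern can be sketched as follows. Every name here is a hypothetical stand-in: the real critic is a model inside Chrome, not a rule function, but the structural point, that it judges actions against the goal using only metadata, carries over.

```python
# Hypothetical sketch of the alignment-critic pattern: the critic sees only
# structured metadata about each proposed action — never raw page content —
# so injected text on a page can't reach it.
from dataclasses import dataclass

@dataclass
class ProposedAction:
    kind: str        # e.g. "navigate", "click", "submit_form"
    target: str      # destination origin or element identifier

def critic_allows(user_goal: str, action: ProposedAction) -> bool:
    # A real critic is a trained model; this stand-in encodes the post's
    # example: a bank transfer can never serve a "find the address" goal.
    if action.kind == "submit_form" and "bank" in action.target:
        return "transfer" in user_goal or "payment" in user_goal
    return True

goal = "find this store's street address"
print(critic_allows(goal, ProposedAction("navigate", "maps.example.com")))     # True
print(critic_allows(goal, ProposedAction("submit_form", "bank.example.com")))  # False
```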
  • The next protection is around site isolation. Agents can operate across websites, which violates this key principle. So, a prompt injection from site A could compromise site B. To address this, they are adding Agent Origin Sets, which limit the domains an Agent can access to those strictly required for the task.
  • For each task, a gating function decides whether the domains requested by the planner are relevant to the task. The design has two types: read-only origins and read-write origins. As with the alignment critic, the gating functions are not exposed to prompt-injection risks. Users can also add origins as needed to complete the task.
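A minimal sketch of how a per-task origin set with read-only and read-write tiers might behave. The class name and API are my assumptions, not Chrome internals; the point is that a prompt injection on site A cannot add site B to the set, only the user (or the content-blind gating function) can.

```python
# Hypothetical sketch of Agent Origin Sets: a per-task allowlist split into
# read-only and read-write origins, never extended by page content.

class OriginSet:
    def __init__(self, read_only: set[str], read_write: set[str]):
        self.read_only = read_only
        self.read_write = read_write

    def permits(self, origin: str, write: bool) -> bool:
        if origin in self.read_write:
            return True
        # Read-only origins may be fetched but never written to.
        return not write and origin in self.read_only

    def user_adds(self, origin: str, write: bool = False) -> None:
        # Only an explicit user action extends the set mid-task.
        (self.read_write if write else self.read_only).add(origin)

task = OriginSet(read_only={"reviews.example.com"},
                 read_write={"shop.example.com"})
print(task.permits("shop.example.com", write=True))     # True
print(task.permits("reviews.example.com", write=True))  # False
print(task.permits("evil.example.net", write=False))    # False
```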
  • Part of the security responsibility belongs to the user: if you give a bot access to your bank and it steals your money, that's on you. The origins being used still need to be verified by the user. Some domains require explicit approval, such as banks and Google Password Manager, while others only require the gating function's permission.
  • On the reactive side, they have realtime scanning of pages to detect prompt injection attacks: an additional classifier flags injections and rejects the page when one is detected. They even have persistent red-team bots that try to derail the agentic browser.
  • This article is great and echoes a key principle: design with security in mind. With site isolation and the built-in alignment critic, derailing the agent into malicious actions becomes much harder. Great post!

A White Mage’s Guide to Web3 Bug Hunting - 1890

WhiteHatMage    Reference → Posted 1 Month Ago
  • WhiteHatMage was in the top 3 on both Immunefi and HackenProof for web3 bug bounties last year. This post explains how they identify projects and the realities of finding vulnerabilities in live projects with impact.
  • What makes vulnerabilities more likely? First, bugs hide within complexity: most serious issues they find are simple mistakes buried in layered, complicated systems, and many fixes are just one-line changes. Next, innovation creates space for new bugs. When projects adopt a new approach, they are unlikely to reason about the attack paths correctly, and implementation-level innovation matters too: new implementation experiments bring subtle bugs. As ecosystems mature, there are fewer bugs; new chains, uncommon languages, etc. tend to have more basic issues simply because fewer people have looked at them.
  • Optimizations are at the root of a lot of evil. Heavy assembly usage, manual memory management, rewritten math expressions... optimizations often obscure edge cases that developers did not anticipate. Next, code quality tells a story. When developers rush a feature or lack attention to detail, bugs are more common, and poor code is hard to secure. Even non-functional issues, such as sloppy comments, are noteworthy. Ignoring best practices and missing basic security patterns like CEI (checks-effects-interactions) are all warning signs. Projects with poor code quality are high-risk for vulnerabilities.
  • Audit reports are useful context. Multiple critical findings are a serious red flag, since every fix carries the risk of introducing another issue or being an improper patch. Depending on the quality of the codebase and the auditing company, they look for different types of issues.
    • Good codebase with good audits. Novel or very complex bugs.
    • Good codebase with average audits. Complex paths and known security pitfalls.
    • Average codebase with good audits. Review audit fixes and leftover weak design.
    • Average codebase with average audits. Missed but not extremely complex exploit paths.
  • Being first is super important. Right after a big launch, there's a large chance of vulnerabilities, so the author will often speedrun basic security checks on a project when they hear about a launch. In the first few weeks, more complex attack paths may be discovered that auditors didn't identify. When a project first launches its bug bounty program, the competition is intense, so they tend to check early and then come back for a deeper pass later.
  • The approach for finding critical bugs is very similar to my process. They focus only on critical paths; they ask themselves which invariants must hold for this to be secure. Over time, this builds intuition. After a while, they come back to a codebase. This is beneficial because they may have new techniques or the system may have changed. It's important to note that time is limited, so every decision matters.
  • They add a list of bug bounty archetypes:
    • The Digger. Goes super deep into a single program.
    • The Differ. Compares one mechanism across many different projects.
    • The Speedrunner. Reviews new programs quickly.
    • The Watchman. Monitors deployments and upgrades.
    • The Lead Hunter. Develops ideas around lesser-known attack vectors.
    • The Scavenger. Inspired by obscure writeups or little-known incidents.
    • The Scientist. Builds major tooling for analysis.
  • When choosing a bounty program, they also consider the project's own reputation. If a project is well-known for lowballing or not paying, it's not worth your time. Do they have the money to pay you in the first place? Are the rewards and scope clear? Do they take security seriously via audits, or is it just a checkbox? Once they find a single bug, they report it and see how the process goes; only after this do they look for more. For them, red flags are vague rules, low caps, prior disputes, or a lack of response.
  • A fantastic article from a fantastic security researcher. Thanks for taking the time to write up!

What AI Security Research Looks Like When It Works - 1889

Stanislav Fort    Reference → Posted 1 Month Ago
  • Aisle, the company behind this post, builds an AI security tool. Recently, Anthropic reported finding 500 vulnerabilities across various products. There's a problem, though: they don't discuss the severity breakdown, target selection, or maintainer response at all. At Aisle, they test ONLY against the most secure software projects, with no retrospective comparisons.
  • The Aisle tool recently found twelve new vulnerabilities in OpenSSL. One of these was a buffer overflow in the CMS message parsing that could have been remotely exploitable without valid key material, with a rating of 9.8 out of 10. In five of the twelve cases, the AI system even proposed the fix.
  • Daniel Stenberg, the creator of curl, recently closed curl's bug bounty program due to LLM spam. He noted that AI can be effective for open-source security when used responsibly, an interesting perspective given his history with slop on his own bug bounty program. Aisle previously identified three vulnerabilities in curl, which were reported and fixed.
  • A great quote: "There's a temptation in this space to lead with big numbers. Five hundred vulnerabilities sounds impressive. But the number that actually matters is how many of those findings made the software more secure." The failure mode is now drowning maintainers in noise and declaring victory rather than actually improving the security posture. AI is collapsing the median via slop and raising the ceiling; it just depends on what side you're on.
  • Aisle has a PR review tool that appears to routinely find bugs; Daniel Stenberg even uses it on his own pull requests. It found a buffer overflow in a curl PR recently, as well as two UAFs in OpenSSL changes. The goal is to prevent vulnerabilities before they ever ship. Good report on what good AI security looks like!

Median time-past as endpoint for lock-time calculations (BIP 113) - 1888

BIPS    Reference → Posted 1 Month Ago
  • Bitcoin transactions can include a lock time, meaning they cannot be mined until a given time or block height. Bitcoin blocks, for some reason, are not required to have monotonically increasing timestamps.
  • This creates bad incentives: miners want to make money, while the user doesn't want their transaction included yet. By creating a block with a timestamp far in the future, a miner can get other clients to accept a time-locked transaction as valid, regardless of the actual wall-clock time.
  • The BIP proposes using the median timestamp of the last 11 blocks to determine whether a transaction can be spent. Since it's a median, a single outlier timestamp can't skew the result. This is important for the CHECKLOCKTIMEVERIFY opcode. Overall, a good fix to a bizarre issue affecting Bitcoin.
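The median time-past rule is simple enough to sketch directly; per BIP 113, the lock-time comparison uses the median of the previous 11 block timestamps, so one miner's far-future timestamp cannot unlock transactions early:

```python
# Sketch of BIP 113's median time-past (MTP): lock-time checks compare
# against the median of the previous 11 block timestamps instead of the
# tip's own (miner-chosen) timestamp.

def median_time_past(last_11_timestamps: list[int]) -> int:
    ts = sorted(last_11_timestamps)
    return ts[len(ts) // 2]   # middle element of the sorted 11

# Ten honest ~10-minute blocks plus one far-future outlier:
honest = [1_700_000_000 + 600 * i for i in range(10)]
outlier = honest + [1_800_000_000]        # bogus timestamp, years ahead
print(median_time_past(outlier))          # still an honest mid-range value
```

A single miner would need to corrupt a majority of the last 11 timestamps to move the median meaningfully, which is what kills the incentive problem.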

A successful DOUBLE SPEND US$10000 against OKPAY in 2013 - 1887

Bitcoin Forums    Reference → Posted 1 Month Ago
  • In March of 2013, an unexpected Bitcoin fork occurred, as documented in BIP 50. This was because a block with many transactions was mined. Bitcoin 0.8 nodes could process it, but pre-0.8 nodes could not. This caused a fork because pre-0.8 Bitcoin nodes accounted for about 60% of the mining power.
  • Version 0.8 switched from BerkeleyDB to LevelDB. BerkeleyDB had a limit on the number of transactions that could be in a block due to DB locks; this had unintentionally become a consensus rule on the network, and the move off BerkeleyDB removed it.
  • A user deposited $10K in BTC to OKPAY, and the deposit was included on the 0.8 fork. After some analysis, they realized that the TX had never confirmed on the 0.7 fork. They then created two conflicting transactions spending the same coins and broadcast them on the pre-0.8 fork.
  • It's a double spend because the payment provider was following one fork while the conflicting spends confirmed on the other. In reality, once the fork was detected, the payment processor should have stopped accepting Bitcoin transactions until the issue was resolved. Overall, a really interesting case of a double spend leading to stolen funds.

On the clock: Escaping VMware Workstation at Pwn2Own Berlin 2025 - 1886

synacktiv    Reference → Posted 1 Month Ago
  • The authors competed at Pwn2Own Berlin 2025 in the VMware Workstation category. The vulnerability exists within the PVSCSI (Paravirtualized SCSI) controller emulation code, which is responsible for handling SCSI commands and forwarding them to the proper device on the host. The guest OS splits the data into variable-sized chunks, each specifying a guest physical address to use.
  • The code copies entries from the guest driver into an internal array, then compacts it by combining adjacent entries. To begin with, it has 512 segments, totaling 0x2000 bytes. If there are more than 512 entries, it allocates a 0x4000-byte buffer to store all entries and reallocates it for each newly added entry. The intended design is to double the size of the internal buffer whenever it needs to grow; the vulnerability is that the allocation size is statically set to 0x4000 instead of doubling each time. This leads to a very large out-of-bounds heap write: with more than 1024 entries, every additional entry is an OOB write.
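The growth bug can be modeled arithmetically. This Python sketch is a toy model of the allocation logic, not the actual PVSCSI code; the 16-byte entry size is inferred from the stated 512 entries / 0x2000 bytes, and it shows how the fixed 0x4000 allocation caps capacity at 1024 entries:

```python
# Toy model of the PVSCSI growth bug: the grow path is supposed to double
# capacity, but the buggy version always allocates 0x4000 bytes. At 16
# bytes per segment entry that holds 1024 entries, so everything past
# entry 1024 is written out of bounds.

ENTRY_SIZE = 0x10              # 512 entries * 0x10 = 0x2000 initial bytes

def buffer_size(num_entries: int, buggy: bool) -> int:
    size = 0x2000              # inline array: 512 entries
    while size < num_entries * ENTRY_SIZE:
        size = 0x4000 if buggy else size * 2
        if buggy:
            break              # never grows past 0x4000
    return size

def oob_bytes(num_entries: int) -> int:
    needed = num_entries * ENTRY_SIZE
    return max(0, needed - buffer_size(num_entries, buggy=True))

print(oob_bytes(1024))   # 0      — still fits in the 0x4000 buffer
print(oob_bytes(2048))   # 16384  — 0x4000 bytes written past the buffer
```

A guest supplying thousands of segment entries thus gets a large, content-controlled write off the end of a 0x4000 heap chunk, which is the primitive the rest of the exploit builds on.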
  • The Windows 11 Low Fragmentation Heap (LFH) is where this chunk is placed. Typically, the strategy is to target different size classes to shift allocation to a less hardened allocator, but that's not possible here. Notably, the LFH heap has strict checks on chunk metadata and shuffles around allocations.
  • To exploit this vulnerability, they needed to find an object of size 0x4000 that can be directly allocated from the guest. They ended up using shaders to spray the heap since they can be freed, kept alive, or created at will. URB objects have a length field that is used when writing to host memory directly, making them a great primitive for memory corruption.
  • Exploitation required a great deal of knowledge about the heap algorithm. They first filled two buckets (B1 and B2) with 16 shaders each. After this, they freed all but one shader in B1 to create a hole and allocated 15 URBs around it. Finally, a hole is created in B2 and the setup is complete: the allocator bounces between the two available slots in the two buckets. B2 absorbs the bad write so that the heap-chunk metadata of another object isn't corrupted, while B1 holds an object that can now be corrupted safely. This circumvents the mitigations and allows controlled corruption of adjacent heap objects.
  • This bug can be used to leak ASLR. Once ASLR is leaked, a fake URB structure can be created to cause havoc. For an arbitrary read, overwrite the URB.data_ptr. For an arbitrary write, corrupt URB.pipe and use a writeback mechanism to write those bytes. From there, they corrupted a callback function on a USB pipe object structure to call WinExec() because it's a CFG-whitelisted gadget.
  • The exploit was unreliable because it assumed knowledge of the heap state at startup. They used some tricks to make it more predictable: their insight was that an allocation that creates a new bucket takes measurably longer, which they used as a timing side channel to infer the current LFH state. Luckily for them, it worked first try during the contest.
  • They conducted this research over three months of evenings and weekends. The first month was spent on reverse engineering and identifying the vulnerability; exploitation took the remaining two months because of the LFH mitigations. Overall, a good post on the discovery and exploitation of a bug to win some money at Pwn2Own!