Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

High Risk Bug Disclosure: Across Bridge Double-Spend - 1415

iosiro - Jason Matthyser    Reference → Posted 1 Year Ago
  • Across protocol allows users to bridge funds between various EVM chains very fast - faster than finality. There are two main actors. First, the relayer, who has funds on all chains. Second, the dataworker, who handles slow relays and always has enough liquidity.
  • Relayers watch for deposit transactions to occur within the EVM smart contract. If a transfer is profitable and they have the liquidity on the other chain, they perform the transfer for the user. If it's going too slowly (aka it's not profitable), the user can increase the fee. If it's profitable but no relayer has the funds, then fillRelay() is called and the dataworker handles it.
  • There are two types of events being used: deposit and fill. A deposit is emitted when the user locks funds on the source chain, and a fill is emitted when a relayer (or the dataworker, for slow relays) completes the transfer on the destination chain. Being able to tie a fill to a deposit is important for ensuring that double spends don't occur - both for the on-chain and off-chain infrastructure.
  • There needs to be some fairly complex logic to ensure that a deposit isn't filled twice. Onchain, a hash of the deposit is stored in order to track it. Offchain, the function validateFillForDeposit() filters all recent fills to find the matching deposit.
  • The goal is to trick either the dataworker or the relayer into processing the event when it should not. Within the relayer code, the function getValidUnfilledAmountForDeposit() obtains the previous fills for the deposit against depositsWithBlockNumbers(). Additionally, there is a function that handles updates (speed-ups) made to the transfer.
  • For a sped-up deposit, the relayerFeePct field was updated within the local object. Since the hashes of the original object and the updated object differed, existing fills no longer matched the deposit and it looked unfilled! The portion of the code tying fills to deposits was broken.
  • To exploit this, the following steps need to be done:
    1. Perform a transfer from chain A to chain B.
    2. Trigger a slow relay manually. This is to A) get the transfer into a different state and B) get the relayers to stop looking at it.
    3. Update the relayerFeePct on the source chain. This causes the relayer to no longer see the deposit as a slow relay.
    4. Transfers from both the relayer and the dataworker are made. The hash differs from the original on the chain for both TXs, so both go through - we steal funds!
  • To fix the issue, client-side properties are now checked to ensure that clients can tell whether a deposit has been filled, regardless of the state of relayerFeePct. Personally, I don't like the client-side fix very much; I feel like doing something with the hash would make more sense. Unfortunately, there are times where hashing too many things is just as bad as hashing too few.
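A rough Python sketch of the broken matching as I understand it - the field names and the hashing scheme here are made up for illustration, not Across's actual code:

```python
import hashlib
import json

def deposit_hash(deposit: dict) -> str:
    # Hash EVERY field of the deposit object, including mutable ones.
    return hashlib.sha256(json.dumps(deposit, sort_keys=True).encode()).hexdigest()

# Deposit as it looked when the slow-relay fill was recorded.
original = {"depositId": 7, "amount": 1000, "relayerFeePct": 1}
filled = {deposit_hash(original)}          # fills indexed by deposit hash

# The user "speeds up" the deposit: only relayerFeePct changes...
sped_up = dict(original, relayerFeePct=2)

# ...so the lookup by hash misses the existing fill and the deposit
# looks unfilled - it can be filled a second time.
assert deposit_hash(sped_up) not in filled
```

Hashing only the immutable fields (or keying fills by depositId) would keep the association intact across speed-ups, which is roughly the "do something with the hash" fix I'd have preferred.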

The Graph Rounding Error Bugfix Review - 1414

@GregadETH - Immunefi    Reference → Posted 1 Year Ago
  • The Graph is a decentralized indexing protocol. Developers can access and query data across different blockchains using web2 APIs. Many projects use this for UIs but also for backend services, so it falls into the blockchain infrastructure category. To pay for using the service, there is a token - and the token logic is where the TWO bugs are.
  • A subgraph API is curated by the creator of the content. To create one, a user needs to stake tokens and pay a 1% curation tax in GRT to the platform. When calling mint(), the tax amount is rounded down: with a 1% tax, sending in 99 tokens results in 0 tax tokens, instead of 1, being collected.
  • To me, this feels fairly insignificant because the cost of gas would be much higher. However, the article claims it is viable since this was deployed on Avalanche (a low-gas-cost EVM chain). To be honest, this felt sorta hand-wavy but I'll live with it. By batching deposits of 99 tokens per call in a contract, the tax avoided outweighs the cost of gas. This steals revenue from the protocol, which is bad.
  • When a node operator for the Graph wants to provide services to earn rewards, they stake their GRT tokens for some period of time. When unstaking, there is a thawing period in order to ensure that bad indexers are penalized for their actions.
  • When calling unstake(), there's a calculation error that allows a user to bypass the lock duration. A weighted-average function computes the new unlock time from the tokens already unstaking and the amount of tokens newly being unstaked, returning the time at which the tokens can be withdrawn.
  • This function is vulnerable to a rounding issue when currentUnstakedTokens is more than 201600 times larger than newUnstakedTokens. When this happens, the new lockedUntil value equals the previous time! Using the same strategy as before, an attacker can unstake small batches of tokens at a time to avoid the locking period.
  • Rounding bugs are very unintuitive issues for me. The first one makes sense but required a specific blockchain to be viable. For the second one, I stared at the function for a while and played with some numbers until I understood it. It seems that with small numbers, the rounding is bad!
  • I had two big takeaways. First, if the protocol is deployed on a cheap-fee blockchain, then small rounding errors become a bigger deal. Second, a heuristic for checking these in my head: start by checking whether the rounding direction is good or bad for the protocol. If it's bad, review the impact, then play with the numbers to understand it. Good review!
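Playing with the numbers is exactly what helped me here, so here are both bugs as simplified Python sketches. The formulas are my own reconstruction from the writeup's description, not The Graph's actual contract code:

```python
THAWING_PERIOD = 201_600   # lock length in blocks, per the writeup

def curation_tax(amount: int, tax_pct: int = 1) -> int:
    return amount * tax_pct // 100   # integer division rounds down, like Solidity

assert curation_tax(99) == 0    # Bug 1: batches of 99 tokens pay no tax
assert curation_tax(100) == 1

def new_locked_until(now: int, current_tokens: int, locked_until: int,
                     new_tokens: int) -> int:
    # Weighted average of the old remaining lock and a fresh full lock.
    remaining = locked_until - now
    period = (current_tokens * remaining + new_tokens * THAWING_PERIOD) \
             // (current_tokens + new_tokens)
    return now + period

now = 1_000_000
# Fresh stake: a full thawing period, as intended.
assert new_locked_until(now, 0, now, 1_000) == now + THAWING_PERIOD
# Bug 2: once current_tokens exceeds 201600 * new_tokens, the division
# rounds the average down to the old remaining time - no added lock.
assert new_locked_until(now, 201_600_000, now + 1, 1_000) == now + 1
```

Running the rounding direction check from my heuristic: both divisions round in the user's favor, against the protocol, which is the red flag.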

Sonne Hack - 1413

Daniel Von Fange    Reference → Posted 1 Year Ago
  • Compound and AAVE both have a bug that allows the entire protocol to be drained IF there's an empty market open. Apparently, this has destroyed a large number of forks.
  • Sonne was aware of this issue and had a mitigation strategy. First, use a timelock to add the market. Second, add the funds. Finally, have the timelock open up the market for use. If followed in this order, it would be totally fine.
  • However, Sonne queued all of the multisig operations as separate operations in the timelock. Since no ordering was enforced between them, this was a problem: anybody could come along and execute them in any order they wanted.
  • The attacker executed the TWO timelock operations without the funds being added in between. With the market open but empty, the Compound/AAVE bug could be exploited once again, as before.
  • What should have been done better? Governance actions that must happen in a certain order need restrictions enforcing that ordering. For OpenZeppelin's timelock, scheduleBatch() can be used. Overall, an interesting hack for 20M!
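A toy model of the ordering problem - the operation names and timelock shape are mine, not Sonne's contracts. Independently queued operations can be executed by anyone, in any order, with nothing in between:

```python
class Timelock:
    """Minimal timelock: queued ops are executable by anyone once ready."""
    def __init__(self):
        self.queued = set()
        self.log = []

    def schedule(self, op: str):
        self.queued.add(op)

    def execute(self, op: str):
        # No dependency tracking between operations - that's the flaw.
        assert op in self.queued, "not queued"
        self.queued.remove(op)
        self.log.append(op)

tl = Timelock()
# Sonne queued the steps as separate, independent operations...
tl.schedule("add_market")
tl.schedule("open_market")

# ...so an attacker can execute them back-to-back, skipping the
# "seed the market with funds" step meant to happen in between.
tl.execute("add_market")
tl.execute("open_market")
assert tl.log == ["add_market", "open_market"]   # empty market is now live
```

A batched schedule (one operation containing both calls, with funding guaranteed first) removes the attacker's freedom to interleave, which is what scheduleBatch() buys you.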

CVE-2024-21115: An Oracle VirtualBox LPE Used to Win Pwn2Own - 1412

Cody Gallagher - ZDI    Reference → Posted 1 Year Ago
  • Within the VGAState struct of VirtualBox there is a bitmap used for tracking dirty pages of the VRAM buffer. The bitmap is sized for the maximum VRAM allowed by vbox, 256MB. When clearing the dirty bits, the start_addr is incorrectly multiplied by 4! If the address is larger than 64MB, this leads to an out-of-bounds access.
  • What primitive does this give us? A heap-based bit clear. That doesn't seem like a lot, and the location seems inconsistent. To trigger the bug, they set a bunch of settings via the ioport communication. How is this exploitable?
  • Within VGAState, there is a member called CritSect. This is a critical section that can only be held by one thread at a time for in and out instructions on each device's MMIO region. The cLockers variable is effectively a lock counter ensuring that other threads don't access it at the same time. By abusing the bit clear, it's possible to create an artificial race condition here.
  • There is a problem with using this race condition, though: a secondary check on the ownership will crash when changing the owner. The flag RTCRITSECT_FLAGS_NOP determines whether locking operations are checked at all, which controls that check. The idea is to use the original primitive to flip the flag BEFORE the crash happens. After that, we can keep using the race condition for other things.
  • With the protections removed from the race condition detection, corruption is much easier to cause. The same primitive used for corrupting the flag via vbe_ioport_write_data can be used once again to corrupt the size of the cScratchRegion buffer. With the size corrupted, we get an easy out-of-bounds read and write.
  • Directly after VGAState in memory is PDMPCIDEV. Since this is part of the initial allocation, it's always in the same spot! It contains several function pointers, leading to easy code execution. Even with CFG turned on it doesn't matter, because we control the pointer and two of the parameters being passed in.
  • It's crazy how such a little bug turned into such a large impact. Awesome post on exploit development!
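To convince myself why 64MB is the magic boundary, here's a toy model of the dirty-bitmap bug. The structure and names are my own (the real code is C inside VirtualBox's VGA device); the only faithful part is the erroneous scaling of the start address:

```python
VRAM_MAX = 256 * 1024 * 1024            # bitmap sized for 256MB of VRAM
PAGE = 4096
bitmap = bytearray(VRAM_MAX // PAGE // 8)   # one dirty bit per page

def clear_dirty(start_addr: int, npages: int):
    first_page = (start_addr * 4) // PAGE   # BUG: start_addr wrongly scaled by 4
    for page in range(first_page, first_page + npages):
        bitmap[page // 8] &= ~(1 << (page % 8))

clear_dirty(0x100000, 1)        # addresses under 64MB stay inside the bitmap

try:                             # anything above 64MB scales past 256MB
    clear_dirty(65 * 1024 * 1024, 1)
    oob = False
except IndexError:
    oob = True
assert oob   # in the real C code this is a silent heap out-of-bounds bit clear
```

Because 64MB * 4 = 256MB, any start address above 64MB indexes past the bitmap, and C has no IndexError to save you - bits just get cleared in whatever heap data follows.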

Hotwire CSP bypass on Github.com - 1411

joaxcar    Reference → Posted 1 Year Ago
  • Using the drag-and-drop functionality with invalid data, innerHTML was being set on Github.com. Johan Carlsson needed a CSP bypass to make this XSS exploitable. Tricks like autofilling credentials into a form didn't work because form-action was set, and CSS exfiltration failed due to a strict allowlist.
  • Github has three different XSS protections on the UI: CSP, form-specific CSRF nonces and the session sudo mode.
  • Hotwire is an HTML-over-the-wire framework that fetches HTML from the server by observing what the page needs. They saw it was being used via the HTML element turbo-frame. By injecting a form wrapped in a turbo-frame, Rails will notice the inserted element and fetch its contents from the backend dynamically for us. Since the content is fetched in a legit fashion, it also includes the CSRF token.
  • However, this only loads the form - we still need a way to put our data into it. Using turbo-streams, an attacker can modify the form inputs and, with a click anywhere on the page, submit the form. The impact of bypassing CSRF protections is that an attacker can trigger any form-based request, such as adding SSH keys.
  • While messing around with this, they found a mechanism to remove the original clicks: a piece of JavaScript that automates the clicking of an element! Via the function focusOrLoadElement, it's possible to force the page to click the various buttons for us.
  • The world of CSP bypasses is much deeper than I realized! Using the Hotwire framework to turn this into something more useful was clever, and the reuse of niche JavaScript functions was also interesting. Overall, a great post, with an even better discussion on the bug bounty podcast.

onhashchange can be triggered cross-origin - 1410

Critical Thinking Podcast    Reference → Posted 1 Year Ago
  • The web browser attempts to isolate all pages by default but allows some cross-origin communication. An interesting, yet new-to-me, channel is the URL hash. This has been documented for a long time but was not something that I knew about.
  • The hash of a given page can be changed by a page on a completely different domain. The twitter post uses window.open on the target window in order to do this. The post I linked above from WellCaffeinated does it by simply setting the frame source.
  • Why is this useful? Some pages route based upon the hash or use it in some other way. Being able to trigger this cross-origin can have crazy effects. This is a short note, but something that I wanted to remember for later.

Code Interoperability: The Hazards of Technological Variety - 1409

Stefan Schiller    Reference → Posted 1 Year Ago
  • Apache Guacamole is a remote desktop gateway. The architecture consists of a Java component in front of a C backend server, and the post walks through a classic parser differential between the two that creates serious security impact.
  • All communication is done via the custom Guacamole protocol, a generic wrapper that abstracts SSH, VNC and RDP. Each instruction contains an opcode with a length and value, followed by arguments. When initially connecting to a server, the select instruction is used. Most values are taken from a database, but the image type is directly controlled by a connecting attacker.
  • The documentation states that the LENGTH field is not the byte length but the codepoint length of the UTF8 value. Since UTF8 implementations differ and we have two locations parsing the characters (Java and C), there is likely to be a bug here. The article has a good descriptor for this situation - Technological Variety.
  • To test this out, they wrote a small fuzzing harness. The fuzzer would generate random unicode symbols and have both Java and C process them; if there is a difference, then we have a problem. After some fuzzing, they ran into a difference in the length() of the object in Java compared to C: a 4-byte UTF8 character sequence was counted with length 2 in Java. Why?
  • Since Java 9, strings are compact: they are dynamically encoded as either LATIN-1 or UTF-16 depending on their content. For instance, an 'A' is stored as LATIN-1 internally, but a Greek beta would be stored as UTF-16. The wrinkle is that incoming UTF8 data must be converted to UTF-16.
  • The string length is determined by shifting the backing byte array's length by the coder value. If it's LATIN-1, the length is just the byte count; if it's UTF-16, the byte count is divided by 2. For 1, 2 and 3 byte sequences this logic works fine. However, there is a subtle issue when dealing with 4 byte UTF8 sequences.
  • In particular, the conversion turns such a sequence into a surrogate pair instead of a single codepoint! The Java length() function returns the number of UTF-16 code units instead of code points, so the pair counts as two - the Java side and the C side now disagree on how many characters the same bytes contain. Weird!
  • To exploit this, we have to think about the order of parsing: instruction creation is done by the Java side, then instruction parsing is done by the C side. The blog post has some amazing graphics for understanding this, so please refer to those. The idea is to send two GUAC_IMAGE parameters: one with four 4-byte unicode characters and the other with the payload we want to smuggle in.
  • The parameter with the four 4-byte unicode characters will be given a length of 8 by the Java service. However, the C service sees each of these as a single codepoint, so it reads more than the expected 4 characters and consumes the following bytes too. That's where we smuggle in our input: by putting a semi-colon and then extra data, the remainder is interpreted as a new instruction!
  • What do we want to smuggle in, though? With the connect instruction, we can control the host the victim connects to, which can be used to harvest data such as credentials. Or, RDP drive redirection can be enabled to leak world-readable files on the server.
  • Integrating between different languages appears to be absolute hell for encoding. The post is amazing at talking about the differences between parsers and is super enjoyable for that reason. I personally don't like the text-based wire format for Guacamole, as it is prone to these types of issues. Great read!
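The code-unit vs code-point mismatch is easy to reproduce. Python's len() counts code points (the C side's view), and re-encoding to UTF-16 lets us model Java's String.length() (code units) - the helper names below are mine:

```python
def java_length(s: str) -> int:
    # UTF-16 code units, as Java's String.length() counts them.
    return len(s.encode('utf-16-le')) // 2

def c_length(s: str) -> int:
    # Unicode code points, as a correct UTF-8 parser on the C side counts them.
    return len(s)

astral = '\U00010348'                       # one 4-byte UTF-8 sequence
assert len(astral.encode('utf-8')) == 4
assert java_length(astral) == 2             # a surrogate pair: two code units
assert c_length(astral) == 1                # but only one code point

# Four such characters: Java announces a length of 8, while the C side
# exhausts only 4 code points and keeps reading past the boundary -
# that's the room used to smuggle in an extra instruction.
payload = astral * 4
assert java_length(payload) == 8
assert c_length(payload) == 4
```

Any character above U+FFFF (every 4-byte UTF-8 sequence) triggers the discrepancy, which is why the fuzzer found it so quickly.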

Digging for SSRF in NextJS apps - 1408

AssetNote    Reference → Posted 1 Year Ago
  • NextJS is an extremely popular 'static' site generator, which this website actually uses. So, finding configuration issues or straight-up vulnerabilities in NextJS is awesome for bug hunting, since many sites would be affected.
  • The _next/image component is a built-in image optimization feature that is enabled by default. It works by making a request to the _next/image endpoint under the hood, which implements caching; the server then fetches the provided URL, e.g. //localhost/duck.jpg. There is a remotePatterns configuration that restricts the allowed protocols and hostnames, and it is commonly set to '*'.
  • When configured as such, this can lead to SSRF with https://example.com/_next/image?url=https://localhost:2345/api/v1/x&w=256&q=75. Most of the time this is a blind GET request, but there were situations where the impact could be escalated. With an old version of NextJS, or if dangerouslyAllowSVG is set, the SSRF leads to XSS via the image being reflected on the domain. And if the response doesn't have a Content-Type, the full response is leaked.
  • Even though NextJS is largely a client-side framework, there are many crazy server-side features like Server Actions, which let you write JS code that executes on the server instead of the client. When handling redirects within server-side code, NextJS uses the Host header to build the request. So, by supplying a crafted Host header, it's possible to force a request to localhost, leading to SSRF.
  • To exploit this, an action must be defined that redirects to some URL. To turn the blind request into a full SSRF read, we can abuse the HEAD request to do some janky things:
    1. Set up a server that takes requests to any path.
    2. On a HEAD request return a 200 with a specific content-type to satisfy the constraints of the system.
    3. On the actual GET request, return a 302 to our victim IP.
    4. Data is returned on the request.
  • Overall, an interesting post on a bug within NextJS plus a universal exploitation method. This is the second issue found in this functionality - the first SSRF in the image optimizer was found by Sam Curry.
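The four steps above boil down to serving different answers for HEAD and GET. A sketch of the attacker-server logic as pure functions (the internal target URL reuses the post's example; the content type and function shape are my own illustration):

```python
INTERNAL_TARGET = 'http://localhost:2345/api/v1/x'   # victim from the example

def handle(method: str):
    """Return (status, headers) the attacker's server would send."""
    if method == 'HEAD':
        # Steps 1-2: pass the precheck with a 200 and an acceptable
        # image content type, so the fetch is allowed to proceed.
        return 200, {'Content-Type': 'image/jpeg'}
    if method == 'GET':
        # Step 3: bounce the real fetch to the internal target; step 4,
        # its body flows back through the server to the attacker.
        return 302, {'Location': INTERNAL_TARGET}
    return 405, {}

assert handle('HEAD') == (200, {'Content-Type': 'image/jpeg'})
assert handle('GET') == (302, {'Location': INTERNAL_TARGET})
```

The trick generalizes: any fetcher that validates with HEAD but retrieves with GET can be fed inconsistent answers.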

Post-Mortem Report: Pike USDC Withdrawal Vulnerability - 1407

Neptune Mutual    Reference → Posted 1 Year Ago
  • Pike Finance integrated with Circle's cross-chain USDC protocol, CCTP. This works by off-chain signers sending an attestation that an event occurred (once finality has been reached on chain A) to the contract on chain B. There were two vulnerabilities in this case.
  • The first issue was a lack of input validation on the CCTP message for the intended receiver and the amount. As with many cross-chain protocols, some information is set by users and is application-specific, meaning it must be validated by the integrator.
  • I couldn't find any more details on what went wrong beyond the information above. Nobody pointed to the contract and said what was actually wrong with it, and I didn't see the source code. This appears to be the exploit transaction on Optimism: I see USDC being moved around, but I can't find source for the Beta protocol, so it's hard to tell.
  • Unluckily enough, there was a second issue. While deploying a patch for the first issue, the storage layout got messed up. As a result, the initialized value was overwritten with a zero, so an attacker was able to call the initializer themselves and become the admin of the protocol. With that, they could call admin functions to drain all of the funds.
  • Cross-chain bridging protocols are hard to integrate with securely! The second bug was a real bad mishap, and an interesting reminder to run fork tests for deployment upgrades.
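A toy model of how a storage-layout slip can zero out an initialized flag. Proxy storage is positional, so if the new implementation inserts a variable before it, the flag shifts onto a slot that was never written - and unset slots read as zero. The variable names and layouts here are made up, since the post doesn't show the real contract:

```python
# Slot order of each implementation (hypothetical layouts).
v1_layout = ['admin', 'initialized']
v2_layout = ['admin', 'paused', 'initialized']   # new variable inserted mid-layout

# Storage written while V1 was live: slot 0 = admin, slot 1 = initialized.
storage = {0: 0xA11CE, 1: 1}

def read(layout, name):
    # Unset storage slots read as zero, as in the EVM.
    return storage.get(layout.index(name), 0)

assert read(v1_layout, 'initialized') == 1
# After the (botched) upgrade, 'initialized' points at untouched slot 2:
assert read(v2_layout, 'initialized') == 0
# The V2 contract thinks it was never initialized - anyone can call
# initialize() again and install themselves as admin.
```

A fork test that reads initialized (and admin) through the upgraded proxy before and after the deployment would have caught this immediately.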

UTF-8 Explained - 1406

Wikipedia    Reference → Posted 1 Year Ago
  • UTF8 is the standard variable-length encoding format, with over 1M possible characters. There are other UTF standards like UTF1, UTF16 and UTF32, but UTF8 is the most widely used. A code point is a numeric identifier for a character, written in hex - such as U+0080. The actual binary representation is based upon this value.
  • The first byte of a UTF8 sequence determines whether it is 1-4 bytes long. ASCII covers code points 0-0x7F, meaning any byte with the high bit set is not valid ASCII. For everything else, the number of leading ones (followed by a zero) in the first byte encodes the length: 110 means 2 bytes and 11110 means 4 bytes. The remaining bits of the first byte hold the top bits of the code point, e.g. 5 available bits for a 2-byte sequence.
  • The number of following bytes depends on that length. Each continuation byte always begins with 10, and its remaining 6 bits carry the rest of the code point.
  • As an example, U+00A3 is 11000010 10100011 in binary. The leading 110 shows it's a 2-byte sequence, and it's followed by a valid continuation byte carrying the rest of the data.
  • When decoding UTF, many byte sequences are not valid: missing or unexpected continuation bytes, undefined characters and more. Additionally, how should this be handled? Should the invalid character be removed, left alone, or something else? What about converting between character sets? There are so many terrible issues that can come up if we're not careful. Finally, what does it mean to uppercase a unicode character? Some languages operate at a codepoint level while others operate at a character level, which can cause major problems.
  • From a security perspective, there are many things to consider. First, there are visual tricks that can be played with characters like the right-to-left override. Second, if there are multiple decoders at play, differences in interpretation can be bad as well. The most important thing here is error handling - should we remove the entire codepoint, just the invalid part, or error out? Different implementations do different things. Golang recently listed out some weird issues with their JSON parser, for instance.
  • Similar to case insensitivity, there is also case folding. This is more generic than lowercasing and spans the entire unicode codepoint system. There is a list of case foldings online as well.
  • Overall, a good exercise into learning about encoding issues!
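The U+00A3 example can be verified by hand. A minimal 2-byte encoder (my own helper, only for the 110xxxxx 10xxxxxx case) shows where the code point's bits land:

```python
def utf8_2byte(cp: int) -> bytes:
    """Encode a code point in the 2-byte UTF-8 range (U+0080..U+07FF)."""
    assert 0x80 <= cp <= 0x7FF
    return bytes([0b11000000 | (cp >> 6),          # leader: 110 + top 5 bits
                  0b10000000 | (cp & 0b111111)])   # continuation: 10 + low 6 bits

# U+00A3 (pound sign) encodes to 0xC2 0xA3, matching Python's encoder...
assert utf8_2byte(0xA3) == '\u00a3'.encode('utf-8') == b'\xc2\xa3'
# ...and matching the bit pattern from the summary above.
assert f'{0xC2:08b} {0xA3:08b}' == '11000010 10100011'
```

Working the bits by hand like this makes the later security issues (overlong encodings, bad continuation bytes) much easier to reason about.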