Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Subverting Apple Graphics: Practical Approaches to Remotely Gaining Root- 1575

Marco Grassi    Reference →Posted 1 Year Ago
  • Sometimes, the vulnerability is not the issue itself - it's finding a fruitful attack surface. In this paper, the authors discuss the overlooked OS X attack surface of the userland and kernel graphics components. These were likely overlooked because they are not listed in the sandbox profile. The WindowServer also has very high privileges on the OS.
  • CVE-2014-1314, not found by the authors, was a funny design flaw. When creating a CoreGraphics session, a request is sent to start a process under the user's context. However, users can specify an arbitrary script to run upon login. So, this creates a process that runs outside of the sandbox. Neat!
  • The first interesting bug they discuss is a double free, triggered by setting an invalid ForceConfig for touches. The time window between the two frees is very small. Luckily, triggering the bug without winning the race doesn't crash, allowing for brute-force attempts. The CFPropertyListCreateWithData API takes in Unicode strings, which made it a good tool for overwriting the freed memory with controlled data. An additional win is that the randomization of large blocks of memory is very weak.
  • The next target is the IOAccelSurface interface used by Apple's graphics driver. It represents a rectangular area that will be rendered by the GPU. It appears to have been designed for WindowServer to use, but normal processes can also call it. Hence, it's likely a fairly soft target. The article discusses the internal workings of this driver, which required a lot of reverse engineering.
  • When doing some rectangle scaling, there is a lack of input validation for sane values. The incoming surface's height is expected to be less than 0x4000. By providing a value larger than this, the condition of y = 16705, x = 321, height = -1, len = -1 is possible to hit. Before bailing out, a single out-of-bounds write will occur.
  • Using this linear out-of-bounds write ahead of the current chunk, we need to corrupt something useful. If an IG Vector can be placed ahead of this chunk, we can create a fake IG Vector with pointers to sensitive locations. We also control the values being written to those locations, giving us an arbitrary write-what-where primitive. Since floats have a limited range, this does limit the values we can write.
  • In effect, this gives the attacker a relative pointer corruption. Although we can't overwrite a whole pointer, we can overwrite parts of one. Additionally, these values are good for corrupting a structure's size. In this case, the authors used the OOB write to corrupt an IOUserClient object's pointer. The goal was to make it point to attacker-controlled data on the heap. Getting this object into the proper spot via feng shui is discussed as well, and it's worth a read!
  • The OOB write is also used to get an infoleak by reading vtable pointers. The trick was to adjust the offset being read so the pointer could be leaked byte by byte. Within the same object is a vtable pointer that can be overwritten to achieve RIP control by pointing it at a spot on the heap that we control.
  • Even though the paper is 8 years old, there is still so much to learn. Good paper!

Millions of Accounts Vulnerable due to Google’s OAuth Flaw- 1574

Dylan Ayrey    Reference →Posted 1 Year Ago
  • OAuth is a common way that websites do authentication. Google OAuth is an identity provider that websites can use so they don't have to handle usernames, passwords, and the like.
  • When you click the Sign in with Google button, Google sends the application a set of cryptographically signed information about the user in a JWT. Among these fields are hd (hosted domain) and email. The application verifies both of these claims before logging the user in.
  • Here's an interesting question: what happens when the domain becomes owned by somebody different? Apparently, there is no verification of this. The sub field could be used for it, but it's inherently unreliable on Google OAuth for some reason.
  • By purchasing the domain of an old startup, it's possible to log in to these other accounts. You can't see old emails in Google Workspace, but you can access accounts from previous employees.
  • Initially, Google decided to not fix the issue. They categorized this as an abuse issue. After the talk got accepted to ShmooCon, they decided to pay $1,337 for the bug. They rated this as low likelihood and high impact, which I think is a fair assessment. They claimed they are working on a fix but nothing has been shared yet.
  • The author brings up a good point about password resets. If an attacker controls the email, then they could reset the password on an account as well. Unless the only 2FA is on the email, this will not work, because of things like Google Authenticator and SMS 2FA.
  • To me, they hype up the numbers a little much. They claim millions of accounts but are making assumptions that A) lots of startups use Google Workspace, B) each had 10 employees, and C) each had 10 accounts. Although it's technically correct, it feels overhyped. Alongside this, who cares about the data of old startups? Not many people. Regardless, it was a cool bug and I appreciated the write-up.

Solana Reading List- 1573

mertimus    Reference →Posted 1 Year Ago
  • A list of various Solana articles. Topics range from Solana internals to MEV, validator setup, and more. Just a good resource to have if you want to read about Solana.

TMI — Too Much Information. The less you reveal the better!- 1572

aleksamajkic    Reference →Posted 1 Year Ago
  • Resource enumeration is the process of confirming the existence of a resource, especially usernames, in an application. By itself, it's not a big deal, but it is often required to further exploit systems. As a result, many people do not care about the vulnerability. This article touches on this bug class.
  • Sometimes, the data is obvious. For instance, @dooflin5 is my handle on Twitter, and it can be seen easily. In other cases, it's more subtle. A different error message being returned on login depending on whether the email exists in the system can be enough to disclose this. Besides information disclosure, trying a lot of logins can also be used as a DoS vector.
  • The author found this vulnerability on some websites. The company said it's a known design feature. So, what's wrong with that? It's a user experience thing. If the user can't remember their username, it becomes hella annoying to use your website. Generally speaking, the less information you give to the attacker, the more secure the system is going to be, but the harder it is for the end user to work with.
  • In practice, things like password resets and logins should have good rate limiting and captchas anyway. This prevents automation of the exploit, but the core issue can still be used to guess small numbers of usernames by hand. Good read on assessing the design tradeoffs here.

Solana MEV Report- 1571

Helius    Reference →Posted 1 Year Ago
  • Maximal Extractable Value (MEV) is the value that can be extracted by manipulating the transaction sequencing. By adding, removing and/or reordering transactions within a block, it's possible to profit from it. This article is a perspective on Solana MEV over the last 4 years.
  • The first big change in the Solana MEV space was the introduction of the JITO-Solana client. By allowing searchers to choose what transactions go into a block in a democratic way via a bidding process, both validators and searchers are able to profit. The JITO-Solana client makes up 92% of clients now. In March of 2024, Jito decided to shut down its public mempool because it was hurting average users.
  • Even though Jito's mempool is now gone, there are other similar products. Much of Solana's sandwiching activity originates from a private mempool called DeezNode. In May of 2024, the scheduler algorithm was made more stable, which prevented leader spamming attacks too.
  • What type of MEV is profitable? The classic one is arbitrage: you buy something at $1 and sell it for $3; you make $2. By moving the price of things with sandwiching, you can arbitrage users to make large profits. Another form of MEV is performing liquidations on lending protocols.
  • An insight they show is the amount of reverted transactions on Solana. Since only a single user can hit a given MEV opportunity, all others will fail. This bot spam accounted for 75% of all non-vote transactions in the network, leading to major issues in throughput. The new scheduler has majorly improved this problem, though.
  • Jito attempts to keep track of MEV in the Solana network through their mempool and elsewhere. According to the report, 50% of sandwich attacks come from a program owned by DeezNode. On average, they make 51K transactions in a single day, netting around 2,200 SOL per day. Jito shutting down their mempool was nice. However, unless there's a built-in way to stop this, some actor will keep doing it.
  • How do we stop MEV then? A scary way is validator whitelists. Jupiter, a retail trading platform, introduced dynamic slippage to optimize settings in real time. Another approach common among AMMs is to route transactions only through the Jito block engine, which doesn't allow sandwiching.
  • RFQ (request for quotation) is an off-chain pricing mechanism where market makers bid for prices, with only the final quote being used on chain. JupiterZ and Kamino Swap do things a bit differently but follow this underlying principle. Along the same lines are sandwich-resistant AMMs: no swap is executed at a price more favorable than the pool's price at the start of a slot window. Because prices hold for these long stretches, sandwiching is impossible.
  • Similar to SR AMMs is Conditional Liquidity, which depends on a new role called "Segmenters" to evaluate the toxicity of incoming order flow. By adjusting the spreads based on this, MEV stops being profitable. The spread is paid back as compensation for the Segmenter's role. Paladin-Solana aims to be a trusted priority port that can be used but punishes folks who MEV on it.
  • Overall, a good article that explains the wild world of MEV in the Solana ecosystem. Maybe one day we'll get rid of it.

uncontained: Uncovering Container Confusion in the Linux Kernel- 1570

Jakob Koschel - VU Research    Reference →Posted 1 Year Ago
  • Type confusion is a bug class that exists in both memory-safe and memory-unsafe languages. In C, type confusions typically lead to bad memory corruption bugs.
  • The main part of this paper that I enjoyed was the discussion of C and type hierarchies. C doesn't technically have type hierarchies, but it's possible to create a similar effect using structure embedding. For instance, you can embed a parent type as a field inside a child type. To go between the parent and child views of the object, you just do some pointer math.
  • The Linux kernel uses the container_of macro to do this a lot. According to the authors, this technically violates the C language standard and is always an unsafe cast. The goal was to find cases where the cast into the container (child) type is incorrect, leading to a type confusion bug they call container confusion.
  • In LLVM, they created a custom compiler pass to spot uses of container_of in the source code to create a type system. This tracks all casts up and down. From there, they built a custom sanitizer called uncontained in order to detect casts up then back down to the wrong type.
  • An interesting design decision was checking at the time of use vs. the time of the incorrect downcast. They found several scenarios where an incorrect downcast was still safe because only the parent field was ever accessed through it.
  • In the small section of the Linux kernel they looked at first, they found 37 cases of container confusion. Of these, 16 were false positives, 11 were unique bugs, and 10 were anti-patterns where the container confusion was checked later. Besides simply downcasting to a static container, they found a few other types of bugs:
    • Empty List Confusion: When a list is used while empty, both the next and prev fields point to the list object itself.
    • Mismatch on Data Structure Operators: Different locations in the code may treat a pointer as a different type depending on their needs. Of course, the offsets must be correct in this case.
    • Past-the-end Iterator: Break-like logic is often used when searching a data structure for an element until the end. It's possible to use the iterator afterward without checking its validity.
    • Containers with Contracts: An object may come with additional metadata that program semantics use to control what operations can be done on it, such as in the sysfs kernel subsystem. If these invariants are not kept, it leads to misuse of the pointer.
  • The sanitizer is not meant to have a 100% true-positive rate. Instead, it's meant to point out potential locations and types of the bug. To me, this is completely reasonable as long as the false positive rate isn't too high. Applying all 5 patterns, they found a total of 80 bugs, 179 anti-patterns, and 107 false positives. Most of the false positives came from the first pattern, where the code had explicit type-tag checks. Overall, a real bug 30% of the time is pretty amazing!
  • To me, this is absolutely amazing work. Taking a known bug class in the Linux kernel (and some other code) and writing a fairly accurate static analysis tool is awesome. Finding 80+ bugs in the Linux kernel at once is unheard of in modern days.

Dependabot Confusion: Gaining Access to Private GitHub Repositories using Dependabot- 1569

Giraffe Security    Reference →Posted 1 Year Ago
  • Dependabot is a GitHub bot that automatically updates out-of-date dependencies by making PRs. It's a super useful feature for maintaining up-to-date dependencies.
  • Most NPM dependencies are specified with just a version. However, it's also possible to use a github.com link with a branch name. This is commonly done for private packages, as opposed to setting up an internal NPM registry.
  • Dependabot will attempt to update all dependencies that it knows about in the public repo to the most recent version. To the author's surprise, there was no special handling for GitHub repo dependencies!
  • So, if you registered the name of a package publicly, then Dependabot would attempt to replace the private version of it. To pull this off, you would need to guess the name of a private internal repo. Sometimes, this information gets leaked though.
  • To fix this, Dependabot removed the git-dependency-to-public-NPM-registry mapping it was doing. Clearly, there was an issue with it. According to the author, Bundler and NPM were both vulnerable to this. Obviously, this leads to RCE if the malicious dependency is added.
  • Overall, a good bug in some important code! Although many Dependabot PRs must be approved, it's easy to overlook something like this. Additionally, some repos use the auto-merge workflow for Dependabot, making this easier to exploit.

From Arbitrary File Write to RCE in Restricted Rails apps- 1568

Conviso    Reference →Posted 1 Year Ago
  • The premise of the post is having an arbitrary file write in Ruby on Rails. The twist was that the Dockerfile had the application run as a non-root user with only some directories being owned by the executing user. The goal was to get an RCE in this restricted environment.
  • Given the situation, the natural thing to do is to recreate the environment yourself to see what's possible. One of the writable directories was /tmp. Ruby has a library called Bootsnap that speeds up loading Ruby/Rails apps via caching. Much of Bootsnap's configuration and cache is stored within /tmp/cache/bootsnap.
  • Upon reviewing the contents, they noticed that load-path-cache contained gem file paths in the MessagePack format. Additionally, compile-cache-* contained compiled Ruby, JSON, and YAML. From there, they decided to review the source code of Bootsnap to get an idea of what made sense to corrupt.
  • The Bootsnap startup went as follows:
    1. Bootsnap is loaded from config/boot.rb.
    2. Load path caching. For every require, Bootsnap checks the cache first.
    3. Compile caching. Bootsnap caches the compiled Ruby code from the previous steps and stores it in a directory keyed by a hash of the file.
  • The objective of this attack was to overwrite one of the cached compiled Ruby binaries and then trigger an application restart. The bulk of a cache file is metadata about the version and where it should be loaded. Naturally, this can be spoofed and set to arbitrary values using the original arbitrary file write. So, RCE is achieved!
  • How do we restart the server, though, to load our corrupted cache file? The Puma server will automatically restart if anything is written to /tmp/restart.txt. The arbitrary file write can be used a second time, on this file, to trigger the RCE.
  • I really enjoyed this blog post! Taking a library and explaining how to abuse its quirks was an awesome use of time. I bet many people will use this in the future for their endeavors.

A Realistic Breakdown of Optimism - Part 1- 1567

Trust Security    Reference →Posted 1 Year Ago
  • TrustSec has contributed to the Optimism ecosystem through contests, audits, and two paid bug bounties. In this post, they talk about the security of Optimism and some of the more interesting bugs they discovered. There is a lot of background on the one bug they go through in the blog post.
  • L2s need a way to communicate with the L1 and vice versa. In Optimism, going from the L2 to the L1 requires a Merkle proof against the trusted L2 state root. When going from Ethereum to Optimism, the OptimismPortal is used, which emits an event that is translated into an ETH minting call. Both of these give the capability to send arbitrary data between the chains.
  • There is a limitation to the above, though: calls can become stuck. So, the CrossDomainMessenger (XDM) is where the other bridging functionality, like the ERC20/ETH bridge and the ERC721Bridge, is implemented. XDM supports resending failed messages via mappings of successful and failed messages.
  • On the revert flow, there is a very important check... If the XDM call failed, then the ETH has already been supplied. Additionally, the successfulMessages mapping prevents the same withdrawal from being executed more than once.
  • When making a call from the L2, the default L2 sender is 0xDEAD. The variable is xDomainMsgSender. On a cross-chain message, this is set to the calling user. In effect, this acts as a reentrancy protection as well. At the end of the call, the storage value is set back to the default.
  • The audit they did was for the SuperchainConfig contract upgrade. This was a single contract that allowed designated roles to pause or unpause a core contract. During this upgrade, they manually switched off the initialized bit of the contract to allow recalling initialize() instead of using the reinitializer modifier. Such a small line of code seems so simple. So, what's wrong?
  • The xDomainMsgSender is 0xDEAD at all times except during the withdrawal process. Within the initialization code (which gets retriggered), this value is set to 0xDEAD, even though storage actually defaults to zero. Normally, this would be fine (since it should be a NOP), but that's NOT true in the context of the withdrawal code!
  • Upgrades are permissionless once enough signers have agreed to the upgrade. Here's the flow of the attack:
    1. Store a failed delivery in the smart contract's failedMessages mapping.
    2. Wait for the upgrade to be in the mempool and ready to go.
    3. Reattempt the failed delivery:
      1. Perform the upgrade with the parameters. Now, the msgSender is set to 0xDEAD.
      2. Reenter into relayMessage, passing in the withdrawal request. This will succeed because the default 0xDEAD address was set back.
      3. The msgSender is set to L2Sender again when it shouldn't be.
    4. A double withdrawal has occurred because of the double setting of the xDomainMsgSender global variable. The stolen amount comes from any failed withdrawals.
  • The setting of the xDomainMsgSender global variable doesn't seem important until you have better context of what it does and why it matters. It's crazy how this reentrancy/replay protection was broken by such simple upgrade code. What an awesome find!

"Invariant inversion" in memory-unsafe languages- 1566

sha1lan - pacibsp    Reference →Posted 1 Year Ago
  • The author begins the post with an invisible C bug. After staring at the code for a while, I couldn't find it. The bug is simply that a boolean could have a value other than 0 or 1. Why does this matter though?
  • In memory-unsafe languages like C, the invariants used to uphold memory safety are programmer-created. By breaking these assumptions, safe-looking code can be broken via subtle memory unsafety issues. This is the main concern of the post.
  • Why does having a boolean that is not a 0 or a 1 matter? Because the variable is a boolean, the compiler assumes that the byte will only ever be a 0 or a 1, and it makes some optimizations around this. When the byte holds another value, the logic of those optimizations breaks, leading to memory corruption in the program.
  • In a typical C codebase, you would look for memory-unsafe accesses in things like array[index] that actually perform the access. The author compares this bug to reviewing JIT compilers: they try to enforce invariants early in the program, and the rest of the code assumes those invariants hold. If an invariant is ever violated, you have a memory corruption bug.
  • According to the author, the similarity is that the memory safety violation does not come from the exact line of code like with a bad access. Instead, it's the violation of an invariant that another part of the code relies on further down the line.
  • This is the invariant inversion. Languages can create chains of invariants leaning on other invariants leaning on other invariants... until it's a crazy mess, a web of invariants. Because of this, breaking a single one of these upper-level chained invariants can have much larger consequences than you realize. Unfortunately, managing this web of invariants in your head is impossible because it quickly becomes a huge graph.
  • In the case of bool-typed variables only holding a 0 or a 1, they consider this an inverted invariant because it's "higher level" than memory safety, yet it is relied upon for "lower level" safety properties later.
  • Why does this all matter? It's a new way of finding bugs! Currently, we ask ourselves "where is the memory unsafety occurring?", which is only relevant in languages like C. Instead, we should ask "where was the first violation of an invariant that something else relies on?" This different view of the world seems more reliable, since it finds the safety bug first rather than backtracking from where the bad access could occur. Great post!