Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

A Touch of Pwn - Part I - 1295

Blackwing Intelligence    Reference → Posted 2 Years Ago
  • Many laptops come with fingerprint sensors that are used with the Windows Hello platform. These sensors use the Secure Device Connection Protocol (SDCP), which ensures that the fingerprint device is genuine and that the communication between the host and the device is valid. The modules are loaded in via secure boot on the Windows side.
  • To create a secure connection, the host and device perform a key agreement to derive shared session keys. An attestation is sent over, confirming that this is indeed the proper device. Each fingerprint is set up with a unique ID that is associated with the scanner.
  • To identify users, the host generates a nonce and sends it to the sensor. The sensor does the biometric matching at this point. If there's a valid match, the unique ID of the user is sent back along with a MAC computed using the shared secret.
  • Eventually, they decided to pick some targets. As most researchers do, they prioritized targets with broad adoption, ease of reverse engineering and poor code quality. Their first target was the Dell Inspiron. To intercept the USB traffic, they used a Linux driver and added some additional functionality.
  • The Dell Inspiron on Windows did everything correctly. However, the Linux side did not implement SDCP for whatever reason. So, a user could generate arbitrary unique IDs and ask for them to be stored, unlike the regular flow where the host chooses them. Their plan was to enroll the attacker's unique ID on Linux to be the same as the one on the Windows box. In practice, they learned that the stores for these are different.
  • But how does the sensor know which database to use? By executing a successful MitM and modifying the database type, an attacker can get the module to use their fingerprint! Since the IDs were the same, Windows thought it was valid and proceeded to unlock the computer. Originally, they just tested this in WinDbg. Eventually, they wrote a USB tool in Linux to intercept the traffic.
  • On the Lenovo ThinkPad, they rolled their own TLS stack instead of using SDCP. Weird! The client certificate and key are encrypted when going across the wire using the device name and serial number (lolz). After this, a TLS session is made. This can be MitM'ed, since we know the private key of the certificate in use. Eventually, they reimplemented enough of the TLS stack to pwn it.
  • The Microsoft Surface Pro was a joke. It implemented no SDCP, no authentication and completely cleartext USB communication. Any USB device can claim to be the proper sensor and it will be accepted. Authenticating the sensor is a must here.
  • What is better: passwords or biometrics? Biometrics open up an entirely new attack surface that we've never seen before. Although convenient, it is also terrifying. Overall, an awesome post on SDCP, hardware hacking, USB tool writing and so much more. One of my favorite articles of the year!
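
The identify flow described above can be sketched in a few lines. This is a hedged illustration, not the real SDCP message format: the key derivation, message framing and function names here are all invented for clarity.

```python
import hashlib
import hmac
import os

# In real SDCP the session key comes from a key agreement between host and
# sensor; here we just generate one to stand in for that shared secret.
session_key = os.urandom(32)

def sensor_identify(nonce: bytes, matched_id: bytes) -> tuple:
    """Sensor side: after biometric matching, return the matched template's
    unique ID plus a MAC binding it to the host's fresh nonce."""
    mac = hmac.new(session_key, nonce + matched_id, hashlib.sha256).digest()
    return matched_id, mac

def host_verify(nonce: bytes, matched_id: bytes, mac: bytes) -> bool:
    """Host side: recompute the MAC; only a sensor holding the session key
    could have produced a valid one for this nonce."""
    expected = hmac.new(session_key, nonce + matched_id, hashlib.sha256).digest()
    return hmac.compare_digest(expected, mac)

nonce = os.urandom(16)                       # fresh nonce prevents replay
uid, mac = sensor_identify(nonce, b"user-template-01")
print(host_verify(nonce, uid, mac))          # True
```

The nonce is what makes replaying an old "match" response useless, and the MAC is what a spoofed USB device (like the Surface Pro attack above) cannot forge without the session key.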

Halting the Cronos Gravity Bridge - 1294

Faith    Reference → Posted 2 Years Ago
  • This research was done in January of 2023 but was published recently. In September of 2023, Nathan Kirkland and I decided to do some auditing of the Gravity Bridge ourselves, so it's interesting seeing the crossover here! The Gravity Bridge is an Ethereum-to-Cosmos bridge for various assets.
  • The Gravity Bridge is composed of three parts: the Ethereum smart contracts, an orchestrator which acts as both a relayer and a signer, and the Cosmos blockchain. When going from Ethereum to Cosmos, the sendToCronos function is called, which triggers handling within the orchestrator to vote on the event occurring on Cosmos.
  • When going the other direction, the MsgSendToEthereum message is sent. Once these are batched up, the orchestrator will query for transactions and sign them with its key. Once enough signatures have been collected, the batch is relayed to Ethereum by calling the submitBatch() function.
  • When we say bridge we really mean lock the original token in the contract and create a representation of the original token on the other chain. As a result, there is a function called deployERC20() to create an Ethereum representation of a Cosmos asset. Within the Gravity Bridge, this will trigger an event to store the token information locally.
  • When processing events, the lastEventNonce must increase monotonically; otherwise the orchestrator will not process the event. So, can we break this invariant? By creating a token with too many characters in the name, an error will be returned. Now, no transactions will be processed by the orchestrator, leading to a denial of service.
  • The Gravity Bridge had many extra checks to ensure that the system wasn't in a bad state. If something weird happened, it would simply shut off as a defense mechanism. In particular, if the function k.Handle(xCtx, event) ever failed for a given event, then the bridge would disable itself. So, the author decided to find a way to trigger this!
  • For the handling of a token send from Ethereum to Cosmos, the SendToCosmosEvent function is called. Users can send arbitrary tokens with arbitrary values, so this can be interesting to play with. One of the validations is that the token supply is not larger than 256 bits. If it is, the program errors out!
  • Since we can create our own token, we can trigger this. Simply send more than 2 ** 256 of a given token and the bridge will lock itself up. Both of these were rated as medium severity issues, which I disagree with. I feel these are high, considering they turn everything off for a while.
  • Overall, it's a great post! Personally, I didn't consider looking for functionality to hit the disable bridge code. So, it's cool to see research being done on similar targets to see ideas, setups and things to improve in the future.
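
Both bugs hinge on the same pattern: a monotonic nonce gate plus a handler that halts everything on error. Here's a minimal sketch of that pattern; the class, field names and the specific limits are illustrative, not the actual Gravity Bridge code.

```python
# Hypothetical model of the invariant described above: events must arrive
# with strictly increasing nonces, and any handler failure disables the
# bridge as a defense mechanism (the behavior both bugs abuse).
class Bridge:
    def __init__(self):
        self.last_event_nonce = 0
        self.halted = False

    def handle(self, event):
        # Stand-ins for the two failure cases in the post: an oversized
        # token name, and a token supply exceeding 256 bits.
        if len(event.get("denom", "")) > 128:       # illustrative limit
            raise ValueError("token name too long")
        if event.get("amount", 0) >= 2 ** 256:
            raise ValueError("supply exceeds 256 bits")

    def process(self, event):
        if self.halted:
            return "bridge disabled"
        if event["nonce"] != self.last_event_nonce + 1:
            return "skipped: non-monotonic nonce"
        try:
            self.handle(event)
        except ValueError:
            self.halted = True                       # shut off on any error
            return "bridge disabled"
        self.last_event_nonce = event["nonce"]
        return "processed"

b = Bridge()
print(b.process({"nonce": 1, "denom": "atom", "amount": 10}))  # processed
print(b.process({"nonce": 2, "amount": 2 ** 256}))             # bridge disabled
```

Once one attacker-crafted event trips the handler, every later event hits the halted check, which is exactly the denial of service described above.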

Aptos Wormhole Vulnerability - 1293

Jeff    Reference → Posted 2 Years Ago
  • Wormhole is the largest cross-chain bridge. As such, it connects with many, many different blockchains and programming languages.
  • In Aptos, public(friend) functions are practically internal functions that cannot be called by the outside world. In particular, they can only be called by the same module or functions within the friend list.
  • The function publish_event is for a Wormhole smart contract emitting an event that triggers offchain code, such as a relayer, to process it. Unfortunately, when you add a modifier (code that runs before or after a function) to it, the public(friend) function becomes callable by anyone.
  • As a consequence, anybody can publish a token transfer event on the Aptos blockchain. Overall, a fairly simple bug in a weird contract ecosystem.
  • The amount at risk was held to $5M because of the Global Accountant mechanism. Additionally, there is a limit on the amount of funds that can be taken out in a given week depending on previous usage. I find these defense-in-depth protections amazing! We need more things like this to prevent hackers from stealing billions.
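
As a loose analogy (in Python, since Move is niche) for the modifier bug described above: a wrapper that runs code around a friend-only function can accidentally forward calls under its own trusted identity, erasing the caller check. Everything here is invented for illustration; it is not Move semantics or the Wormhole code.

```python
# The "public(friend)" gate: only modules on the friend list may call this.
FRIENDS = {"token_bridge"}

def publish_event(caller, payload):
    assert caller in FRIENDS, "friend-only"       # access control gate
    return ("event", payload)

# A "modifier" that wraps the function to run extra code around it, but
# forwards the call under the wrapper's own trusted identity.
def with_modifier(fn):
    def wrapped(caller, payload):
        # ... pre-hook logic would run here ...
        return fn("token_bridge", payload)        # oops: caller is replaced
    return wrapped

publish_event_public = with_modifier(publish_event)
print(publish_event_public("attacker", {"transfer": 1}))  # gate bypassed
```

The net effect matches the bug: the gated function is reachable by arbitrary callers, so anyone can emit a token transfer event.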

Saving $100M at risk in KyberSwap Elastic - 1292

100Proof    Reference → Posted 2 Years Ago
  • KyberSwap is a CLMM that was implemented from scratch. Concentrated Liquidity Market Makers (CLMM) are market makers where the liquidity is provided only within narrow bands. This allows for higher capital efficiency and less impermanent loss for LPs.
  • CLMM price ranges are divided into ticks. Although ticks are just integers, each tick translates to a price of 1.0001^t, where t is the tick. For a tick spacing of 10 in the price range $1.00-$1.22, you would deposit into the tick range (0, 2000) because 1.0001^0 = 1.00 and 1.0001^2000 ≈ 1.22. Each pool on KyberSwap consists of two tokens.
  • When using a CLMM, a swap by a trader causes the price to shift. When this happens, the price can move into a different tick range, causing a sub-swap to occur. The degree of price impact is determined by the amount of liquidity inside a given tick range.
  • There are three important invariants with CLMMs that should always be kept:
    1. Liquidity should never go below zero.
    2. When crossing tick range boundaries, the liquidity must either be increased or decreased.
    3. The liquidity distribution should look like a normal distribution.
  • The vulnerability literally lies on an edge case. When handling the edge between two ticks, the boundary logic had a major issue. When performing a one-for-zero swap (swapping token1 for token0), the nextTick() will be calculated as the currentTick, even though it crossed a boundary. Practically, this allows us to double add liquidity.
  • How do we make money from this though? By getting our liquidity added *twice*, we can effectively steal funds from the protocol. Under the hood, the actual bug was in the backwards portion of the code. The first sub-swap crosses the tick boundary, which is fine. On the second sub-swap, the price difference is so small that it does not change. Hence, sqrtP and nextSqrtP end up being the same, leading to a double add.
  • The author wrote a proof of concept using a flash loan from Aave to drain the entire protocol of all its funds. The hacker got a large payout and no one lost money. Except, later on, a variant of this bug was found that stole all of the funds. Overall, an amazingly complicated bug with a good write up from the perspective of the author.
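
The tick-to-price mapping above is simple enough to check by hand. A tiny sketch (the function name is mine, but the formula price = 1.0001^t is the standard CLMM convention described in the post):

```python
# Each tick t corresponds to a price of 1.0001 ** t, so tick 0 is $1.00
# and tick 2000 is roughly $1.22 -- matching the deposit range above.
def tick_to_price(tick: int) -> float:
    return 1.0001 ** tick

print(tick_to_price(0))               # 1.0
print(round(tick_to_price(2000), 2))  # 1.22
```

The base 1.0001 means each tick moves the price by one basis point, which is why tick ranges can express such narrow liquidity bands.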

Stealing tokens from O3 bridge users - 1291

Trust Security     Reference → Posted 2 Years Ago
  • O3 is a multi-service DeFi project with bridging solutions for 10+ chains. It functions as a fairly classic bridge: send tokens to the bridge contract on chain A, then mint a representation of them on chain B.
  • In the ecosystem, the aggregator is a role that attempts to find the cheapest path from the source currency to the target currency when sending funds out to the bridge or retrieving them on the other side. There are various aggregators for cross-chain swaps, same-chain swaps and more.
  • In all of the functions, callers are required to approve() the source tokens to the aggregator contract so it can pull them to perform the swap. However, there is a logic flaw in the contract that can abuse this approval.
  • The callproxy variable can be used to change the routing of where the funds go. In particular, the caller of the contract for the safeTransferFrom() can be changed to be any user! By changing this, the previously approved aggregator will send funds on behalf of another user to you.
  • The bug bounty submission process was quite sad. According to the Immunefi page, the max payout was 400K with a minimum of 100K depending on the current economics of the protocol. The O3 team claimed that since it did not work with a MAX allowance (which was used by default on the frontend), this should be a medium instead of a critical. This shouldn't be the default on the frontend anyway and there are other ways to interact with the system besides the frontend.
  • Trust pushed back at Immunefi, which eventually led to O3 being removed from the platform. He calls out Immunefi for being too lenient on projects violating the SLAs of the platform, which I agree with. Personally, I found the callout a little harsh with "We want each O3 user to know they are trusting a project that gives 0 ***** about their security and more likely than not will be featured at some point on rekt. When that happens, I for one wouldn't be shedding any tears." Running a profitable project is hard; taking 100K from a project could bankrupt it.
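
The allowance-abuse flow described above can be modeled in a few lines. This is a hedged Python sketch of the pattern, not the O3 Solidity code; all class and method names are invented.

```python
# Minimal model of an ERC20-style allowance plus an aggregator whose
# "caller" for the pull is attacker-controlled (the callproxy flaw).
class Token:
    def __init__(self):
        self.balances = {}
        self.allowances = {}            # (owner, spender) -> amount

    def approve(self, owner, spender, amount):
        self.allowances[(owner, spender)] = amount

    def transfer_from(self, spender, owner, to, amount):
        assert self.allowances.get((owner, spender), 0) >= amount
        self.allowances[(owner, spender)] -= amount
        self.balances[owner] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount

class Aggregator:
    def __init__(self, token):
        self.token = token

    # The flaw: `pull_from` should be fixed to the actual caller, but the
    # callproxy-style indirection lets the attacker choose any approver.
    def swap(self, pull_from, to, amount):
        self.token.transfer_from("aggregator", pull_from, to, amount)

token = Token()
token.balances["victim"] = 100
token.approve("victim", "aggregator", 100)    # victim's standing approval
Aggregator(token).swap("victim", "attacker", 100)
print(token.balances["attacker"])             # 100
```

The fix is the obvious one: the address funds are pulled from must be bound to msg.sender, never to a caller-supplied parameter.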

Testing for audits: there is no spoon - 1290

3docsec    Reference → Posted 2 Years Ago
  • Everyone has their own auditing methodology. Read the docs, don't read the docs, start with code, end with ... At the end of the day, the goal is to find all of the bugs. Most important for payouts in contests are the unique findings. This author gives us some insight into their recent success.
  • According to the author, projects with good tests have made them afraid to go hunt for bugs. They thought that if the code was really well tested then no bugs would exist in it. So, they decided to not look at the existing tests written by devs anymore.
  • Now, they write all of their own tests from scratch. They do this by getting a basic setup with a happy path. To them, the debugging is the most important part: they learn about the proper states of the project, right and wrong inputs, and much more about the codebase.
  • With a simple happy path going, they start writing their own tests to check for edge cases in the system. The developers' blind spots are going to be different from yours once you're the one writing the tests. In a week-long audit, they will spend the first 2-3 days writing tests and the rest of the time on code review.
  • I find this methodology to be pretty interesting but frustrating. I personally read the docs to try to understand the purpose and flow of the project first. By doing this, you're essentially reverse engineering a protocol to write tests, which feels wasteful to me.
  • The nice thing is that the uniqueness is what makes this work. If everyone did this then it wouldn't be nearly as effective. In the bug bounty space, niche tactics and weird vulnerabilities are the most important thing. Recently, they had 6 highs, including 2 solos at a Good Entry Code4rena contest. Thanks for the knowledge!

Usurping Mastodon instances - mastodon.so/cial (CVE-2023-42451) - 1289

scumjr    Reference → Posted 2 Years Ago
  • Mastodon is a decentralized Twitter-like replacement. Instead of having a single website, there are multiple servers that are individually run. The instances communicate via HTTP requests with a signature to provide authenticity. The public keys for users can be used to easily verify a user.
  • The signature validation works by fetching this public key and then verifying that the signature matches the user and domain. The search looks for the username (@donald) and the domain (@mastodon.com) to figure out where to query the public key from.
  • However, the parsing of the domain and the username is busted. When parsing the domain, all slashes are removed from it! So, the domain mastodon.so/cial would become mastodon.social when it is parsed. This allows for the spoofing of requests from arbitrary users across different servers.
  • To exploit this, an attacker would need access to a domain that's different from the actual domain but close to it. For instance, a user with mastodon.so could spoof into mastodon.social. They used this to send private DMs as other users, which is pretty fire. Great bug find!
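
The parsing flaw above fits in one line. A sketch (the function name is mine; the slash-stripping behavior is the bug described in the post):

```python
# The buggy normalization: stripping every slash from the parsed domain
# lets "mastodon.so/cial" collapse into "mastodon.social", so material
# served from mastodon.so can masquerade as mastodon.social's.
def parse_domain(domain: str) -> str:
    return domain.replace("/", "")   # slashes silently removed

print(parse_domain("mastodon.so/cial"))  # mastodon.social
```

Since the public key is fetched from whatever domain the parser produces, the attacker's server on mastodon.so ends up answering key lookups for mastodon.social users.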

Unrolling the Scroll: Probing the Security of a ZK Roll Up - 1288

Offside Labs    Reference → Posted 2 Years Ago
  • Scroll is a zero knowledge (ZK) roll up layer 2 blockchain. The idea is to roll up loads of Ethereum transactions on a different blockchain back onto Ethereum. Then, to crank up the privacy, add in a zkEVM to ensure that nobody can see what's going on while everything stays provable.
  • The zk rollup system has three phases:
    1. Transaction execution
    2. Batching and data commitment
    3. Proof generation and finalization
    With this structure, if a batch is unprovable, then it would block the whole network. This could lead to hardforks or other bizarre issues. Lag on provability can cause problems as well.
  • The computational power required to do the proving is so expensive that it was not worth setting up. So, they decided to look into the bus mapping module instead. This is responsible for parsing transaction traces generated by L2Geth and converting them into witness data for the zkEVM. By modifying some tests from the project, they got libFuzzer set up. Eventually, they got a crash from doing this.
  • The bug was an out-of-bounds read within the function get_create_init_code, which can be triggered from the CREATE2 opcode. CREATE2 is used for computing a deterministic address without doing an actual deployment, from 0xFF, the account address, a user-provided salt and the bytecode of the contract being deployed.
  • The crash occurred when the CREATE2 opcode was executed twice. This is interesting because it shouldn't be possible to submit two contracts to the same address. The second creation fails due to the contract address collision. During the revert, the proper memory is not allocated. Since memory expansion occurs but no memory is actually allocated, a crash occurs.
  • All excited, the team decided to test this attack against a local test node. But, nothing happened. Why? Good design by the Scroll team! They added a circuit capacity checker (CCC) to handle these sorts of cases. In particular, if a transaction is too expensive or crashes, then it is rejected. The execution cost is evaluated before execution. So, unknown attacks can be closed out before hitting the platform.
  • Since the bug was not possible to exploit because of the defense-in-depth measures, they didn't get a payout. The Scroll team fixed the memory corruption bug though. To me, if there's a defense-in-depth measure that blocks an attack, a payout should still be made but just a lesser amount. If there's XSS that's blocked by the CSP, you pay for the XSS! The CSP bypass is a separate security finding to me. Regardless, good write up!
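
The CREATE2 address recipe mentioned above (0xFF ++ deployer ++ salt ++ hash(init_code), then keep the low 20 bytes) can be sketched directly. One loud caveat: Python's hashlib has no keccak-256, so sha3_256 stands in here purely to show the layout; real Ethereum addresses require keccak-256.

```python
import hashlib

# Sketch of the CREATE2 derivation: the address is deterministic in the
# deployer, salt and init code, with no deployment needed.
def create2_address(deployer: bytes, salt: bytes, init_code: bytes) -> bytes:
    h = lambda b: hashlib.sha3_256(b).digest()   # stand-in for keccak-256
    preimage = b"\xff" + deployer + salt + h(init_code)
    return h(preimage)[12:]                      # low 20 bytes = address

addr = create2_address(b"\x11" * 20, b"\x00" * 32, b"\x60\x00")
print(len(addr))  # 20
```

Because the same (deployer, salt, init_code) always yields the same address, a second CREATE2 to that address must fail with a collision, which is the revert path the crash lived in.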

Supporting the Smart Contract Vulnerability Research Community - 1287

Chainlink Labs    Reference → Posted 2 Years Ago
  • Chainlink is a network used by many, many blockchains for several things. It provides oracles for prices on tokens, random numbers and much more.
  • As such a major part of the ecosystem, they take security very seriously. They have the best of the best audit their software and have a very big bug bounty program on HackerOne and Immunefi. They've gotten audits from Code4rena and other top firms.
  • Trust (OG auditor) and another researcher, Zach (LSR at Spearbit), found a very niche flaw in the Verifiable Random Function (VRF) system. When generating random numbers, the flow works as follows:
    1. Request randomness to the Chainlink contract. This emits an event that will be acted upon.
    2. A callback from Chainlink is made to deliver the random number with a proof.
  • A subtle but important property is that the random number delivered should be the only one delivered for a request. Why is this important? If a user can force a redraw arbitrarily, then the system becomes unfair. For instance, if a user doesn't like a number, they can just re-request the randomness until it's favorable. With bad setups, this can be an issue with Chainlink.
  • The issue is that the subscription owner role within a Chainlink subscription can block randomness from coming in and then force a redraw. This role is typically reserved for a member of the hosting DApp, making it a very privileged position.
  • The hackers were given a 300K bounty from Immunefi for the critical finding. To me, a privileged role being able to redraw randomness doesn't feel like that big of a finding. However, considering this is Chainlink, which supports many use cases, they want to ensure that in a completely decentralized application a single role cannot abuse Chainlink. Good write up!
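
The unfairness argument above is easy to demonstrate: a role that can block a callback and force a re-request can simply reroll until the draw is favorable. Purely illustrative Python; none of this is the Chainlink API.

```python
import random

# Stand-in for a VRF draw: one number per request.
def vrf_draw(rng):
    return rng.randrange(100)

# An abusive subscription owner blocks every unfavorable callback and
# re-requests randomness until the draw lands in their favor.
def abusive_subscription_owner(rng, favorable=lambda n: n >= 90):
    while True:
        n = vrf_draw(rng)
        if favorable(n):
            return n        # accept this callback
        # otherwise: block delivery and force a redraw

print(abusive_subscription_owner(random.Random(0)) >= 90)  # True
```

A fair consumer would have a 10% chance of landing in [90, 100); the redraw loop makes it a certainty, which is exactly why a single role holding this power breaks the guarantee.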

lateralus (CVE-2023-32407) - a macOS TCC bypass - 1286

Gergely    Reference → Posted 2 Years Ago
  • macOS has too much going on for its own good; there are way too many things to analyze statically. So, the author created a tool to pick out FDA-entitled apps and run a syscall trace on them. When looking for apps reading files and environment variables, he noticed some scary hits. The article is about a scan that led to a bug.
  • The ENV variable MTL_DUMP_PIPELINES_TO_JSON_FILE is a Metal framework variable used by various macOS programs. It makes the application open the file at the given path and write data to it. Pretty simple!
  • How does this work? Courtesy of the fs_usage command:
    1. A file will be opened using the open() syscall on a temporary file.
    2. write() is called to write to this file.
    3. rename() is called on the temporary file to name it back to the path we control.
  • rename() in place is not safe. Why? There's a race condition between the open and the copying of data: a classic time-of-check vs. time-of-use (TOCTOU) bug. By swapping the file for a symlink to something else at the right time, we can cause major havoc!
  • Even better, we can control the log data being written by catching the tempfile creation when it occurs. So, when the renaming occurs, we control the data being written in the file. Between the data controlling and the renaming TOCTOU issue, we can write to an arbitrary location with arbitrary data. Pretty neat!
  • How does the author go about exploiting this?
    1. Create a symlink that points to the Apple TCC directory.
    2. Create a directory at an attacker controlled location.
    3. Set the vulnerable ENV var to a file in our temporary directory with the vulnerable app running.
    4. Catch the open() of the temporary file in the directory and write our malicious TCC database to it.
    5. Switch the information in the symlink over and over again until the execution occurs.
    6. Wait and see if we successfully won the race.
  • With some luck, the TCC.db file was overwritten with our own! It's a pretty slick bug that exploits complexity within the rename syscall. Apple fixed this by removing most of the Metal ENV variables.
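
The TOCTOU window at the heart of this bug can be shown in miniature. A hedged, POSIX-only sketch (no actual TCC involved; all paths are scratch files in a temp directory) where the "attacker" deterministically wins the race between the check and the use:

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
victim_path = os.path.join(workdir, "log.txt")
secret_path = os.path.join(workdir, "TCC.db")   # stand-in for the protected file

open(victim_path, "w").close()
open(secret_path, "w").close()

# 1. Time of check: the writer verifies the path looks like a plain file.
assert not os.path.islink(victim_path)

# 2. The attacker wins the race: swap the file for a symlink to the target.
os.remove(victim_path)
os.symlink(secret_path, victim_path)

# 3. Time of use: the write follows the symlink into the protected file.
with open(victim_path, "w") as f:
    f.write("attacker-controlled data")

print(open(secret_path).read())  # attacker-controlled data
```

In the real exploit the race is probabilistic (hence step 5's "switch the symlink over and over"), and the write lands in the TCC database rather than a scratch file, but the check/use gap being abused is the same.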