Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Reptar- 1285

Tavis Ormandy    Reference →Posted 2 Years Ago
  • The rep movsb instruction is a super common way to move around memory in x86. The destination, direction and amount are all set in this call, but the processor does stuff under the hood.
  • In x86, the instruction decoding is very relaxed. Sometimes, compilers use redundant prefixes to pad a single instruction to get a nice alignment boundary. There are several prefixes that can be used, such as rex, vex and evex. On i386, there were only 8 general-purpose registers, which were encoded in the instruction. When this was doubled to 16 registers, there was nowhere left in the encoding to put the extra bits.
  • So, the rex prefix adds an additional byte to the beginning of the instruction to encode this information. If this is found before an instruction that doesn't need it, like movsb, then it's silently ignored. Well, in most cases. The exception involves fast short repeat move (FSRM), a feature that is all about moving small (less than 128 bytes) strings quickly.
  • To test for architecture level issues, the author of this post uses a technique called Oracle Serialization. This generates two forms of the same program, where one is transformed to include microarchitectural changes like fencing instructions. If the final state of the two forms differs, then something weird has happened.
  • While fuzzing using this technique, they noticed that adding redundant rex.r prefix instructions to an FSRM optimized operation caused unpredictable results. For instance, branches to random locations, branches being ignored and many other weird things. Somehow, this had corrupted the state.
  • Within a few days, they found out that triggering this on multiple cores led to exceptions and halts. Within an unprivileged guest VM, this could be used to crash the computer! So, what's going on?
  • The CPU has two main components: frontend and backend. The frontend fetches, decodes and generates the micro-ops for the backend to execute. The backend then executes these instructions. The author of the post thinks that there is a miscalculation in the movsb instruction size, which leads to extra backend entries being processed.
  • Is this exploitable? Probably! However, there is no insight into what's being processed under the hood. So, the information above is just a guess from the author. Awesome post once again!
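
The Oracle Serialization idea above can be sketched with a toy model. This is not the author's fuzzer: the 4-register machine, the `fence` pseudo-instruction and the simulated bug (an `inc` dropped when it directly follows an `add`) are all assumptions made up for illustration. The point is only the comparison: run a program as-is, run it again with a fence after every instruction, and flag any difference in final state.

```python
import random

FENCE = ("fence", 0, 0)  # hypothetical serializing no-op

def run(program, buggy=False):
    """Execute a toy 4-register program and return the final register state."""
    regs = [0, 0, 0, 0]
    prev_op = None
    for op, dst, src in program:
        if op == "add":
            regs[dst] = (regs[dst] + regs[src]) & 0xFF
        elif op == "inc":
            # Simulated hardware bug: an inc issued right after an add is dropped.
            if not (buggy and prev_op == "add"):
                regs[dst] = (regs[dst] + 1) & 0xFF
        # "fence" changes no architectural state.
        prev_op = op
    return regs

def serialize(program):
    """Return the same program with a fence inserted after every instruction."""
    out = []
    for ins in program:
        out.append(ins)
        out.append(FENCE)
    return out

def states_differ(program, buggy):
    """True when the plain and serialized runs end in different states."""
    return run(program, buggy) != run(serialize(program), buggy)

def random_program(rng, length=20):
    """Generate a random sequence of add/inc instructions."""
    return [(rng.choice(["add", "inc"]), rng.randrange(4), rng.randrange(4))
            for _ in range(length)]
```

On a correct machine the two runs always agree, so any mismatch points at the hardware rather than the program.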

Retrospecting Unhealthy Order Allowance Vulnerability in Perpetual Protocol- 1284

ChainLight    Reference →Posted 2 Years Ago
  • Perpetuals are a type of futures contract with no expiry date that lets traders speculate on the price of an asset. The price can either be bet on going up or down. The vulnerability is in the calculation process of the indexing of the pricing.
  • Typically, the index price refers to the average price of the underlying asset. The mark price is the current price offered by the exchange on the future being traded. Within the Perpetual Protocol, these are different though. Index price is the current value of the spot asset, which uses a TWAP. The Mark Price is the most recently traded price value.
  • In futures exchanges, the size of the position is limited by the user's initial margin (the collateral placed in). Otherwise, the user would have bad debt, leading to insolvency in the protocol. So, any vulnerability that can create loads of bad debt is serious.
  • Perpetual protocol did not control this with the method above. Instead, they calculated the value of all positions and allowed orders based upon their index price. Since the index price is somewhat manipulable, this becomes a problem! Raising the price, shorting, then dropping the price could lead to large losses in the protocol.
  • How feasible is it to manipulate a pool? They looked at many of the pools and determined that the vMATIC-vUSD was likely the most manipulatable. The process for hitting this issue is fairly complicated with four accounts. Here's how it goes.
  • First, account 0 creates a massive sell of spot tokens to drive the mark price down to 0.8A. The maximum allowed price change is 20%, due to some existing defense in depth measures.
  • Second, account 1 opens up a short position at 1.2A, again at the maximum 20% deviation. At this point, account 2 opens a long position that moves the price from 0.8A up to the maximum of 1.2A through a massive purchase of the spot token. In this step, a very large unrealized profit is generated for account 2.
  • Account 3 opens a long taker at the price of 1.2A as a counterparty for account 1, executing the malicious short taker order at this price. Account 2 closes its long position to realize its profits. To me, the key is that the price is manipulable, which results in a positive gain from both the long and the short. Doing this over and over again (once per minute) could have stolen most of the money from the protocol.
  • On Immunefi, the mediation process went south. The reasoning from Perpetual Protocol didn't make any sense and they offered 5K for a medium instead of 250K for a critical. Eventually, after months of work, they moved this to a critical with a 10K bounty. It seems like specific market conditions had to be met for this to work but I don't fully understand them.

LayerZero's Cross-Chain Messaging Vulnerability- 1283

Heuss    Reference →Posted 2 Years Ago
  • LayerZero is a universal cross chain messaging (CCM) protocol. By having a LayerZero smart contract deployed on a chain, assets can be transferred between chains. A relayer is an entity that submits cross chain transactions. A particular user application can choose the relayer and oracle they would like to use.
  • The author goes through a previous vulnerability within LayerZero. In this post, the author noticed that when emitting an event the relayer address was not being included. So, they were curious if there were implications for this.
  • By setting the price fee to be 0 on their own oracle/relayer then switching to the default LayerZero versions, zero fees will be incurred. Being able to not pay fees is bad. Interesting bug by itself!
  • While trying to address this issue, a modification was made to the protocol. When the setConfig() function is called to update the oracle/relayer information, the relayer refrains from relaying messages in that same block.
  • The relayer failed to check who sent the update to the configuration. If an attacker sends a message to a regular UA, and then a malicious UA calls setConfig() within the same transaction, the message will not be relayed.
  • This is very severe! The outbound nonce is incremented with each message, making it impossible to get the message relayed later. Originally, they reported this as a critical vulnerability. However, the development team has a way to force send transactions that were not originally relayed, bumping this down to a medium severity bug.
  • A good design from the LayerZero team! I like to think they thought of threats and came up with solutions for many of these threats. In this case, they probably thought about a relayer not processing a message and dropping it entirely, then built out this solution. Smart design of protocols creates situations where critical vulnerabilities are now recoverable, which is super cool.
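
The same-block setConfig() issue above can be modeled in a few lines. This is a sketch, not LayerZero's relayer: the `Event` type, its field names and both relay functions are hypothetical, and `relay_fixed` shows only one plausible check (matching the UA), not necessarily the fix that shipped.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    block: int
    kind: str  # "message" or "set_config"
    ua: str    # user application that emitted the event

def relay_buggy(events):
    """Skip every message in a block where *any* UA changed its config."""
    config_blocks = {e.block for e in events if e.kind == "set_config"}
    return [e for e in events
            if e.kind == "message" and e.block not in config_blocks]

def relay_fixed(events):
    """Skip a message only if the *same* UA changed its config in that block."""
    changed = {(e.block, e.ua) for e in events if e.kind == "set_config"}
    return [e for e in events
            if e.kind == "message" and (e.block, e.ua) not in changed]
```

With a victim message and an attacker's config change in the same block, the buggy version drops the victim's message while the fixed version still relays it.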

Optimism Censorship Bug Disclosure- 1282

iosiro    Reference →Posted 2 Years Ago
  • Optimism is an L2 blockchain. The idea is that Ethereum is too slow and too expensive. So, if we roll up a large number of transactions into a single transaction sent to Ethereum, the gas cost can be shared between them, making it much cheaper.
  • A sequencer is a program that takes in the proposed transactions and submits them to Ethereum. Of course, lots of proofs and things are done prior to this. In front of the sequencer is a load balancer that rate limits the traffic coming in.
  • To detect the number of attempts, the rate limiting is calculated based upon the signed transactions per account within a given time window. To prevent censorship, the transactions are discarded if the nonce is lower than the account's current nonce. Source IP rate limiting is done as well.
  • Rate limiting is a great feature to prevent network spamming. However, this has logic that can be flawed as well. If developers are not careful then this feature can be used against the system. In this case, the program was not checking the chain id!
  • So, if the other chain had a nonce that was higher than Optimism, then it was valid for the rate limiting. Down the road, EIP-155 would reject the transaction though. Regardless, it would still trigger the rate limiting functionality. By taking transactions from another chain, a user could be arbitrarily rate limited indefinitely.
  • Specific accounts in the network have special permissions or are really important to other parts of the ecosystem. LayerZero being taken down, censoring of bridges, ProxyAdmin changes and many, many things would be broken. Additionally, this could allow for strange edge cases in the system by choosing when transactions go through and when they don't.
  • The authors of this report rated this as critical. Considering that any user could prevent any transaction, I understand that. However, this would be identified and fixed within a few days after reviewing the logs of the proxy. The Optimism team decided this was a medium risk finding in the end. Sadly, this was marked as out of scope, which I hate.
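
A minimal sketch of the missing chain id check described above. The `Tx` and `RateLimiter` classes are hypothetical and the time window is left out for brevity; only the chain id value (10 for Optimism mainnet) is a real constant. The sketch shows how replayed transactions from another chain can burn a victim's quota when the limiter ignores the chain id.

```python
from collections import defaultdict
from dataclasses import dataclass

OPTIMISM_CHAIN_ID = 10  # Optimism mainnet chain id

@dataclass
class Tx:
    sender: str
    nonce: int
    chain_id: int

class RateLimiter:
    """Per-sender counter (time window omitted for brevity)."""

    def __init__(self, limit, check_chain_id):
        self.limit = limit
        self.check_chain_id = check_chain_id
        self.seen = defaultdict(int)

    def allow(self, tx, current_nonce):
        if self.check_chain_id and tx.chain_id != OPTIMISM_CHAIN_ID:
            return False  # rejected without charging the sender's quota
        if tx.nonce < current_nonce:
            return False  # stale transactions are discarded without counting
        self.seen[tx.sender] += 1
        return self.seen[tx.sender] <= self.limit
```

Anyone can copy a victim's signed transactions from another chain, so without the check the attacker never needs the victim's key to exhaust the quota.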

Oh-Auth - Abusing OAuth to take over millions of accounts- 1281

Aviad Carmel - Salt Labs    Reference →Posted 2 Years Ago
  • OAuth (Open Authorization) is a standard authorization protocol. It is used all over the place with SSO providers to allow for a trusted entity, like Google or Facebook, to authenticate you to other sites. However, there are many footguns with this.
  • The flow for OAuth is as follows:
    1. User tries to login to some site. It wants proof of their identity via an SSO provider, like Facebook. So, a redirect is made there.
    2. Upon redirecting with a logged in SSO provider user, a secret is passed to the user. They are then redirected back to the main website.
    3. The website will take the secret and communicate with the SSO provider to confirm who the user is. Now, the identity can be used to get information like their email and other things from the SSO provider.
  • On Vidio, there was an issue with the verification of the access token for the redirect back to the main website. When Vidio would make a request to Facebook (the SSO provider in this case), there is an app identifier (AppID) for each app. However, it is the responsibility of the website (not Facebook) to ensure that the token belongs to their app.
  • So, by providing a token from another Facebook app that the user controlled, they can return an arbitrary email, which results in an account takeover. This same attack method worked on bukalapak as well. In the case of Grammarly, they used an auth flow that was not vulnerable to this issue by default. By brute forcing parameters they were able to find a flow that was vulnerable to the method mentioned above.
  • The SSO providers have custom attacks, which is super interesting. To me, it makes sense to force the app developer to specify the AppID instead of requiring manual verification; this is done on one of the Facebook flows already. Considering this, I'm sure many other providers and websites are vulnerable to this attack. Good vulnerability description!
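
The missing AppID check above can be sketched as follows. The `token_info` dictionary stands in for whatever the SSO provider returns when a site inspects a token; its field names (`is_valid`, `app_id`, `email`) and the `MY_APP_ID` value are assumptions for illustration, not a specific provider's API.

```python
MY_APP_ID = "1111"  # hypothetical AppID registered for our own site

def login_buggy(token_info):
    """Trust any valid token, regardless of which app it was issued to."""
    if not token_info["is_valid"]:
        return None
    return token_info["email"]

def login_fixed(token_info):
    """Also require that the token was issued to our own app."""
    if not token_info["is_valid"] or token_info["app_id"] != MY_APP_ID:
        return None
    return token_info["email"]
```

A token minted for an attacker-controlled app is perfectly valid from the provider's point of view, so validity alone says nothing about whether the token was meant for this website.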

A short note on AWS KEY ID- 1280

Tal Be'ery    Reference →Posted 2 Years Ago
  • With AWS access keys, there are two mandatory parts: the key id and the secret key. The format of the AWS access key is actually predictable, which is super interesting!
  • The first four characters are a prefix for the type of key. This depends if it's for a role, a certificate, a regular access key or something else.
  • After this, there are 16 characters. If you base32 decode this you end up with 10 bytes. The account ID is encoded within the first 5 bytes of this but shifted by one bit. The author wrote a script that decodes the account given the key.
  • The remaining 5 bytes are still unknown. I'm guessing it's random data to ensure that the key is unique.
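
A minimal sketch of the decoding described above. The exact mask and shift are my reading of the layout (skip the top bit, then take 40 bits of account ID) and should be treated as an assumption. `fake_key_for_account` is a hypothetical helper that exists only so the sketch can be checked with a round trip; it does not produce real credentials.

```python
import base64

ACCOUNT_MASK = 0x7FFFFFFFFF80  # drops the top bit and the low 7 bits of 48

def account_id_from_key_id(key_id):
    """Recover the 12-digit account ID from a 20-character access key ID."""
    body = key_id[4:]                     # strip the 4-character type prefix
    raw = base64.b32decode(body)          # 16 base32 characters -> 10 bytes
    head = int.from_bytes(raw[:6], "big")
    return f"{(head & ACCOUNT_MASK) >> 7:012d}"

def fake_key_for_account(account_id, prefix="AKIA"):
    """Hypothetical inverse, used only to test the decoder."""
    head = (int(account_id) << 7).to_bytes(6, "big")
    return prefix + base64.b32encode(head + bytes(4)).decode()
```

For example, `account_id_from_key_id(fake_key_for_account("123456789012"))` gives back the same account ID.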

Helping Secure BNB Chain Through Responsible Disclosure- 1279

Felix Wilhelm    Reference →Posted 2 Years Ago
  • The BNB Beacon Chain is the governance and staking layer of the BNB Chain. They use a fork of the Cosmos SDK with many modifications.
  • One of the more sensitive parts is the coin type. The original Cosmos SDK uses a safe bigInt wrapper instead of native types. However, in the fork, they use the int64 type for efficiency reasons. Because of this, integer overflows and underflows are possible when not checked.
  • The message MsgSend is used for simple token transfers and supports multiple outputs. To prevent theft, a loop is performed to ensure that the user possesses enough for the amount being sent. Verification is done to ensure that the inputs of the system match the outputs of the system.
  • Using integer overflows, the verification above is trivial to bypass. In particular, we can send out way more tokens than we own by making the inputs and outputs match after the overflow. This results in the ability to create tokens out of thin air, breaking the blockchain's security.
  • The solution was to patch their fork of the library to not allow overflows in the future. Overall, a fairly simple vulnerability in a popular project.
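
The overflow above can be reproduced by simulating int64 wraparound in Python, whose integers are otherwise unbounded. This is a sketch of the bug class rather than the BNB code: the function names and the shape of the validation are assumptions.

```python
INT64_MAX = 2**63 - 1

def wrap_int64(x):
    """Reduce an integer to the value a wrapping int64 would hold."""
    return (x + 2**63) % 2**64 - 2**63

def sum_int64(amounts):
    """Sum amounts the way unchecked int64 addition would."""
    total = 0
    for a in amounts:
        total = wrap_int64(total + a)
    return total

def validate_buggy(inputs, outputs):
    """Each amount is positive and the wrapped totals match."""
    ok = all(0 < a <= INT64_MAX for a in inputs + outputs)
    return ok and sum_int64(inputs) == sum_int64(outputs)

def validate_checked(inputs, outputs):
    """Same check, but reject any total that does not fit in an int64."""
    if not all(0 < a <= INT64_MAX for a in inputs + outputs):
        return False
    if sum(inputs) > INT64_MAX or sum(outputs) > INT64_MAX:
        return False
    return sum(inputs) == sum(outputs)
```

With one input of 1 and outputs of `[INT64_MAX, INT64_MAX, 3]`, the wrapped output total is also 1, so the buggy check passes even though far more than 1 token is being sent out.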

Rate manipulation in Balancer Boosted Pools — technical postmortem- 1278

Juani    Reference →Posted 2 Years Ago
  • Balancer V2 is a key automated market maker (AMM) protocol with lots of interesting functionality. Within V2, an arbitrary contract is capable of being a pool on top of the vault; this is to maximize innovation and flexibility. The batchSwap() function can be used to perform multiple swaps atomically to get the best path. This also enables a flash swap, since the funds only have to be paid for at the end.
  • Balancer was trying to be as capital efficient as possible. In a pool, the ratio is what calculates the price of the token. To have stability, lots of tokens are required; this ends up with a large amount of idle tokens that are doing nothing useful. They tried a few different things to fix this but settled on linear/boosted pools.
  • Lending protocols have an underlying token in exchange for the platform/LP token, such as aTokens on Aave. From this, users earn yield for liquidity. Since these are always rebasing, a wrapped version of these tokens in a Balancer vault is used. To prevent the constant wrapping and unwrapping of assets, they created linear pools.
  • Within a linear pool, there is a third token: the Balancer Pool Token (BPT). Pools that contain tokens that are also pools themselves are called composable or recursive. What's cool about this is that the BPT can be swapped like any other token in the pool itself.
  • Now, for the vulnerability! The issue existed in the common library ScalingHelpers. In DeFi security, the rounding direction is critical to get right for security. Any rounding errors should always favor the pool. To be efficient, they decided to always round down and expect that the consequence would be minimal.
  • This was true for everything except linear pools. A bunch of crazy things coming together made the rounding error significant:
    • Linear pools have zero fees when balanced and no minimum balances.
    • Initialized with pre-minted BPT, creating a near infinite supply. This is available for flash swap operations.
  • Why is this significant? The batch swaps settle at the end. Individual swaps perform calculations on scaled balances - including rates - which depend on the intermediate pool state. Since this doesn't work based upon the vault's pool balances, the math is deeply affected. There is a quick and dirty attack path:
    • Borrow BPT via a flash swap with the rate slightly greater than 1 and trade it for main and wrapped to reduce the token balances to near zero.
    • Craft a trade that exploits the rounding error from above to make the total balance equal the virtual supply. This will reset the rate to 1.
    • Repay the flash swap at a new lower rate for a profit.
    This can also be done on the main and wrapped tokens.
  • This is where the story gets wild. While trying to get people to take the funds out of the paused and affected pools, a different vulnerability was found! The exploit from before dropped the rate; they found a rounding error to increase the rate. This exploits some decimal precision and rounding issues described above. When the rate is high, the BPT trades at a premium within the Composable Stable pool.
  • Raising the rate was fine because they couldn't get the rate back to profit from it; this was discussed during the design. However, the attacker found a way to drop the rate back down. During the initialization stage of a pool, the check is that the total supply is zero. By using the methods from above, this condition is possible to hit, recreating the initialization scenario. With this, the attacker could profit from the attack.
  • Trying to mitigate this was interesting: it's tough being a "fully decentralized" protocol. You want to be able to shut stuff off but you shouldn't be able to with full decentralization. Some items had a pause function, some were upgradable and some had a recovery mode. But, these weren't implemented in everything.
  • At the end of the article, the author reflects on many things. First, they had several audits done and this bug lurked for over 2 years without getting discovered by a whitehat. The complexity of the protocol became too much. In particular, the bootstrapping of functionality over and over again in strange ways. Overall, a fascinating postmortem about an immensely important protocol.
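
The rounding-direction point above is easy to see with 18-decimal fixed-point math. This is a generic sketch, not Balancer's ScalingHelpers: the helper names and the example values are assumptions. It shows that a one-unit rounding error is negligible on large balances but becomes a large fraction once balances are pushed near zero.

```python
ONE = 10**18  # 1.0 in 18-decimal fixed point

def mul_down(a, b):
    """Fixed-point multiply, rounding toward zero."""
    return a * b // ONE

def mul_up(a, b):
    """Fixed-point multiply, rounding up."""
    return (a * b + ONE - 1) // ONE

def rounding_gap(balance, rate):
    """Difference between rounding up and rounding down for one scaling."""
    return mul_up(balance, rate) - mul_down(balance, rate)
```

Scaling a balance of 3 units by a rate of 0.5 gives 1 when rounded down and 2 when rounded up, a 100% difference, while the same operation on `3 * ONE` has no gap at all.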

Understanding the Astrid Finance Exploit - 1277

Neptune Mutual    Reference →Posted 2 Years Ago
  • Astrid Finance is a liquid staking protocol built on top of the EigenLayer. Users deposit tokens to receive back liquid staking tokens. The earnings are compounded and distributed back to the stakers.
  • After depositing funds into the protocol, a user is able to call withdraw(). Sadly, the token contract was not validated for being used by the protocol. Instead, only the existence of this was checked.
  • Since these were not validated, an attacker was able to send in their own set of tokens that had no value in them. By using these fake tokens, the protocol assumed it was getting a fair exchange. In reality, the withdraw stole all stETH, rETH and cbETH from the protocol, worth around $228K.
  • With some drama, they blamed the auditor for recommending a bad fix to them. However, I didn't see the audit being public and this is so obvious the devs should know better.
  • They offered the attacker a 20% bounty if they returned the rest of the funds. This actually happened. Is this the precedent we want to set though? Hey! If you hack us, we will give you some percentage of the money and not pursue legal action? Not a good move to me. More blackhats may come out of this.

Aztec Connect Claim Proof Bug- 1276

Aztec    Reference →Posted 2 Years Ago
  • Aztec Connect is a privacy zkRollup blockchain used for DeFi. One of the novel features is the ability to send funds between the contracts to the L1 privately.
  • At a high level, here's how the protocol works:
    1. User submits a DeFi interaction deposit proof to the mempool. The proof keeps the identity private.
    2. The sequencer groups these interactions from the same protocol together then rolls them up to be sent.
    3. Once receiving the rollup proof, the smart contract on Ethereum will act on behalf of users to exchange and perform other operations.
    4. On the next rollup the sequencer completes special claim proofs, splitting the newly received tokens between users.
  • The program is written using zero knowledge circuits. Because this runs over a finite field and not integers, this makes trivial math complicated to perform. Aztec Connect uses TurboPlonk to create the connected gates.
  • How do these circuits work? Simple math properties over a finite field must be discussed first:
    • Addition: Add a value and wrap around if necessary.
    • Multiplication: Multiply a value and wrap around if necessary.
    • Negation: Finding the value that sums to 0 with the original within the finite field. For instance, if the field is length 5 and the value I have is 2, then the negation would be 3, since 2+3=0 mod 5.
    • Inverse: The element that gives 1 when multiplied by the original. For instance, 4 is its own inverse since 4*4=1 mod 5.
    • Subtraction: Add in the negated element.
    • Division: Multiplication by the inverse.
  • There are two main parts to a gate: selectors and witness values. Selectors are chosen by the circuit writer to define the logic of the circuit. The witness values are the intermediate states of the circuit. The w values are connected together within the circuit. The gate turns into the following:
    qm * wl * wr + q1 * wl + q2 * wr + q3 * w0 + qc = 0 mod p
    
  • If we wanted to show that y = 4x^3 + 2, then we need two gates: first wl * wr - w0 = 0 mod p (to get x^2), then 4 * wl * wr - w0 + 2 = 0 mod p. All in all, we don't need to prove the computation - we need to prove the witness value.
  • The amount should be user_output = total_output * (user_input/total_input) for a given trade. The result of this division will be floored in most cases. So, the circuit tries to find the division remainder to give to the user as well. While trying to do this, they divided up the number into limbs (sections?). By doing this, the 1 to 1 correspondence modulo p was lost! This means that any multiple of p added to a given value was still valid.
  • On top of this, the constraints for the remainder did not exist. According to the authors of the post, this could have let the sequencer create proofs that would assign the depositor much less funds than they should receive.
  • Overall, a good description on ZK circuits (which I don't fully get yet) and on missing constraints causing problems. Math is hard enough when there aren't millions of dollars at stake. Good find!
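
The field operations and the gate equation above can be checked with a few lines of Python. This is a sketch under stated assumptions: `P = 101` is a small prime standing in for the real field modulus, and the helper names (`neg`, `inv`, `gate_holds`, `witness_for_y`) are made up for illustration. `pow(a, -1, p)` needs Python 3.8 or later.

```python
P = 101  # small prime standing in for the real field modulus (assumption)

def neg(a, p=P):
    """Additive inverse: a + neg(a) = 0 mod p."""
    return (-a) % p

def inv(a, p=P):
    """Multiplicative inverse: a * inv(a) = 1 mod p."""
    return pow(a, -1, p)

def gate_holds(qm, q1, q2, q3, qc, wl, wr, wo, p=P):
    """Check qm*wl*wr + q1*wl + q2*wr + q3*wo + qc = 0 mod p."""
    return (qm * wl * wr + q1 * wl + q2 * wr + q3 * wo + qc) % p == 0

def witness_for_y(x, p=P):
    """Compute y = 4x^3 + 2 and check both gates against the witness."""
    x2 = x * x % p
    y = (4 * x2 * x + 2) % p
    gate1 = gate_holds(1, 0, 0, -1, 0, x, x, x2, p)   # x * x - x2 = 0
    gate2 = gate_holds(4, 0, 0, -1, 2, x2, x, y, p)   # 4 * x2 * x - y + 2 = 0
    return y, gate1 and gate2
```

For x = 3 this gives y = 110 mod 101 = 9 with both gates satisfied, while a wrong witness such as y = 10 fails the second gate.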