About Project Blog Resources

Resources
People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Weaknesses in Bitcoin’s Merkle Root Construction- 1474

Linux Foundation Posted 1 Year Ago

The Merkle root construction of Bitcoin takes a double SHA256 hash of each transaction as a leaf. Then, each hash is concatenated and hashed until we reach the top. If there are an odd number of leaves at a level in the tree then the final element is duplicated.
CVE-2012-2459 exposes an issue with the odd number part of this. This is best done by example: let's take a tree with [1,2,3,4,5,6]. After [5,6] is hashed, it will need to be duplicated then hashed. Functionality, this is the same as [1,2,3,4,5,6,5,6]! There is a really good image of this in the paper.
Another issue is the lack of domain separators between nodes and leaves. A non-leaf node in the Merkle tree is a hash of a 64 byte input, which is the same as two hashes of its children. Meaningwhile, a leaf node is the has of a transaction. If the serialization of the transaction is 64 bytes, it can be impossible to distinguish between a leaf node and a non-leaf node.
With this knowledge, it begs the question: how much work is required to produce a 2 transaction block such that the merkle root is the hash of a 64 byte input? After showing which bytes are controlled and which are not, this requires about 8 bits of work in the first 32 bytes and 22 bits of work on the final 32 bytes, which is doable!
An SPV proof provides the client with a path through the Merkle tree; from the root down to the transaction. The SPV client also does not know how many TXs are in the block. So, if we can construct a transaction such that the second 32 bytes collide with the hash of a fake transaction, we can include this fake TX in the block.
Although this seems crazy initially, it's not. It's reasonable to have the second 32 bytes of a TX (with much control over it) be the output with a pre-calculated hash that was made. As a result, the TX we WANT to include will be included unintentionally. This requires 81 bits of work, which is much less than the 128 bits expected.

Maxwell Dulin

About Project Blog Resources

Resources People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Weaknesses in Bitcoin’s Merkle Root Construction- 1474

Resources
People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!