Compound Merkle Path Format

Deggen ([email protected]), Damian Orzepowski ([email protected])

Abstract

We propose a binary format for Compound Merkle Paths (hereafter CMP), optimized for minimal data bandwidth during transmission.

This BRC is licensed under the Open BSV license.

Motivation

Current format standards do not cover Merkle paths for multiple txids within the same block. A compound format would reduce the overall size of the data needed to express any set of paths from the same block, because hashes shared between paths are stored only once. The larger the set, the bigger the space saving.

Specification

At each level of the Merkle path, the format provides the sibling of the hash that the verifier already holds or can calculate.

Data Types

| Field | Description | Size |
|---|---|---|
| height | The height of the tree, up to a maximum of 64 | 1 byte |
| nLeaves | VarInt number of leaves at this height | 1-9 bytes |
| offset | VarInt offset from the left-hand side within the tree | 1-9 bytes |
| leaf | Each leaf is a 32-byte hash | 32 bytes |

Formatting Syntax

  1. offset and leaf are repeated nLeaves times at each height.

  2. height does not need to be repeated. It starts as the max height of the tree and is decremented by one each time we reach the end of the current set of leaves. Once height === -1 we stop parsing.

  3. nLeaves is repeated for each height, followed by the corresponding offset and leaf for each.
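
The three rules above can be sketched as a serializer. This is a minimal sketch, not a reference implementation: the names `encodeVarInt` and `serializeCompoundPath` are hypothetical, the input shape (an array of levels in stream order) is an assumption, and VarInt is assumed to mean Bitcoin's variable-length integer encoding, which matches the 1-9 byte size given in the table.

```javascript
// Sketch only: hypothetical helper names, not part of the specification.
// Assumes VarInt is Bitcoin's variable-length integer (0xfd / 0xfe / 0xff prefixes).
function encodeVarInt(n) {
  if (n < 0xfd) return Buffer.from([n]);
  if (n <= 0xffff) {
    const b = Buffer.alloc(3);
    b.writeUInt8(0xfd, 0);
    b.writeUInt16LE(n, 1);
    return b;
  }
  if (n <= 0xffffffff) {
    const b = Buffer.alloc(5);
    b.writeUInt8(0xfe, 0);
    b.writeUInt32LE(n, 1);
    return b;
  }
  const b = Buffer.alloc(9);
  b.writeUInt8(0xff, 0);
  b.writeBigUInt64LE(BigInt(n), 1);
  return b;
}

// `levels` is an array in stream order; each level is an array of
// { offset: number, leaf: Buffer of 32 bytes } entries (assumed shape).
function serializeCompoundPath(levels) {
  const height = levels.length - 1; // rule 2: parsing stops once height reaches -1
  if (height < 0 || height > 64) throw new Error('height must be between 0 and 64');
  const parts = [Buffer.from([height])]; // height is written once, as 1 byte
  for (const level of levels) {
    parts.push(encodeVarInt(level.length)); // rule 3: nLeaves for this height
    for (const { offset, leaf } of level) {
      if (leaf.length !== 32) throw new Error('each leaf must be 32 bytes');
      parts.push(encodeVarInt(offset), leaf); // rule 1: offset then leaf
    }
  }
  return Buffer.concat(parts);
}
```

Writing height only once saves one byte per level, at the cost of requiring every level to be present in the stream, even one with zero leaves.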

Example

Important Note

We must include the txid and offset within a block at height 0. This is the most efficient way to store the index data required to pull out individual paths when given only a txid. In the example below we encode txids at indices 0 and 3:

| index | txid |
|---|---|
| 0 | e86ec5732f55490a73677fe88a37c875cea49f572e4bc822b83fe96093bb008c |
| 3 | 3b5a16dc41bbed3e58ad2a9017fb8954e7541975e2a4f37343761d96f431b3e5 |

By convention, the bytes of a txid hex string are reversed, so these sequences appear in reversed byte order below.
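
The byte reversal can be illustrated with a small helper. The name `reverseHex` is hypothetical and is used here only for illustration.

```javascript
// Sketch only: `reverseHex` is a hypothetical helper name.
// Reverses the byte order of a hex string (two hex characters per byte).
function reverseHex(hex) {
  return Buffer.from(hex, 'hex').reverse().toString('hex');
}

const txid = 'e86ec5732f55490a73677fe88a37c875cea49f572e4bc822b83fe96093bb008c';
console.log(reverseHex(txid));
// prints 8c00bb9360e93fb822c84b2e579fa4ce75c8378ae87f67730a49552f73c56ee8
```

Applying the helper twice returns the original string, so the same function converts in both directions.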

Hex

Bytewise Breakdown

Implementation

Let's start by dumping this format as hex into a Buffer and parsing it with a Buffer Reader to construct an object.
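
A minimal sketch of this parsing step follows, using Node's built-in Buffer with a manual read position in place of a dedicated Buffer Reader. The names `readVarInt` and `parseCompoundPath`, and the shape of the returned object, are assumptions made for illustration. VarInt is assumed to be Bitcoin's variable-length integer encoding.

```javascript
// Sketch only: hypothetical names; the returned shape is an assumption.
// Reads a Bitcoin-style VarInt at `pos` and reports where the next field starts.
function readVarInt(buf, pos) {
  const first = buf.readUInt8(pos);
  if (first < 0xfd) return { value: first, next: pos + 1 };
  if (first === 0xfd) return { value: buf.readUInt16LE(pos + 1), next: pos + 3 };
  if (first === 0xfe) return { value: buf.readUInt32LE(pos + 1), next: pos + 5 };
  return { value: Number(buf.readBigUInt64LE(pos + 1)), next: pos + 9 };
}

function parseCompoundPath(hex) {
  const buf = Buffer.from(hex, 'hex');
  let pos = 0;
  let height = buf.readUInt8(pos++); // 1-byte height, present only once
  const levels = [];
  while (height !== -1) { // stop once height has been decremented to -1
    let r = readVarInt(buf, pos);
    const nLeaves = r.value;
    pos = r.next;
    const leaves = [];
    for (let i = 0; i < nLeaves; i++) {
      r = readVarInt(buf, pos);
      const offset = r.value;
      pos = r.next;
      if (pos + 32 > buf.length) throw new Error('truncated leaf');
      // The leaf is kept in stream byte order; no reversal is applied here.
      const leaf = buf.subarray(pos, pos + 32).toString('hex');
      pos += 32;
      leaves.push({ offset, leaf });
    }
    levels.push({ height, leaves }); // levels are kept in the order they were read
    height--;
  }
  return levels;
}
```

Each returned level carries the value of the height counter at the time it was read, so a caller can regroup the leaves by height without re-reading the stream.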

JSON Encoding of a Compound Merkle Path

If we JSON encode the leaves we get the following. Height is encoded as the position of the leaf object within the outermost array.

You can derive individual paths for particular transaction indices as necessary using the following algorithm:

Reading index 3 from the Compound Merkle Path.

Which yields:

It is important to understand that we only kept the paths for indices 0 and 3 in this example. If you attempt to run the algorithm above for any other index, an error will be thrown. This allowed us to keep 7 hashes out of 14 in total. Keeping each path separately, we would have to keep a minimum of 6 hashes; if we added another index, the compound method would require only one more hash, whereas saving individual paths would require another 3. The total bandwidth saving would be significant if we had thousands of transactions all in the same block.
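
The derivation described above can be sketched as follows. This is an illustration under stated assumptions: `derivePath` is a hypothetical name, the compound path is assumed to be an array indexed by height in which each element is an object mapping offset to hash, and the array is assumed to hold one object per level below the root. The hash values are placeholder labels, not real hashes.

```javascript
// Sketch only: hypothetical function name and assumed input shape.
// compoundPath[h] maps offset -> hash at height h; height 0 holds the txids.
function derivePath(compoundPath, index) {
  if (!Number.isInteger(index) || !(index in compoundPath[0])) {
    throw new Error(`index ${index} is not present at height 0`);
  }
  const path = [];
  for (let height = 0; height < compoundPath.length; height++) {
    const offset = Math.floor(index / 2 ** height); // our node's position at this height
    const sibling = offset % 2 === 0 ? offset + 1 : offset - 1;
    const hash = compoundPath[height][sibling];
    if (hash === undefined) {
      throw new Error(`no hash stored at height ${height}, offset ${sibling}`);
    }
    path.push(hash);
  }
  return path;
}

// Placeholder labels stand in for 32-byte hashes.
const exampleCompoundPath = [
  { 0: 'txid0', 1: 'leaf1', 2: 'leaf2', 3: 'txid3' },
  { 0: 'node1_0', 1: 'node1_1' },
  { 1: 'node2_1' },
];
console.log(derivePath(exampleCompoundPath, 3)); // [ 'leaf2', 'node1_0', 'node2_1' ]
```

Calling `derivePath(exampleCompoundPath, 5)` throws, because nothing is stored for index 5 at height 0.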

Merkle Proof

An individual path is used to prove inclusion in a block: running a Merkle proof algorithm on the txid, index, and path yields a Merkle root hash. This root can then be used as the key in a block header lookup to determine whether the txid is included in a block that is part of the longest chain of work.
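
A minimal sketch of such a Merkle proof calculation is shown below. The names `sha256d` and `merkleRoot` are hypothetical. The sketch assumes that the txid and path hashes are supplied as hex strings in display order and reversed into byte order before hashing, and it relies on Bitcoin's use of double SHA-256 over the concatenated left and right child hashes. The block header lookup itself is outside the scope of the sketch.

```javascript
const crypto = require('crypto');

// Sketch only: hypothetical helper names.
// Double SHA-256, as used for Bitcoin Merkle tree nodes.
function sha256d(buf) {
  const once = crypto.createHash('sha256').update(buf).digest();
  return crypto.createHash('sha256').update(once).digest();
}

// txid and path entries are hex strings in display order (assumed);
// path[0] is the sibling at height 0, path[1] at height 1, and so on.
function merkleRoot(txid, index, path) {
  let working = Buffer.from(txid, 'hex').reverse();
  let offset = index;
  for (const siblingHex of path) {
    const sibling = Buffer.from(siblingHex, 'hex').reverse();
    // An even offset means our node is the left child; odd means the right child.
    const pair = offset % 2 === 0 ? [working, sibling] : [sibling, working];
    working = sha256d(Buffer.concat(pair));
    offset = Math.floor(offset / 2);
  }
  return Buffer.from(working).reverse().toString('hex');
}
```

The returned string is in the same display order as the input txid, so it can be compared directly with a Merkle root written in that convention.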
