BSV Unified Merkle Path (BUMP) Format
Darren Kellenschwiler ([email protected]), Deggen Tone Engel ([email protected]), TonesNotes Ty Everett ([email protected]), Damian Orzepowski ([email protected]), Arkadiusz Osowski ([email protected])
Abstract
We propose the BSV Unified Merkle Path (BUMP) format, with both a binary and a JSON encoding. It is optimized for generation by transaction processors, and is also convenient for proof-validating clients.
At a high level the format encodes a number of txids which all exist within one particular block, along with each of their merkle paths and the blockHeight.
The blockHeight is encoded first, followed by level 0 of the Merkle tree, which includes the txids of interest and their corresponding siblings. Thereafter we encode each higher level of the tree, but only include the branches which are required to calculate the Merkle root from the txids which are of interest to us. For example, if we only have one txid of interest, we will include it and its sibling, followed by one leaf per level of the tree.
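The single-txid case above can be sketched in a few lines of JavaScript. The helper name `requiredLeaves` is hypothetical and not part of this specification; it assumes offsets are counted from the left at each level, so the sibling at a given height is found by flipping the lowest bit of the working position.

```javascript
// Hypothetical helper (not part of the spec): list the leaves a BUMP must
// carry to prove one txid at position `txOffset` in a tree of `treeHeight` levels.
function requiredLeaves(txOffset, treeHeight) {
  const needed = [{ height: 0, offset: txOffset }] // the txid itself
  for (let height = 0; height < treeHeight; height++) {
    // Position of the working hash at this height, with the lowest bit
    // flipped to select its sibling. Bitwise operators limit this sketch
    // to offsets below 2^31.
    const sibling = (txOffset >> height) ^ 1
    needed.push({ height, offset: sibling })
  }
  return needed
}

console.log(requiredLeaves(5, 3))
```

For a txid at offset 5 in a tree of height 3, this lists offsets 5 and 4 at height 0, offset 3 at height 1, and offset 0 at height 2: the txid, its sibling, and then one leaf per level.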
Copyright
This BRC is licensed under the Open BSV license.
Visualization
BUMP Showcase can help form an understanding of how BUMP works to encode all necessary data.
Motivation
Several formats have made their own improvements to the original format which was returned by a Bitcoin node via json-rpc method getmerkleproof.
Improvements include:
BRC-10 a TSC creation which was subsequently returned by the node's json-rpc method getmerkleproof2
BRC-11 removing the need for specifying targets, replacing with height to improve validation speed.
BRC-58 removal of all extraneous data to minimize data size.
BRC-61 introduction of a compound path encoding which allows representation of multiple paths within the same block.
The purpose of defining this new specification is to capture the incremental improvements under one spec which encapsulates the pros of each, and removes the cons. The new spec has the following properties:
Inclusion of height makes lookup extremely fast while adding at most 9 bytes to the data size.
Multiple paths can be expressed in the same data model.
One format for everything, so that there is no need to convert from single to compound path.
Size optimization allowing us to skip encoding of far right leaves when duplication of working hash would suffice.
Binary Encoding
Global
The top level encoding specifies a block height and a tree height.
| Field | Description | Size |
| --- | --- | --- |
| block height | VarInt block height in which the transactions are encapsulated | 1-9 bytes |
| tree height | The height of the Merkle Tree in this block, max 64 | 1 byte |
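As a minimal sketch of reading these two fields, the snippet below assumes that "VarInt" means the standard Bitcoin variable-length integer (a single byte below 0xfd, otherwise a 0xfd, 0xfe or 0xff prefix followed by a 2, 4 or 8 byte little-endian value), which matches the 1-9 byte size given above. The function name `readVarInt` and the sample header bytes are illustrative only.

```javascript
// Read a Bitcoin-style VarInt from `buf` starting at `pos`.
// Returns the decoded value and the position of the next unread byte.
function readVarInt(buf, pos) {
  const first = buf.readUInt8(pos)
  if (first < 0xfd) return { value: first, next: pos + 1 }
  if (first === 0xfd) return { value: buf.readUInt16LE(pos + 1), next: pos + 3 }
  if (first === 0xfe) return { value: buf.readUInt32LE(pos + 1), next: pos + 5 }
  // 0xff prefix: 8-byte value, converted to Number (only exact below 2^53).
  return { value: Number(buf.readBigUInt64LE(pos + 1)), next: pos + 9 }
}

// Illustrative header: block height 813706 (0x0c6a8a) then tree height 12.
const header = Buffer.from('fe8a6a0c000c', 'hex')
const blockHeight = readVarInt(header, 0)
const treeHeight = header.readUInt8(blockHeight.next)
console.log(blockHeight.value, treeHeight) // 813706 12
```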
Level
Thereafter the number of leaves at the current height is specified, starting with level 0, and the leaves for this height follow.
| Field | Description | Size |
| --- | --- | --- |
| nLeaves | VarInt number of leaves at this height | 1-9 bytes |
| leaves | Each leaf encoded in the format below. | sum of leaf sizes |
Leaf
Once all leaves at this height have been specified, an implied increment of the height in the tree occurs and we specify the number of leaves in the next level up, and so on until we have specified the leaves at level (treeHeight - 1) at which point we stop. We do not need to encode the root hash as it is always calculable.
| Field | Description | Size |
| --- | --- | --- |
| offset | VarInt offset from left hand side within tree | 1-9 bytes |
| flags | Flags can be 00, or 01, or 02 - detailed meaning in table below | 1 byte |
| hash | A hash representing a txid, sibling hash, or a branch | 0 or 32 bytes |
Flags
The first flag is to indicate whether or not to duplicate the working hash or use the following data. The second flag indicates whether the hash is a relevant txid or just a sibling hash.
| Bits | Byte | Meaning |
| --- | --- | --- |
| 0000 0000 | 00 | data follows, not a client txid |
| 0000 0001 | 01 | nothing follows, duplicate working hash |
| 0000 0010 | 02 | data follows, and is a client txid |
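Putting the global, level, leaf and flags layouts together, a parser could look roughly like the sketch below. It is not the reference implementation. It assumes Bitcoin-style VarInts, reverses each 32-byte hash before hex-encoding it (as the JSON encoding describes), and uses a hypothetical function name `parseBump`. The sample input is a made-up one-level BUMP with a placeholder hash.

```javascript
function parseBump(hex) {
  const buf = Buffer.from(hex, 'hex')
  let pos = 0
  // Assumed Bitcoin-style VarInt: 1 byte, or a prefix plus 2, 4 or 8 bytes.
  const varInt = () => {
    const first = buf.readUInt8(pos++)
    if (first < 0xfd) return first
    const size = first === 0xfd ? 2 : first === 0xfe ? 4 : 8
    const value = size === 8
      ? Number(buf.readBigUInt64LE(pos))
      : buf.readUIntLE(pos, size)
    pos += size
    return value
  }

  const blockHeight = varInt()
  const treeHeight = buf.readUInt8(pos++)
  const path = []
  // One group of leaves per level, from level 0 up to (treeHeight - 1).
  for (let height = 0; height < treeHeight; height++) {
    const nLeaves = varInt()
    const leaves = []
    for (let i = 0; i < nLeaves; i++) {
      const leaf = { offset: varInt() }
      const flags = buf.readUInt8(pos++)
      if (flags & 1) {
        leaf.duplicate = true // 01: nothing follows
      } else {
        if (flags & 2) leaf.txid = true // 02: client txid
        // Copy the 32 bytes, then reverse into display byte order.
        leaf.hash = Buffer.from(buf.subarray(pos, pos + 32)).reverse().toString('hex')
        pos += 32
      }
      leaves.push(leaf)
    }
    path.push(leaves)
  }
  return { blockHeight, path }
}

// Made-up input: block height 1, tree height 1, two leaves at level 0.
const sampleHex = '01' + '01' + '02' +
  '00' + '02' + '01' + '00'.repeat(31) + // offset 0, client txid, 32-byte hash
  '01' + '01'                            // offset 1, duplicate working hash
const parsed = parseBump(sampleHex)
console.log(JSON.stringify(parsed, null, 2))
```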
Hex String
Bytewise Breakdown
JSON Encoding
The JSON encoding starts with the height of the block in which the BUMP's transactions were mined. The index into the path array corresponds to the height within the Merkle tree, so we start with level 0, which includes all of the txids of interest to a client in this block and the txids of any additional transactions required to begin the Merkle root computation. Each element of the path array is itself an array of one or more leaves.
Within the leaf itself we have an offset, the only required parameter, along with optional hash, txid and duplicate. The hash is a hex string encoding the reversed bytes of the hash at this position in the Merkle tree. The duplicate boolean, when true, represents "no data" for this position; it is used on the right hand side of the Merkle tree, where a parser is expected to duplicate the working hash, so no further data is required. The txid boolean is included only when true, and indicates that the hash in question is a relevant txid for the receiving party rather than just a sibling hash needed to calculate the root.
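To illustrate the shape described above, here is a sketch of a BUMP object for a txid at offset 4 in a block of five or six transactions. The hashes are placeholders rather than real data, and the `isWellFormedLeaf` check is a hypothetical helper that encodes one reading of the rules: an offset is always present, and a leaf carries either a 64-character hex hash or `duplicate: true`, but not both.

```javascript
// Placeholder 32-byte hashes, not real transaction data.
const placeholder = (byte) => byte.repeat(32)

const bump = {
  blockHeight: 813706,
  path: [
    // Level 0: the txid of interest and its sibling.
    [
      { offset: 4, hash: placeholder('aa'), txid: true },
      { offset: 5, hash: placeholder('bb') }
    ],
    // Level 1: the working hash sits at offset 2; there is no node at
    // offset 3, so the working hash is duplicated.
    [{ offset: 3, duplicate: true }],
    // Level 2: sibling branch covering the first four transactions.
    [{ offset: 0, hash: placeholder('cc') }]
  ]
}

// Hypothetical structural check for a single leaf.
function isWellFormedLeaf(leaf) {
  if (!Number.isInteger(leaf.offset) || leaf.offset < 0) return false
  if (leaf.duplicate === true) return leaf.hash === undefined
  return typeof leaf.hash === 'string' && /^[0-9a-f]{64}$/i.test(leaf.hash)
}

console.log(bump.path.flat().every(isWellFormedLeaf)) // true
```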
JSON Example
Calculating the Merkle Root from a BUMP
Let's start by dumping this format as hex into a Buffer in JavaScript and parsing it into an object with a Buffer Reader. Then we can calculate the merkle root from any of the included txids.
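The root calculation itself can be sketched as follows, starting from an already parsed object in the JSON shape rather than from hex. This is not the reference implementation. The function name `merkleRootFromBump` is hypothetical, hashing uses the standard Bitcoin double SHA-256 over the left and right child concatenated in internal byte order, and the three-transaction block in the demo is built from placeholder txids.

```javascript
const { createHash } = require('node:crypto')

const sha256 = (b) => createHash('sha256').update(b).digest()
const hash256 = (b) => sha256(sha256(b)) // Bitcoin double SHA-256
// Hex strings hold reversed bytes, so flip them when converting.
const fromHex = (hex) => Buffer.from(hex, 'hex').reverse()
const toHex = (buf) => Buffer.from(buf).reverse().toString('hex')

function merkleRootFromBump(bump, txid) {
  const start = bump.path[0].find((leaf) => leaf.hash === txid)
  if (!start) throw new Error('txid not found at level 0')
  let working = fromHex(txid)
  bump.path.forEach((leaves, height) => {
    // Position of the working hash at this height, and its sibling.
    const position = Math.floor(start.offset / 2 ** height)
    const isLeft = position % 2 === 0
    const siblingOffset = isLeft ? position + 1 : position - 1
    const sibling = leaves.find((leaf) => leaf.offset === siblingOffset)
    if (!sibling) {
      throw new Error(`missing leaf at height ${height}, offset ${siblingOffset}`)
    }
    // A duplicate leaf means the working hash is paired with itself.
    const other = sibling.duplicate ? working : fromHex(sibling.hash)
    working = isLeft
      ? hash256(Buffer.concat([working, other]))
      : hash256(Buffer.concat([other, working]))
  })
  return toHex(working)
}

// Demo: a made-up block of three transactions with placeholder txids.
const [t0, t1, t2] = ['a', 'b', 'c'].map((s) => toHex(hash256(Buffer.from(s))))
const pair = (l, r) => toHex(hash256(Buffer.concat([fromHex(l), fromHex(r)])))
const leftBranch = pair(t0, t1)
const expectedRoot = pair(leftBranch, pair(t2, t2))

const bump = {
  blockHeight: 1,
  path: [
    [{ offset: 2, hash: t2, txid: true }, { offset: 3, duplicate: true }],
    [{ offset: 0, hash: leftBranch }]
  ]
}
console.log(merkleRootFromBump(bump, t2) === expectedRoot) // true
```

A validating client would then compare the returned root with the Merkle root in the block header at blockHeight.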
Merging
When combining multiple BUMPs, the first check should always be the blockHeight: ensure it matches. The second check is the root. Each BUMP calculates its root, and if the roots don't match, the BUMPs cannot be combined. If they match, merging is a simple inclusion of all leaves, dropping duplicates.
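A minimal sketch of this merge over the JSON shape is shown below. The function `mergeBumps` and its signature are hypothetical: it takes each BUMP's root as already calculated, and it adds two assumptions beyond the text, namely that both paths must have the same number of levels and that a leaf keeps its txid flag if either copy has it. The hashes and roots in the demo are placeholders.

```javascript
function mergeBumps(a, rootA, b, rootB) {
  if (a.blockHeight !== b.blockHeight) throw new Error('blockHeight mismatch')
  if (rootA !== rootB) throw new Error('merkle root mismatch')
  // Assumption: two BUMPs for the same block share the same tree height.
  if (a.path.length !== b.path.length) throw new Error('tree height mismatch')

  const path = a.path.map((leavesA, height) => {
    const byOffset = new Map()
    for (const leaf of [...leavesA, ...b.path[height]]) {
      const seen = byOffset.get(leaf.offset)
      if (!seen) {
        byOffset.set(leaf.offset, { ...leaf }) // first copy of this offset
      } else if (leaf.txid) {
        seen.txid = true // duplicate offset: keep the txid flag if either has it
      }
    }
    return [...byOffset.values()].sort((x, y) => x.offset - y.offset)
  })
  return { blockHeight: a.blockHeight, path }
}

// Demo with placeholder hashes: two proofs for neighbouring transactions.
const h = (byte) => byte.repeat(32)
const bumpA = {
  blockHeight: 100,
  path: [
    [{ offset: 0, hash: h('aa'), txid: true }, { offset: 1, hash: h('bb') }],
    [{ offset: 1, hash: h('cc') }]
  ]
}
const bumpB = {
  blockHeight: 100,
  path: [
    [{ offset: 0, hash: h('aa') }, { offset: 1, hash: h('bb'), txid: true }],
    [{ offset: 1, hash: h('cc') }]
  ]
}
const merged = mergeBumps(bumpA, h('ee'), bumpB, h('ee'))
console.log(JSON.stringify(merged, null, 2))
```

In this demo the merged level 0 holds two leaves, both flagged as txids, and level 1 holds the single shared branch.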
Implementations

