Darren Kellenschwiler (deggen@kschw.com), Deggen
Tone Engel (tone@kizmet.org), TonesNotes
Ty Everett (ty@projectbabbage.com)
Damian Orzepowski (damian.orzepowski@4chain.studio)
Arkadiusz Osowski (arkadiusz.osowski@4chain.studio)
Abstract
We propose the BSV Unified Merkle Path format in both binary and JSON encoding optimized for generation by transaction processors, and also happens to be convenient for proof validating clients.
At a high level the format encodes a number of txids which all exist within one particular block, along with each of their merkle paths and the blockHeight.
The blockHeight is encoded first, followed by level 0 of the Merkle tree, which includes the txids of interest, and their corresponding siblings. Thereafter we encode each level of the tree thereafter, but only include branches of the tree which are required to calculate the Merkle root the txids which are of interest to us. For example if we only have one txid of interest, we will include it and its sibling, followed by one leaf per level of the tree.
Copyright
This BRC is licensed under the Open BSV license.
Visualization
BUMP Showcase can help form an understanding of how BUMP works to encode all necessary data.
Motivation
Several formats have made their own improvements to the original format which was returned by a Bitcoin node via json-rpc method getmerkleproof.
Improvements include:
BRC-10 a TSC creation which was subsequently returned by the node's json-rpc method getmerkleproof2
BRC-11 removing the need for specifying targets, replacing with height to improve validation speed.
BRC-58 removal of all extraneous data to minimize data size.
BRC-61 introduction of a compound path encoding which allows representation of multiple paths within the same block.
The purpose of defining this new specification is to capture the incremental improvements under one spec which encapsulates the pros of each, and removes the cons. This new spec should allow:
Inclusion of height makes lookup extremely fast while only adding maximum 9 bytes to the data size.
Multiple paths can be expressed in the same data model.
One format for everything, so that there is no need to convert from single to compound path.
Size optimization allowing us to skip encoding of far right leaves when duplication of working hash would suffice.
Binary Encoding
Global
The top level encoding specifies a block height and a tree height.
Field
Description
Size
block height
VarInt block height in which the transactions are encapsulated
1-9 bytes
tree height
The height of the Merkle Tree in this block, max 64
1 byte
Level
Thereafter the number of leaves at the top height is specified, and the leaves for this height follow.
Field
Description
Size
nLeaves
VarInt number of leaves at this height
1-9 bytes
leaves
Each leaf encoded in the format below.
sum of leaf sizes
Leaf
Once all leaves at this height have been specified, an implied increment of the height in the tree occurs and we specify the number of leaves in the next level up, and so on until we have specified the leaves at level (treeHeight - 1) at which point we stop. We do not need to encode the root hash as it is always calculable.
Field
Description
Size
offset
VarInt offset from left hand side within tree
1-9 bytes
flags
Flags can be 00, or 01, or 02 - detailed meaning in table below
1 byte
hash
A hash representing a txid, sibling hash, or a branch
0 or 32 bytes
Flags
The first flag is to indicate whether or not to duplicate the working hash or use the following data. The second flag indicates whether the hash is a relevant txid or just a sibling hash.
fe8a6a0c00 // blockHeight (813706), VarInt0c // treeHeight (12), byte// Level 0, client TXIDs and sibling TXIDs (TXIDs required only to compute internal tree hash).04// nLeaves, VarIntfde80b // offset, VarInt00// flags11774f01d26412f0d16ea3f0447be0b5ebec67b0782e321a7a01cbdf7f734e30 // hashfde90b // offset VarInt02// flags = CLIENT_TXID004e53753e3fe4667073063a17987292cfdea278824e9888e52180581d7188d8 // hashfdea0b // offset VarInt02// flags = CLIENT_TXID5e441996fc53f0191d649e68a200e752fb5f39e0d5617083408fa179ddc5c998 // hashfdeb0b // offset VarInt01// flags = DUPLICATE_WORKING_HASH// Level 1, internal merkle tree hashes02// nLeaves, VarIntfdf405 // offset, VarInt00// flags0671394f72237d08a4277f4435e5b6edf7adc272f25effef27cdfe805ce71a81 // hashfdf505 // offset VarInt00// flags262bccabec6c4af3ed00cc7a7414edea9c5efa92fb8623dd6160a001450a5282 // hash// Level 2, internal merkle tree hashes01// nLeaves VarInt at level 2fdfb02 // offset VarInt01// flags = DUPLICATE_WORKING_HASH// Level 3, internal merkle tree hashes01// nLeaves VarInt at level 3fd7c01 // offset VarInt (three hundred and eighty)00// flags93b3efca9b77ddec914f8effac691ecb54e2c81d0ab81cbc4c4b93befe418e85 // hash// Level 4, internal merkle tree hashes01// nLeaves VarInt at level 4bf // offset VarInt01// flags = DUPLICATE_WORKING_HASH// Level 5, internal merkle tree hashes01// nLeaves VarInt at level 55e // offset VarInt00// flags5881826eb6973c54003a02118fe270f03d46d02681c8bc71cd44c613e86302f8 // hash// Level 6, internal merkle tree hashes01// nLeaves VarInt at level 62e // offset VarInt00// flagse07a2bb8bb75e5accff266022e1e5e6e7b4d6d943a04faadcf2ab4a22f796ff3 // hash// Level 7, internal merkle tree hashes01// nLeaves VarInt at level 716// offset VarInt00// flags8120cafa17309c0bb0e0ffce835286b3a2dcae48e4497ae2d2b7ced4f051507d // hash// Level 8, internal merkle tree hashes01// nLeaves VarInt at level 80a // offset VarInt00// flags502e59ac92f46543c23006bff855d96f5e648043f0fb87a7a5949e6a9bebae43 // hash// Level 9, internal merkle tree hashes01// nLeaves VarInt at level 904// offset VarInt00// flags1ccd9f8f64f4d0489b30cc815351cf425e0e78ad79a589350e4341ac165dbe45 // hash// Level 10, internal merkle tree hashes01// nLeaves VarInt at level 1003// offset VarInt01// flags = DUPLICATE_WORKING_HASH// Level 11, internal merkle tree hashes01// nLeaves VarInt at level 1100// offset VarInt00// flagsaf8764ce7e1cc132ab5ed2229a005c87201c9a5ee15c0f91dd53eff31ab30cd4 // hash
JSON Encoding
In the JSON encoding - we start with a height of a block in which transactions from BUMP are mined. A path Array index corresponds to the height within the Merkle tree, so we start with level 0 which includes all of the txid's of interest to a client in this block and the txid's of additional transactions required to begin the merkle root computation. Within each array element we contain an array of one or more leaves which are specified as a leaf.
Within the leaf itself we have am offset - the only required parameter, along with optional hash, txid and duplicate. The hash is a hex string encoding reversed bytes of the hash at this position in the Merkle tree, the duplicate true is a boolean and represents a "no data" for this position, this is to encode for the right hand side of the merkle tree. The expected behavior is for a parser to duplicate the working hash in this case, therefore no further data is required. A txid boolean is included if true - to indicate whether the hash in question is considered a relevant txid to the receiving party, rather than just a sibling hash needed to calculate the root.
Let's start by dumping this format as hex into a Buffer in JavaScript and parsing it into an object with a Buffer Reader. Then we can calculate the merkle root from any of the included txids.
const { createHash } =require('crypto')const { Br,Bw } =require('bsv')// Displaying hashes as hex strings in reverse byte order is a matter of convention with respect to txids. // The functions below handle the conversions such that when we "hash()" something, we are running sha256 - // digesting the reverse bytes of a hex string, and returning the reverse bytes encoded as a hex string.consthexRevToBuf= (str) =>Buffer.from(str,'hex').reverse()constbufRevToHex= (buf) =>Buffer.from(buf.toString('hex'),'hex').reverse().toString('hex')const hash = (str) => bufRevToHex(createHash('sha256').update(createHash('sha256').update(hexRevToBuf(str)).digest()).digest())
functionbumpHexToJSON(str) {constreader=newBr()reader.buf =Buffer.from(str,'hex')let blockHeight =reader.readVarIntNum()let treeHeight =parseInt(reader.read(1).toString('hex'),16)let path =Array(treeHeight).fill(0).map(() => ([]))let flags, offset, nLeavesAtThisHeightfor (let level =0; level < treeHeight; level++) { nLeavesAtThisHeight =reader.readVarIntNum()while (nLeavesAtThisHeight) { offset =reader.readVarIntNum() flags =parseInt(reader.read(1).toString('hex'),16)constleaf= { offset }if (flags &1) {leaf.duplicate =true } else {if (flags &2) leaf.txid =trueleaf.hash =reader.read(32).reverse().toString('hex') } path[level].push(leaf) nLeavesAtThisHeight-- } path[level].sort((a, b) =>a.offset -b.offset) }return { blockHeight, path }}functionbumpJSONtoHex({ blockHeight, path }) {constbw=newBw()bw.writeVarIntNum(blockHeight)let treeHeight =path.lengthbw.writeUInt8(treeHeight)for (let level =0; level < treeHeight; level++) {let nLeaves =Object.keys(path[level]).lengthbw.writeVarIntNum(nLeaves)for (constleafof path[level]) {bw.writeVarIntNum(leaf.offset)let flags =0if (!!leaf?.duplicate) flags |=1if (!!leaf?.txid) flags |=2bw.writeUInt8(flags)if ((flags &1) ===0)bw.write(hexRevToBuf(leaf.hash)) } }returnbw.toBuffer().toString('hex')}functioncalculateMerkleRootFromBUMP(bump, txid) {// Find the index of the txid at the lowest level of the Merkle treeconstindex=bump.path[0].find(l =>l.hash === txid).offsetif (!index) throwError(`The BUMP does not contain the txid: ${txid}`)// Calculate the root using the index as a way to determine which direction to concatenate.let workingHash = txidbump.path.map((leaves, height) => {constoffset= index >> height ^1constleaf=leaves.find(l =>l.offset === offset)if (!leaf) thrownewError(`We do not have a hash for this index at height: ${height}`)if (leaf.duplicate) { workingHash =hash(workingHash + workingHash) } elseif (offset %2) { workingHash =hash(leaf.hash + workingHash) } else { workingHash =hash(workingHash +leaf.hash) } })return workingHash}const bump = bumpHexToJSON('fe8a6a0c000c04fde80b0011774f01d26412f0d16ea3f0447be0b5ebec67b0782e321a7a01cbdf7f734e30fde90b02004e53753e3fe4667073063a17987292cfdea278824e9888e52180581d7188d8fdea0b025e441996fc53f0191d649e68a200e752fb5f39e0d5617083408fa179ddc5c998fdeb0b0102fdf405000671394f72237d08a4277f4435e5b6edf7adc272f25effef27cdfe805ce71a81fdf50500262bccabec6c4af3ed00cc7a7414edea9c5efa92fb8623dd6160a001450a528201fdfb020101fd7c010093b3efca9b77ddec914f8effac691ecb54e2c81d0ab81cbc4c4b93befe418e8501bf01015e005881826eb6973c54003a02118fe270f03d46d02681c8bc71cd44c613e86302f8012e00e07a2bb8bb75e5accff266022e1e5e6e7b4d6d943a04faadcf2ab4a22f796ff30116008120cafa17309c0bb0e0ffce835286b3a2dcae48e4497ae2d2b7ced4f051507d010a00502e59ac92f46543c23006bff855d96f5e648043f0fb87a7a5949e6a9bebae430104001ccd9f8f64f4d0489b30cc815351cf425e0e78ad79a589350e4341ac165dbe45010301010000af8764ce7e1cc132ab5ed2229a005c87201c9a5ee15c0f91dd53eff31ab30cd4')
calculateMerkleRootFromBUMP(bump, '304e737fdfcb017a1a322e78b067ecebb5e07b44f0a36ed1f01264d2014f7711') // '57aab6e6fb1b697174ffb64e062c4728f2ffd33ddcfa02a43b64d8cd29b483b4'
calculateMerkleRootFromBUMP(bump, 'd888711d588021e588984e8278a2decf927298173a06737066e43f3e75534e00') // '57aab6e6fb1b697174ffb64e062c4728f2ffd33ddcfa02a43b64d8cd29b483b4'
calculateMerkleRootFromBUMP(bump, '98c9c5dd79a18f40837061d5e0395ffb52e700a2689e641d19f053fc9619445e') // '57aab6e6fb1b697174ffb64e062c4728f2ffd33ddcfa02a43b64d8cd29b483b4'
bumpJSONtoHex(bump)
Merging
A note on compounding multiple BUMPs together. The first check should always be the blockHeight - ensure it matches. The second check is the root. Each BUMP calculates its root, and if they don't match - you cannot combine them. If they match then the process is a simple inclusion of all leaves, dropping duplicates.
functioncombinePaths(one, two) {if (one.blockHeight !==two.blockHeight) throwError('You cannot combine paths which do not have the same blockHeight.')consttxid1=one.path[0].find(leaf =>!!leaf?.hash).hashconstroot1=calculateMerkleRootFromBUMP(one, txid1)consttxid2=two.path[0].find(leaf =>!!leaf?.hash).hashconstroot2=calculateMerkleRootFromBUMP(two, txid2)if (root1 !== root2) throwError('You cannot combine paths which do not have the same root.')constcombinedPath= []for (let h =0; h <one.path.length; h++) {combinedPath.push([])for (let l =0; l <one.path[h].length; l++) { combinedPath[h].push(one.path[h][l]) }for (let l =0; l <two.path[h].length; l++) {if (!combinedPath[h].find(leaf =>leaf.offset ===two.path[h][l].offset)) { combinedPath[h].push(two.path[h][l]) } else {// Ensure that any elements which appear in both are not downgraded to a non txid.if (!!two.path[h][l]?.txid) combinedPath[h].find(leaf =>leaf.offset ===two.path[h][l]).txid =true } } }return { blockHeight:one.blockHeight, path: combinedPath }}