# SubTree Unified Merkle Path (STUMP) Format

Dylan Murray (<dylan.murray@bsvassociation.org>) Darren Kellenschwiler (<d.kellenschwiler@bsvassociation.org>) Siggi Oskarsson (<siggi.oskarsson@bsvassociation.org>)

## Abstract

We define the SubTree Unified Merkle Path (STUMP) format for encoding merkle paths within a subtree of a block, rather than across an entire block. A STUMP adapts the binary encoding specified in [BRC-74](https://bsv.brc.dev/transactions/0074) (BSV Unified Merkle Path / BUMP) for subtree scope, replacing the block height with a single-byte tree height header and an optional trailing block hash. The level and leaf encoding is identical to BRC-74. This enables scalable merkle proof generation and delivery in architectures where blocks are partitioned into subtrees for parallel processing.

## Copyright

This BRC is licensed under the Open BSV license.

## Motivation

As Bitcoin blocks scale to contain millions of transactions, computing and transmitting full-block merkle paths becomes increasingly expensive. Modern transaction processors such as Teranode partition each block's transactions into subtrees, each containing a bounded number of transaction IDs. Subtrees are announced, stored, and processed independently, enabling horizontal scaling of merkle proof infrastructure.

The challenge is delivering merkle proofs for registered transactions as quickly as possible after a block is mined. A service that processes subtrees independently needs a format to represent partial merkle paths within a single subtree. These subtree-scoped proofs can then be delivered via callbacks and later composed with the block-level merkle path (from subtree root to block merkle root) to produce a complete [BRC-74](https://bsv.brc.dev/transactions/0074) BUMP for SPV verification.

STUMP solves this by adapting the BRC-74 level and leaf encoding for subtree scope. The block height field present in BUMP is omitted — it is not meaningful at the subtree level and would waste bytes in every encoded STUMP. Instead, the header is a single byte: the tree height. An optional 32-byte block hash may be appended after the final level. The block hash is optional because STUMPs are typically built before the block hash is known, and can be associated with a block hash later via delivery metadata.

This design means:

* The level and leaf encoding is identical to BRC-74 — existing parsers need only handle the different header.
* STUMPs are self-delimiting, so multiple STUMPs can be concatenated safely.
* STUMPs can be merged using the same algorithm defined in BRC-74 for combining BUMPs.
* The header is a single byte (tree height). The optional block hash is appended at the end, requiring no flags or version bytes.

## Specification

### Relationship to BRC-74

A STUMP shares the same level and leaf encoding as [BRC-74](https://bsv.brc.dev/transactions/0074) BUMP, but differs in scope and header:

* The **block height** field present in BUMP is omitted. Block height is not meaningful at subtree scope and is conveyed separately via delivery metadata or composition with the block-level path.
* An optional 32-byte **block hash** may be appended after the final level. Because the level encoding is self-delimiting, a parser can detect its presence by checking for 32 remaining bytes after parsing all levels. The block hash is optional because STUMPs are typically constructed during subtree processing, before the block hash is known.
* The **tree height** reflects the height of the subtree's internal merkle tree, not the full block merkle tree.
* The **leaf offsets** at level 0 are positions within the subtree (0-indexed from the first transaction in the subtree), not positions within the full block.
* The **root** computed by walking the path is the subtree merkle root, not the block merkle root.

The level and leaf encoding (VarInt offsets, flag semantics, 32-byte hashes, duplicate handling) is identical to BRC-74.

### Binary Encoding

#### Header

| Field       | Description                                          | Size   |
| ----------- | ---------------------------------------------------- | ------ |
| tree height | Height of the subtree's internal merkle tree, max 64 | 1 byte |

#### Level

Repeated `treeHeight` times (level 0 through `treeHeight - 1`):

| Field   | Description                             | Size              |
| ------- | --------------------------------------- | ----------------- |
| nLeaves | `VarInt` number of leaves at this level | 1-9 bytes         |
| leaves  | Each leaf encoded in the format below   | sum of leaf sizes |

#### Leaf

| Field  | Description                                                                 | Size          |
| ------ | --------------------------------------------------------------------------- | ------------- |
| offset | `VarInt` offset from left hand side within the subtree's tree at this level | 1-9 bytes     |
| flags  | `00`, `01`, or `02` - detailed meaning in table below                       | 1 byte        |
| hash   | A hash representing a txid, sibling hash, or internal node                  | 0 or 32 bytes |

#### Flags

| bits      | byte | meaning                                 |
| --------- | ---- | --------------------------------------- |
| 0000 0000 | 00   | data follows, not a client txid         |
| 0000 0001 | 01   | nothing follows, duplicate working hash |
| 0000 0010 | 02   | data follows, and is a client txid      |

#### Trailing Block Hash (Optional)

| Field      | Description                                       | Size     |
| ---------- | ------------------------------------------------- | -------- |
| block hash | SHA256d hash of the block containing this subtree | 32 bytes |

After parsing all levels, if exactly 32 bytes remain in the buffer, they are the block hash. If 0 bytes remain, no block hash is present. Because the level and leaf encoding is self-delimiting (the tree height determines the number of levels, each level declares its leaf count, and each leaf's flags determine whether a 32-byte hash follows), a parser always knows the exact byte position where levels end.

### Tree Height Calculation

Given `n` leaf transactions in a subtree, the tree is padded to the next power of two. The tree height is the number of levels from the leaves to the root, exclusive of the root:

```
nextPowerOfTwo = smallest power of 2 >= n
treeHeight = log2(nextPowerOfTwo)
```

For example:

* 2 leaves: treeHeight = 1
* 3-4 leaves: treeHeight = 2
* 5-8 leaves: treeHeight = 3
* 9-16 leaves: treeHeight = 4
* 10 leaves (padded to 16): treeHeight = 4

The root hash (at level `treeHeight`) is never encoded, as it is always calculable from the path data.

### Duplicate Handling

When a subtree has a non-power-of-two number of transactions, leaf positions beyond the real transaction count are padded. During merkle tree construction, a node whose right child is a padding node duplicates its left child hash. In the STUMP encoding, these positions are represented with `flags = 0x01` (duplicate), and no hash data follows. A verifier encountering this flag duplicates the current working hash in place of reading a sibling.

### Building a STUMP

Given a subtree with `n` leaf transactions and a set of registered transaction indices:

1. Pad the leaf array to the next power of two and compute the full internal merkle tree using double-SHA256: `SHA256(SHA256(left || right))`.
2. Calculate `treeHeight` as described above.
3. For each level from 0 to `treeHeight - 1`:
   * At level 0, include each registered txid with `flags = 0x02` and its 32-byte hash.
   * At every level, include the sibling of each tracked index:
     * Sibling index = `index XOR 1`.
     * If the sibling is beyond the real node count at this level, mark it `flags = 0x01` (duplicate, no hash).
     * Otherwise, include the sibling's hash with `flags = 0x00`.
     * If the sibling is also a registered txid (at level 0), mark it `flags = 0x02`.
   * Advance tracked indices to the next level: `index = index / 2`.
   * Sort all leaves at this level by offset for deterministic output.

### Verifying a STUMP

Verification follows the standard BRC-74 algorithm, computing the subtree merkle root:

1. For each registered txid at level 0, set `workingHash = txid hash` and `offset = txid offset`.
2. For each level from 0 to `treeHeight - 1`:
   * Find the sibling at `offset XOR 1`.
   * If the sibling has `flags = 0x01`, set `siblingHash = workingHash` (duplicate).
   * Otherwise, `siblingHash` is the sibling's 32-byte hash.
   * If `offset` is even: `workingHash = SHA256(SHA256(workingHash || siblingHash))`.
   * If `offset` is odd: `workingHash = SHA256(SHA256(siblingHash || workingHash))`.
   * `offset = offset / 2`.
3. The final `workingHash` should equal the known subtree merkle root.

### Merging STUMPs

Multiple STUMPs from the same subtree (e.g., built for different sets of registered transactions) can be merged using the same algorithm as BRC-74 BUMP merging:

1. Verify that all STUMPs share the same `treeHeight` (and `blockHash`, if present).
2. For each level, collect all leaves from all STUMPs.
3. Deduplicate by offset. If the same offset appears in multiple STUMPs, prefer the leaf with `TxID = true` (flags `0x02`) over one without.
4. Sort the merged leaves by offset.

Because STUMPs are self-delimiting, multiple encoded STUMPs can also be safely concatenated into a single byte stream. A receiver can parse them sequentially, reading each STUMP's header and levels in order.

### Callback Delivery

In a typical deployment, STUMPs are delivered to registered callback URLs as part of a JSON payload:

```json
{
  "txids": ["aabbcc...", "ddeeff..."],
  "status": "MINED",
  "stumpData": "<base64-encoded STUMP binary>",
  "blockHash": "000000..."
}
```

The `stumpData` field contains the STUMP binary as a base64 string. When a callback carries multiple STUMPs (from different subtrees in the same block), the binary representations are concatenated before base64 encoding.

The `blockHash` in the JSON payload provides block context regardless of whether the STUMP itself includes a block hash in its header. When the block hash is known at STUMP construction time, it may be encoded in the STUMP header as well; when it is not yet known (e.g., during subtree processing before the block is finalized), the JSON payload provides it after the fact.

## Composing a Full BUMP

A STUMP proves that a transaction exists within a subtree. To produce a complete SPV proof linking the transaction to a block header, a receiver must compose the STUMP with an additional merkle path from the subtree root to the block merkle root. This composition yields a standard BRC-74 BUMP suitable for full SPV verification as described in [BRC-67](https://bsv.brc.dev/transactions/0067).

The composition process:

1. Verify the STUMP against the known subtree merkle root.
2. Obtain the merkle path from the subtree root to the block merkle root (e.g., from the block-level merkle tree or a coinbase BEEF envelope).
3. Concatenate the STUMP path levels with the subtree-to-root path levels, adjusting offsets appropriately for the subtree's position within the block.

## Hex Examples

### Without Block Hash

A STUMP with tree height 1, containing two registered txids that are each other's sibling:

#### Hex String

```
0102000211c6900eee6e68d191cd25034a5f872ed29e3b69273906a10e021f39ed8664710102baadf498a00ca5a44d1c4d9d103b49017f53cd8cb2a70a9c67fc884ecdd622b5
```

#### Bytewise Breakdown

```
01     // treeHeight (1), byte
// Level 0
02     // nLeaves (2), VarInt
00     // offset (0), VarInt
02     // leaf flags = CLIENT_TXID
11c6900eee6e68d191cd25034a5f872ed29e3b69273906a10e021f39ed866471 // 32-byte txid
01     // offset (1), VarInt
02     // leaf flags = CLIENT_TXID
baadf498a00ca5a44d1c4d9d103b49017f53cd8cb2a70a9c67fc884ecdd622b5 // 32-byte txid
// No trailing bytes — no block hash
```

### With Block Hash

The same STUMP with a block hash appended:

#### Hex String

```
0102000211c6900eee6e68d191cd25034a5f872ed29e3b69273906a10e021f39ed8664710102baadf498a00ca5a44d1c4d9d103b49017f53cd8cb2a70a9c67fc884ecdd622b59fd16010222eea8719743ed124d41b7cf7a555137647b0ac9b12114339b2fff9
```

#### Bytewise Breakdown

```
01     // treeHeight (1), byte
// Level 0
02     // nLeaves (2), VarInt
00     // offset (0), VarInt
02     // leaf flags = CLIENT_TXID
11c6900eee6e68d191cd25034a5f872ed29e3b69273906a10e021f39ed866471 // 32-byte txid
01     // offset (1), VarInt
02     // leaf flags = CLIENT_TXID
baadf498a00ca5a44d1c4d9d103b49017f53cd8cb2a70a9c67fc884ecdd622b5 // 32-byte txid
// Trailing 32 bytes — block hash
9fd16010222eea8719743ed124d41b7cf7a555137647b0ac9b12114339b2fff9 // 32-byte block hash
```

## Implementations

Golang - [merkle-service](https://github.com/bsv-blockchain/merkle-service) (`internal/stump` package)

## References

* [BRC-74: BSV Unified Merkle Path (BUMP) Format](https://bsv.brc.dev/transactions/0074)
* [BRC-67: Simplified Payment Verification](https://bsv.brc.dev/transactions/0067)
* [BRC-61: Compound Merkle Path Format](https://bsv.brc.dev/transactions/0061)
* [BRC-62: Background Evaluation Extended Format (BEEF) Transactions](https://bsv.brc.dev/transactions/0062)
