Bitcoin Script Assembly Language

Ty Everett (ty@projectbabbage.com)

Abstract

Bitcoin script is a programming language used in Bitcoin transactions to control the spending of coins. While the Bitcoin script opcodes are machine-readable, they are not easily human-readable. We define an assembly language that provides a human-readable format for expressing Bitcoin script opcodes, making it easier for developers to understand and write Bitcoin scripts. This specification defines the rules for expressing Bitcoin script opcodes in assembly format, including indentation, comments, expressing text strings and numbers, templating and other guidelines.

Motivation

Bitcoin script is a powerful tool for controlling the spending of coins, but its opcodes can be difficult for developers to read and understand. By providing a human-readable assembly format for expressing Bitcoin script opcodes, developers can more easily write and debug Bitcoin scripts. This specification aims to standardize the assembly format for Bitcoin script opcodes, making it easier for developers to work with Bitcoin transactions.

This specification is intended to start the discussion and propose an initial format for representing Bitcoin scripts in an assembly language. Iterative improvement through future BRCs will help improve this standard.

Specification

We specify an assembly language for Bitcoin script programs as follows:

Syntax

  • Opcodes should be expressed in uppercase and without OP_ prefixing, e.g. CHECKSIG.

  • Data to push on the stack is represented by strings, hex values, or template variables.

  • Whitespace should be used for indentation and to separate tokens.

  • Comments should start with the # character and extend to the end of the line.

  • .unlock and .lock can be used to denote the boundary between an unlocking script and its corresponding lock.

For example:

.unlock
  <sig> # The signature used to unlock the script
  <key> # The public key that unlocks the script

.lock
  DUP HASH160 # Duplicate the key and take its hash
  1a98d1ea5702a518b8c4ad9bb736bf34fa9e7291 EQUALVERIFY # Check the hashes are equal
  CHECKSIG # Check that the signature from this key is valid

Data types

  • Numeric values should be expressed as their corresponding opcodes, or in hexadecimal format.

  • String values should be enclosed in single quotes.

  • Hex values should be expressed in lowercase, without 0x prefixing.

  • Template variables are placed in angle brackets like <hash>

For example:

OVER 3 SPLIT NIP TRUE SPLIT SWAP SPLIT DROP HASH160 <hash> EQUALVERIFY CHECKSIG

Flow control

Conditional statements should use the IF opcode, followed by the conditional expression and the ELSE or ENDIF opcodes.

For example:

2 3 ADD 5 EQUAL IF # if 2 + 3 = 5
  'yes' RETURN     # Return 'yes'
ELSE               # Otherwise
  'no' RETURN      # Return 'no'
ENDIF              # End

File Extension

We specify that .basm files can be used to represent Bitcoin assembly language programs.

Implementation

The process of assembling programs written in this assembly language comprises:

  • Obtaining the values for template variables, either programmatically or by seeking user input

  • Substituting template variable placeholders for the actual values

  • Removing comments

  • Computing and adding the correct PUSHDATA opcodes for adding string and hexadecimal values to the stack

  • Converting all string values to hexadecimal

  • Substituting all opcode names for their hexadecimal coded values

  • Removing .unlock and .lock annotations if present

  • Removing all whitespace to arrive at the fully-assembled program

Last updated