Bitcoin Script Assembly Language
Ty Everett (ty@projectbabbage.com)
Abstract
Bitcoin script is a programming language used in Bitcoin transactions to control the spending of coins. While the Bitcoin script opcodes are machine-readable, they are not easily human-readable. We define an assembly language that provides a human-readable format for expressing Bitcoin script opcodes, making it easier for developers to understand and write Bitcoin scripts. This specification defines the rules for expressing Bitcoin script opcodes in assembly format, including indentation, comments, expressing text strings and numbers, templating and other guidelines.
Motivation
Bitcoin script is a powerful tool for controlling the spending of coins, but its opcodes can be difficult for developers to read and understand. By providing a human-readable assembly format for expressing Bitcoin script opcodes, developers can more easily write and debug Bitcoin scripts. This specification aims to standardize the assembly format for Bitcoin script opcodes, making it easier for developers to work with Bitcoin transactions.
This specification is intended to start the discussion by proposing an initial format for representing Bitcoin scripts in an assembly language. Future BRCs can refine this standard iteratively.
Specification
We specify an assembly language for Bitcoin script programs as follows:
Syntax
Opcodes should be expressed in uppercase and without OP_ prefixing, e.g. CHECKSIG.
Data to push on the stack is represented by strings, hex values, or template variables.
Whitespace should be used for indentation and to separate tokens.
Comments should start with the # character and extend to the end of the line.
.unlock and .lock can be used to denote the boundary between an unlocking script and its corresponding lock.
For example:
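The syntax rules above can be illustrated with a small tokenizer. This is a minimal Python sketch, not part of the specification: the sample script, the names tokenize and split_sections, and the choice to treat an unannotated script as a locking script are all assumptions made for illustration.

```python
import re

# Hypothetical sample script written to follow the syntax rules above.
SAMPLE = """\
.unlock
<signature>          # template variable, filled in at assembly time
<publicKey>
.lock
DUP HASH160          # opcodes are uppercase, without the OP_ prefix
<publicKeyHash>
EQUALVERIFY
CHECKSIG
"""

# A single-quoted string, a comment running to end of line,
# or a run of characters that are neither whitespace nor '#'.
TOKEN = re.compile(r"'[^']*'|#[^\n]*|[^\s#]+")


def tokenize(source):
    """Return the tokens of a script with comments dropped."""
    return [t for t in TOKEN.findall(source) if not t.startswith("#")]


def split_sections(tokens):
    """Group tokens under the most recent .unlock or .lock annotation."""
    sections = {"unlock": [], "lock": []}
    current = "lock"  # assumption: no annotation means a locking script
    for token in tokens:
        if token in (".unlock", ".lock"):
            current = token[1:]
        else:
            sections[current].append(token)
    return sections
```

Matching quoted strings before comments means a # inside a string is kept as data rather than starting a comment.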
Data types
Numeric values should be expressed as their corresponding opcodes, or in hexadecimal format.
String values should be enclosed in single quotes.
Hex values should be expressed in lowercase, without 0x prefixing.
Template variables are placed in angle brackets like <hash>.
For example:
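One way to tell the data types apart is by the shape of each token. The Python sketch below is illustrative only: the function names classify and data_to_hex are hypothetical, the opcode pattern is a guess at what an uppercase opcode name looks like, and UTF-8 is an assumed string encoding, since the specification does not name one.

```python
import re

HEX = re.compile(r"(?:[0-9a-f]{2})+")    # lowercase, whole bytes, no 0x prefix
TEMPLATE = re.compile(r"<[^<>\s]+>")      # e.g. <hash>
OPCODE = re.compile(r"[A-Z][A-Z0-9_]*")   # e.g. CHECKSIG (assumed pattern)


def classify(token):
    """Return 'string', 'template', 'hex' or 'opcode' for a single token."""
    if len(token) >= 2 and token[0] == token[-1] == "'":
        return "string"
    if TEMPLATE.fullmatch(token):
        return "template"
    if HEX.fullmatch(token):
        return "hex"
    if OPCODE.fullmatch(token):
        return "opcode"
    raise ValueError(f"unrecognised token: {token!r}")


def data_to_hex(token):
    """Convert a string or hex token to the hex of the bytes it represents."""
    kind = classify(token)
    if kind == "string":
        # Assumption: strings are encoded as UTF-8.
        return token[1:-1].encode("utf-8").hex()
    if kind == "hex":
        return token
    raise ValueError(f"{token!r} is not literal data")
```

Under these assumptions, data_to_hex("'hello'") gives 68656c6c6f, while a token such as 0xdead is rejected because of its prefix.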
Flow control
Conditional statements should use the IF opcode, followed by the conditional expression and the ELSE or ENDIF opcodes.
For example:
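A tool that reads these programs could check that conditional blocks are properly closed before assembling them. The following Python sketch is one hypothetical way to do that. The function name check_flow is an assumption, NOTIF is included because it opens a block in Bitcoin script in the same way as IF, and the sketch checks nesting only, not script semantics.

```python
def check_flow(tokens):
    """Check IF/ELSE/ENDIF nesting and return the deepest nesting level.

    Raises ValueError if an ELSE or ENDIF appears outside any IF block,
    or if an IF block is never closed.
    """
    depth = 0
    deepest = 0
    for position, token in enumerate(tokens):
        if token in ("IF", "NOTIF"):
            depth += 1
            deepest = max(deepest, depth)
        elif token in ("ELSE", "ENDIF"):
            if depth == 0:
                raise ValueError(f"{token} at token {position} has no matching IF")
            if token == "ENDIF":
                depth -= 1
    if depth:
        raise ValueError(f"{depth} IF block(s) left open")
    return deepest


# Hypothetical conditional script, already split into tokens.
EXAMPLE = ["<condition>", "IF", "'yes'", "ELSE", "'no'", "ENDIF"]
```

Calling check_flow(EXAMPLE) returns 1, and a token list containing only IF raises ValueError.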
File Extension
We specify that .basm files can be used to represent Bitcoin assembly language programs.
Implementation
The process of assembling programs written in this assembly language comprises:
Obtaining the values for template variables, either programmatically or by seeking user input
Replacing template variable placeholders with the actual values
Removing comments
Computing and adding the correct PUSHDATA opcodes for adding string and hexadecimal values to the stack
Converting all string values to hexadecimal
Replacing all opcode names with their hexadecimal-coded values
Removing .unlock and .lock annotations if present
Removing all whitespace to arrive at the fully-assembled program
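The steps above can be sketched as a single pass over the tokens. This Python example is a rough illustration under stated assumptions, not a reference assembler: the names push and assemble are hypothetical, the opcode table is deliberately partial, template values are assumed to be supplied as hex or quoted strings, strings are assumed to be UTF-8, and the sketch does not apply minimal-push rules for small numbers.

```python
import re

# Partial opcode table for illustration; a real assembler needs the full set.
OPCODES = {
    "IF": 0x63, "ELSE": 0x67, "ENDIF": 0x68, "DROP": 0x75, "DUP": 0x76,
    "EQUAL": 0x87, "EQUALVERIFY": 0x88, "HASH160": 0xA9, "CHECKSIG": 0xAC,
}

TOKEN = re.compile(r"'[^']*'|#[^\n]*|[^\s#]+")
HEX = re.compile(r"(?:[0-9a-f]{2})+")


def push(data):
    """Prefix data with the push opcode that matches its length."""
    n = len(data)
    if n == 0:
        return b"\x00"                                # OP_0 pushes an empty value
    if n <= 75:
        prefix = bytes([n])                           # direct push of 1 to 75 bytes
    elif n <= 0xFF:
        prefix = b"\x4c" + n.to_bytes(1, "little")    # PUSHDATA1
    elif n <= 0xFFFF:
        prefix = b"\x4d" + n.to_bytes(2, "little")    # PUSHDATA2
    else:
        prefix = b"\x4e" + n.to_bytes(4, "little")    # PUSHDATA4
    return prefix + data


def assemble(source, variables):
    """Assemble source text into a hex string, filling in template variables."""
    out = bytearray()
    for token in TOKEN.findall(source):
        if token.startswith("#") or token in (".unlock", ".lock"):
            continue                                  # drop comments and annotations
        if token.startswith("<") and token.endswith(">"):
            token = variables[token[1:-1]]            # substitute the template value
        if len(token) >= 2 and token[0] == token[-1] == "'":
            out += push(token[1:-1].encode("utf-8"))  # string literal
        elif HEX.fullmatch(token):
            out += push(bytes.fromhex(token))         # hex literal
        elif token in OPCODES:
            out.append(OPCODES[token])
        else:
            raise ValueError(f"unrecognised token: {token!r}")
    return out.hex()
```

For instance, assemble("DUP HASH160 <hash> EQUALVERIFY CHECKSIG", {"hash": "00" * 20}) yields 76a914, followed by forty zeros, followed by 88ac. Working token by token folds the comment, annotation, and whitespace removal steps into the loop rather than running them as separate passes.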