Restore disabled script opcodes, May 2018
Introduction
In 2010 and 2011 the discovery of serious bugs prompted the deactivation of many opcodes in the Bitcoin script language. It is our intention to restore the functionality that some of these opcodes provided in Bitcoin Cash. Rather than simply re-enable the opcodes, the functionality that they provide has been re-examined and in some cases the opcodes have been re-designed or new opcodes have been added to address specific issues.
This document contains the specifications for the opcodes that are to be added in the May 2018 protocol upgrade. We anticipate that additional opcodes will be proposed for the November 2018, or later, protocol upgrades.
The opcodes that are to be added are:
Word | OpCode | Hex | Input | Output | Description |
---|---|---|---|---|---|
OP_CAT | 126 | 0x7e | x1 x2 | out | Concatenates two byte sequences |
OP_SPLIT | 127 | 0x7f | x n | x1 x2 | Split byte sequence x at position n |
OP_AND | 132 | 0x84 | x1 x2 | out | Boolean AND between each bit of the inputs |
OP_OR | 133 | 0x85 | x1 x2 | out | Boolean OR between each bit of the inputs |
OP_XOR | 134 | 0x86 | x1 x2 | out | Boolean EXCLUSIVE OR between each bit of the inputs |
OP_DIV | 150 | 0x96 | a b | out | a is divided by b |
OP_MOD | 151 | 0x97 | a b | out | return the remainder after a is divided by b |
OP_NUM2BIN | 128 | 0x80 | a b | out | convert numeric value a into byte sequence of length b |
OP_BIN2NUM | 129 | 0x81 | x | out | convert byte sequence x into a numeric value |
- Splice operations: OP_CAT, OP_SPLIT**
- Bitwise logic: OP_AND, OP_OR, OP_XOR
- Arithmetic: OP_DIV, OP_MOD
- New operations:
  - x OP_BIN2NUM -> n, convert a byte sequence x into a numeric value
  - n m OP_NUM2BIN -> out, convert a numeric value n into a byte sequence of length m
Further discussion of the purpose of these new operations can be found below under bitwise operations.
** A new operation, OP_SPLIT, has been designed as a replacement for OP_SUBSTR, OP_LEFT and OP_RIGHT. The original operations can be implemented with varying combinations of OP_SPLIT, OP_SWAP and OP_DROP.
Script data types
It should be noted that in script operation data values on the stack are interpreted as either byte sequences or numeric values. All data on the stack is interpreted as a byte sequence unless specifically stated as being interpreted as a numeric value.
For accuracy in this specification, a byte sequence is presented as {0x01, 0x02, 0x03}. This sequence is three bytes long; it begins with a byte of value 1 and ends with a byte of value 3.
The numeric value type has specific limitations:
- The used encoding is little endian with an explicit sign bit (the highest bit of the last byte).
- They cannot exceed 4 bytes in length.
- They must be encoded using the shortest possible byte length (no zero padding)
- There is one exception to the shortest-length rule above: if there is more than one byte and the most significant bit of the second-most-significant byte is set, it would conflict with the sign bit. In this case a single 0x00 or 0x80 byte is allowed as an additional most significant byte (the last byte of the little endian sequence).
- Zero is encoded as a zero length byte sequence. Single byte positive or negative zero (0x00 or 0x80) are not allowed.
The new opcode x OP_BIN2NUM -> out can be used to convert a byte sequence into a numeric value where required.
The new opcode x n OP_NUM2BIN can be used to convert a numeric value into a zero padded byte sequence of length n whilst preserving the sign bit.
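The numeric value rules above can be illustrated with a minimal Python sketch. The helper names encode_num, decode_num and is_num and the constant MAX_NUM_SIZE are hypothetical and chosen only for this illustration; they are not taken from any reference implementation.

```python
MAX_NUM_SIZE = 4  # numeric values cannot exceed 4 bytes (see the rules above)


def encode_num(value: int) -> bytes:
    """Shortest little endian encoding with an explicit sign bit."""
    if value == 0:
        return b""  # zero is the zero length byte sequence
    magnitude = abs(value)
    out = bytearray()
    while magnitude:
        out.append(magnitude & 0xFF)
        magnitude >>= 8
    if out[-1] & 0x80:
        # The top bit of the magnitude collides with the sign bit,
        # so one extra byte carries the sign (0x00 or 0x80).
        out.append(0x80 if value < 0 else 0x00)
    elif value < 0:
        out[-1] |= 0x80
    return bytes(out)


def decode_num(data: bytes) -> int:
    """Decode a little endian, sign bit encoded byte sequence of any length."""
    if not data:
        return 0
    magnitude = int.from_bytes(data, "little")
    sign_bit = 0x80 << (8 * (len(data) - 1))
    if magnitude & sign_bit:
        return -(magnitude & ~sign_bit)
    return magnitude


def is_num(data: bytes) -> bool:
    """True if data is a canonical (minimally encoded, <= 4 byte) numeric value."""
    return len(data) <= MAX_NUM_SIZE and encode_num(decode_num(data)) == data
```

For example, under these assumptions 128 needs two bytes ({0x80, 0x00}) because its top bit would otherwise be read as the sign, while {0x00} and {0x80} are rejected by is_num because zero must be the empty byte sequence.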
Definitions
- Stack memory use. This is the sum of the size of the elements on the stack. It gives an indication of impact on memory use by the interpreter.
- Operand order. In keeping with convention where multiple operands are specified the top most stack item is the last operand. e.g. x1 x2 OP_CAT --> x2 is the top stack item and x1 is the next from the top.
- Empty byte sequence. Throughout this document OP_0 is used as a convenient representation of an empty byte sequence. Whilst it is a push data op code, its effect is to push an empty byte sequence on to the stack.
Specification
Global conditions apply to all operations. These conditions must be checked by the implementation when it is possible that they will occur:
- for all e : elements on the stack, 0 <= len(e) <= MAX_SCRIPT_ELEMENT_SIZE
- for each operator, the required number of operands are present on the stack when the operator is executed
These unit tests should be included for every operation:
- executing the operation with an input element of length greater than MAX_SCRIPT_ELEMENT_SIZE will fail
- executing the operation with an insufficient number of operands on the stack causes a failure
Operand consumption:
In all cases where not explicitly stated otherwise the operand stack elements are consumed by the operation and replaced with the output.
Splice operations
OP_CAT
Opcode (decimal): 126
Opcode (hex): 0x7e
Concatenates two operands.
x1 x2 OP_CAT -> out
Examples:
{0x11} {0x22, 0x33} OP_CAT -> {0x11, 0x22, 0x33}
The operator must fail if len(out) > MAX_SCRIPT_ELEMENT_SIZE. The operation cannot output elements that violate the constraint on the element size.
Note that the concatenation of a zero length operand is valid.
Impact of successful execution:
- stack memory use is constant
- number of elements on stack is reduced by one
The limit on the length of the output prevents the memory exhaustion attack and results in the operation having less impact on stack size than existing OP_DUP operators.
Unit tests:
1. maxlen_x y OP_CAT -> failure. Concatenating any operand except an empty vector, including a single byte value (e.g. OP_1), onto a maximum sized array causes failure
2. large_x large_y OP_CAT -> failure. Concatenating two operands, where the total length is greater than MAX_SCRIPT_ELEMENT_SIZE, causes failure
3. OP_0 OP_0 OP_CAT -> OP_0. Concatenating two empty arrays results in an empty array
4. x OP_0 OP_CAT -> x. Concatenating an empty array onto any operand results in the operand, including when len(x) = MAX_SCRIPT_ELEMENT_SIZE
5. OP_0 x OP_CAT -> x. Concatenating any operand onto an empty array results in the operand, including when len(x) = MAX_SCRIPT_ELEMENT_SIZE
6. x y OP_CAT -> concat(x,y). Concatenating two operands generates the correct result
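The OP_CAT rules can be sketched in Python as follows. This is an illustration only: the stack is modelled as a list of bytes objects, ScriptError is a hypothetical exception type, and the value 520 for MAX_SCRIPT_ELEMENT_SIZE is an assumption of this sketch rather than something this document specifies.

```python
MAX_SCRIPT_ELEMENT_SIZE = 520  # assumed value of the element size limit


class ScriptError(Exception):
    """Raised when the sketched operator must fail."""


def op_cat(stack: list) -> None:
    """x1 x2 OP_CAT -> out, operating in place on a list used as the stack."""
    if len(stack) < 2:
        raise ScriptError("OP_CAT requires two operands")
    x1, x2 = stack[-2], stack[-1]  # x2 is the top stack item
    if len(x1) + len(x2) > MAX_SCRIPT_ELEMENT_SIZE:
        raise ScriptError("result would exceed MAX_SCRIPT_ELEMENT_SIZE")
    del stack[-2:]          # both operands are consumed
    stack.append(x1 + x2)   # and replaced with the output
```

Checking the combined length before touching the stack means a failing call leaves the stack unchanged, which keeps the sketch easy to test against the unit tests listed above.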
OP_SPLIT
OP_SPLIT replaces OP_SUBSTR and uses its opcode.
Opcode (decimal): 127
Opcode (hex): 0x7f
Split the operand at the given position. This operation is the exact inverse of OP_CAT.
x n OP_SPLIT -> x1 x2
where n is interpreted as a numeric value
Examples:
{0x00, 0x11, 0x22} 0 OP_SPLIT -> OP_0 {0x00, 0x11, 0x22}
{0x00, 0x11, 0x22} 1 OP_SPLIT -> {0x00} {0x11, 0x22}
{0x00, 0x11, 0x22} 2 OP_SPLIT -> {0x00, 0x11} {0x22}
{0x00, 0x11, 0x22} 3 OP_SPLIT -> {0x00, 0x11, 0x22} OP_0
Notes:
- this operator has been introduced as a replacement for the previous OP_SUBSTR, OP_LEFT and OP_RIGHT. All three operators can be simulated with varying combinations of OP_SPLIT, OP_SWAP and OP_DROP. This is in keeping with the minimalist philosophy where a single primitive can be used to simulate multiple more complex operations.
- x is split at position n, where n is the number of bytes from the beginning
- x1 will be the first n bytes of x and x2 will be the remaining bytes
- if n == 0, then x1 is the empty array and x2 == x
- if n == len(x) then x1 == x and x2 is the empty array.
- if n > len(x), then the operator must fail.
- x n OP_SPLIT OP_CAT -> x, for all x and for all 0 <= n <= len(x)
The operator must fail if:
- !isnum(n). Fail if n is not a numeric value.
- n < 0. Fail if n is negative.
- n > len(x). Fail if n is too high.
Impact of successful execution:
- stack memory use is constant (slight reduction by len(n))
- number of elements on stack is constant
Unit tests:
1. OP_0 0 OP_SPLIT -> OP_0 OP_0. Execution of OP_SPLIT on empty array results in two empty arrays.
2. x 0 OP_SPLIT -> OP_0 x
3. x len(x) OP_SPLIT -> x OP_0
4. x (len(x) + 1) OP_SPLIT -> FAIL
5. include successful unit tests
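The sketch below models OP_SPLIT on a Python list used as the stack and shows one possible way to combine it with OP_SWAP and OP_DROP to keep only the left or the right part of a byte sequence. It is a simplification: n is held on the stack as a plain Python int rather than an encoded numeric value, and ScriptError, keep_left and keep_right are hypothetical names used only here.

```python
class ScriptError(Exception):
    """Raised when the sketched operator must fail."""


def op_split(stack: list) -> None:
    """x n OP_SPLIT -> x1 x2 (n is a plain int in this simplified model)."""
    if len(stack) < 2:
        raise ScriptError("OP_SPLIT requires two operands")
    x, n = stack[-2], stack[-1]
    if n < 0 or n > len(x):
        raise ScriptError("split position out of range")
    del stack[-2:]
    stack.extend([x[:n], x[n:]])  # x2 ends up as the top stack item


def op_swap(stack: list) -> None:
    stack[-1], stack[-2] = stack[-2], stack[-1]


def op_drop(stack: list) -> None:
    stack.pop()


def keep_left(x: bytes, n: int) -> bytes:
    """x n OP_SPLIT OP_DROP: keep the first n bytes."""
    stack = [x, n]
    op_split(stack)
    op_drop(stack)
    return stack[-1]


def keep_right(x: bytes, n: int) -> bytes:
    """x n OP_SPLIT OP_SWAP OP_DROP: keep the bytes after the first n."""
    stack = [x, n]
    op_split(stack)
    op_swap(stack)
    op_drop(stack)
    return stack[-1]
```

With x = {0x00, 0x11, 0x22} and n = 1, keep_left yields {0x00} and keep_right yields {0x11, 0x22}, matching the second example above.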
Bitwise logic
The bitwise logic operators expect 'byte sequence' operands. The operands must be the same length.
- In the case of 'byte sequence' operands OP_CAT can be used to pad a shorter byte sequence to an appropriate length.
- In the case of 'byte sequence' operands where the length of operands is not known until runtime a sequence of 0x00 bytes (for use with OP_CAT) can be produced using OP_0 n OP_NUM2BIN
- In the case of numeric value operands x n OP_NUM2BIN can be used to pad a numeric value to length n whilst preserving the sign bit.
OP_AND
Opcode (decimal): 132
Opcode (hex): 0x84
Boolean and between each bit in the operands.
x1 x2 OP_AND -> out
Notes:
- where len(x1) == 0 and len(x2) == 0 the output will be an empty array.
The operator must fail if:
- len(x1) != len(x2). The two operands must be the same size.
Impact of successful execution:
- stack memory use reduced by len(x1)
- number of elements on stack is reduced by one
Unit tests:
1. x1 x2 OP_AND -> failure, where len(x1) != len(x2). The two operands must be the same size.
2. x1 x2 OP_AND -> x1 & x2. Check valid results.
OP_OR
Opcode (decimal): 133
Opcode (hex): 0x85
Boolean or between each bit in the operands.
x1 x2 OP_OR -> out
The operator must fail if:
- len(x1) != len(x2). The two operands must be the same size.
Impact of successful execution:
- stack memory use reduced by len(x1)
- number of elements on stack is reduced by one
Unit tests:
1. x1 x2 OP_OR -> failure, where len(x1) != len(x2). The two operands must be the same size.
2. x1 x2 OP_OR -> x1 | x2. Check valid results.
OP_XOR
Opcode (decimal): 134
Opcode (hex): 0x86
Boolean xor between each bit in the operands.
x1 x2 OP_XOR -> out
The operator must fail if:
- len(x1) != len(x2). The two operands must be the same size.
Impact of successful execution:
- stack memory use reduced by len(x1)
- number of elements on stack is reduced by one
Unit tests:
1. x1 x2 OP_XOR -> failure, where len(x1) != len(x2). The two operands must be the same size.
2. x1 x2 OP_XOR -> x1 xor x2. Check valid results.
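The three bitwise operators share the same shape, which a short Python sketch can make concrete. The function names below are hypothetical, and a ValueError stands in for script failure.

```python
import operator


def _bitwise(op, x1: bytes, x2: bytes) -> bytes:
    """Apply op to each pair of bytes; operands must be the same length."""
    if len(x1) != len(x2):
        raise ValueError("operands must be the same size")
    return bytes(op(a, b) for a, b in zip(x1, x2))


def op_and(x1: bytes, x2: bytes) -> bytes:
    return _bitwise(operator.and_, x1, x2)


def op_or(x1: bytes, x2: bytes) -> bytes:
    return _bitwise(operator.or_, x1, x2)


def op_xor(x1: bytes, x2: bytes) -> bytes:
    return _bitwise(operator.xor, x1, x2)
```

Two empty operands produce an empty result, and operands are treated purely as byte sequences, so no numeric interpretation or sign handling is involved.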
Arithmetic
Note about canonical form and floor division
Operands for all arithmetic operations are assumed to be numeric values and must be in canonical form. See data types for more information.
Floor division
Note that when considering integer division and modulo operations with negative operands, the rules applied in the C language and most languages (with Python being a notable exception) differ from the strict mathematical definition. Script follows the C language set of rules. Namely:
- Non-integer quotients are rounded towards zero
- The equation (a/b)*b + a%b == a is satisfied by the results
- From the above equation it follows that: a%b == a - (a/b)*b
- In practice if a is negative for the modulo operator the result will be negative or zero.
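Because Python's own // and % operators round towards negative infinity, they cannot be used directly to model these rules. The following sketch, with the hypothetical names script_div and script_mod, shows one way to obtain the round-towards-zero behaviour described above on plain Python integers.

```python
def script_div(a: int, b: int) -> int:
    """Integer quotient rounded towards zero, as in the C language."""
    if b == 0:
        raise ZeroDivisionError("division by zero must fail")
    quotient = abs(a) // abs(b)
    return -quotient if (a < 0) != (b < 0) else quotient


def script_mod(a: int, b: int) -> int:
    """Remainder defined by a%b == a - (a/b)*b; takes the sign of a."""
    return a - script_div(a, b) * b
```

For comparison, Python evaluates -27 // 7 as -4 and -27 % 7 as 1, whereas under the rules above the results are -3 and -6.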
OP_DIV
Opcode (decimal): 150
Opcode (hex): 0x96
Return the integer quotient of a and b. If the result would be a non-integer it is rounded towards zero.
a b OP_DIV -> out
where a and b are interpreted as numeric values
The operator must fail if:
- !isnum(a) || !isnum(b). Fail if either operand is not a numeric value.
- b == 0. Fail if b is equal to any type of zero.
Impact of successful execution:
- stack memory use reduced
- number of elements on stack is reduced by one
Unit tests:
1. a b OP_DIV -> failure where !isnum(a) or !isnum(b). Both operands must be numeric values
2. a 0 OP_DIV -> failure. Division by positive zero (all sizes), negative zero (all sizes), OP_0
3. 27 7 OP_DIV -> 3, 27 -7 OP_DIV -> -3, -27 7 OP_DIV -> -3, -27 -7 OP_DIV -> 3. Check negative operands. Pay attention to sign.
4. check valid results for operands of different lengths 0..4
OP_MOD
Opcode (decimal): 151
Opcode (hex): 0x97
Returns the remainder after dividing a by b. The output will be represented using the least number of bytes required.
a b OP_MOD -> out
where a and b are interpreted as numeric values
The operator must fail if:
- !isnum(a) || !isnum(b). Fail if either operand is not a numeric value.
- b == 0. Fail if b is equal to any type of zero.
Impact of successful execution:
- stack memory use reduced (one element removed)
- number of elements on stack is reduced by one
Unit tests:
1. a b OP_MOD -> failure where !isnum(a) or !isnum(b). Both operands must be numeric values.
2. a 0 OP_MOD -> failure. Division by positive zero (all sizes), negative zero (all sizes), OP_0
3. 27 7 OP_MOD -> 6, 27 -7 OP_MOD -> 6, -27 7 OP_MOD -> -6, -27 -7 OP_MOD -> -6. Check negative operands. Pay attention to sign.
4. check valid results for operands of different lengths 0..4 and returning result zero
New operations
OP_NUM2BIN
OP_NUM2BIN replaces OP_LEFT and uses its opcode.
Opcode (decimal): 128
Opcode (hex): 0x80
Convert the numeric value into a byte sequence of a certain size, taking account of the sign bit. The byte sequence produced uses the little-endian encoding.
a b OP_NUM2BIN -> x
where a and b are interpreted as numeric values. a is the value to be converted to a byte sequence; it can be up to MAX_SCRIPT_ELEMENT_SIZE long and does not need to be minimally encoded. b is the desired size of the result; it must be minimally encoded and <= 4 bytes long. It must be possible for the value a to be encoded in a byte sequence of length b without loss of data.
See also OP_BIN2NUM.
Examples:
2 4 OP_NUM2BIN -> {0x02, 0x00, 0x00, 0x00}
-5 4 OP_NUM2BIN -> {0x05, 0x00, 0x00, 0x80}
The operator must fail if:
- b is not a minimally encoded numeric value.
- b < len(minimal_encoding(a)). a must be able to fit into b bytes.
- b > MAX_SCRIPT_ELEMENT_SIZE. The result would be too large.
Impact of successful execution:
- stack memory use will be increased by b - len(a) - len(b), maximum increase is when b = MAX_SCRIPT_ELEMENT_SIZE
- number of elements on stack is reduced by one
Unit tests:
1. a b OP_NUM2BIN -> failure where !isnum(b). b must be a minimally encoded numeric value.
2. 256 1 OP_NUM2BIN -> failure. Trying to produce a byte sequence which is smaller than the minimum size needed to contain the numeric value.
3. 1 (MAX_SCRIPT_ELEMENT_SIZE+1) OP_NUM2BIN -> failure. Trying to produce an array which is too large.
4. other valid parameters with various results
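A possible reading of the OP_NUM2BIN rules is sketched below in Python. It is not the reference implementation: the operand a is passed as raw bytes, the size b is passed as a plain Python int (so the check that b is a minimally encoded numeric value is out of scope), a ValueError stands in for script failure, and the value 520 for MAX_SCRIPT_ELEMENT_SIZE is an assumption of this sketch.

```python
MAX_SCRIPT_ELEMENT_SIZE = 520  # assumed value of the element size limit


def op_num2bin(a: bytes, size: int) -> bytes:
    """Pad the number encoded in a to exactly size bytes, keeping its sign."""
    if size < 0 or size > MAX_SCRIPT_ELEMENT_SIZE:
        raise ValueError("requested size is out of range")
    # Separate the sign bit (top bit of the last byte) from the magnitude.
    magnitude = int.from_bytes(a, "little")
    negative = bool(a) and bool(a[-1] & 0x80)
    if negative:
        magnitude &= ~(0x80 << (8 * (len(a) - 1)))
    if magnitude == 0:
        return b"\x00" * size  # any form of zero pads to all zero bytes
    # One spare bit is needed for the sign; this is equivalent to
    # requiring size >= len(minimal_encoding(a)).
    if magnitude.bit_length() + 1 > 8 * size:
        raise ValueError("value does not fit into the requested size")
    out = bytearray(magnitude.to_bytes(size, "little"))
    if negative:
        out[-1] |= 0x80  # the sign bit moves to the new last byte
    return bytes(out)
```

Under these assumptions the two examples above are reproduced: {0x02} with size 4 gives {0x02, 0x00, 0x00, 0x00}, and {0x85} (the encoding of -5) with size 4 gives {0x05, 0x00, 0x00, 0x80}.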
OP_BIN2NUM
OP_BIN2NUM replaces OP_RIGHT and uses its opcode.
Opcode (decimal): 129
Opcode (hex): 0x81
Convert the byte sequence into a numeric value, including minimal encoding. The byte sequence must encode the value in little-endian encoding.
a OP_BIN2NUM -> x
See also OP_NUM2BIN.
Notes:
- if a is any form of zero, including negative zero, then OP_0 must be the result
Examples:
- {0x02, 0x00, 0x00, 0x00, 0x00} OP_BIN2NUM -> 2. 0x0200000000 in little-endian encoding has value 2.
- {0x05, 0x00, 0x80} OP_BIN2NUM -> -5. 0x050080 in little-endian encoding has value -5.
The operator must fail if:
- the numeric value is out of the range of acceptable numeric values (currently size is limited to 4 bytes)
Impact of successful execution:
- stack memory use is equal or less than before. Minimal encoding of the byte sequence can produce a result which is shorter.
- the number of elements on the stack remains constant
Unit tests:
1. a OP_BIN2NUM -> failure, when a is a byte sequence whose numeric value is too large to fit into the numeric value type, for both positive and negative values.
2. {0x00} OP_BIN2NUM -> OP_0. Byte sequences, of various lengths, consisting only of zeros should produce an OP_0 (zero length array).
3. {0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00} OP_BIN2NUM -> 1. A large byte sequence, whose numeric value would fit in the numeric value type, is a valid operand.
4. The same test as above, where the length of the input byte sequence is equal to MAX_SCRIPT_ELEMENT_SIZE.
5. {0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80} OP_BIN2NUM -> -1. Same as above, for negative values.
6. {0x80} OP_BIN2NUM -> OP_0. Negative zero, in a byte sequence, should produce zero.
7. {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80} OP_BIN2NUM -> OP_0. Large negative zero, in a byte sequence, should produce zero.
8. other valid parameters with various results
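The behaviour described for OP_BIN2NUM amounts to stripping padding while keeping the sign, then checking the size limit. The Python sketch below illustrates this under stated assumptions: minimally_encode and MAX_NUM_SIZE are hypothetical names, the result is returned as the encoded byte sequence rather than an integer, and a ValueError stands in for script failure.

```python
MAX_NUM_SIZE = 4  # numeric values are currently limited to 4 bytes


def minimally_encode(data: bytes) -> bytes:
    """Strip redundant padding bytes while preserving the sign bit."""
    if not data:
        return b""
    sign = data[-1] & 0x80
    body = bytearray(data)
    body[-1] &= 0x7F          # remove the sign bit before trimming
    while body and body[-1] == 0:
        body.pop()            # drop most significant zero bytes
    if not body:
        return b""            # any form of zero becomes the empty sequence
    if body[-1] & 0x80:
        body.append(sign)     # top bit is in use: sign needs its own byte
    else:
        body[-1] |= sign
    return bytes(body)


def op_bin2num(x: bytes) -> bytes:
    """x OP_BIN2NUM -> out, failing if the value needs more than 4 bytes."""
    result = minimally_encode(x)
    if len(result) > MAX_NUM_SIZE:
        raise ValueError("value is out of the numeric range")
    return result
```

In this model {0x05, 0x00, 0x80} becomes {0x85}, the minimal encoding of -5, and both {0x00} and {0x80} become the empty byte sequence.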
Reference implementation
- OP_AND, OP_OR, OP_XOR: https://reviews.bitcoinabc.org/D1211
- OP_DIV and OP_MOD: https://reviews.bitcoinabc.org/D1212
- OP_SPLIT: https://reviews.bitcoinabc.org/D1228
- OP_BIN2NUM: https://reviews.bitcoinabc.org/D1220
- OP_NUM2BIN: https://reviews.bitcoinabc.org/D1222