Bech32m format for v1+ witness addresses
This document defines an improved variant of Bech32 called Bech32m, and amends BIP173 to use Bech32m for native segregated witness outputs of version 1 and later. Bech32 remains in use for segregated witness outputs of version 0.
BIP: 350
Layer: Applications
Title: Bech32m format for v1+ witness addresses
Authors: Pieter Wuille
Status: Deployed
Type: Specification
Assigned: 2020-12-16
License: BSD-2-Clause
Discussion: 2021-01-05: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-January/018338.html [bitcoin-dev] Bech32m BIP: new checksum, and usage for segwit address
Replaces: 173
Introduction
Abstract
This document defines an improved variant of Bech32 called Bech32m, and amends BIP173 to use Bech32m for native segregated witness outputs of version 1 and later. Bech32 remains in use for segregated witness outputs of version 0.
Copyright
This BIP is licensed under the 2-clause BSD license.
Motivation
BIP173 defined a generic checksummed base 32 encoded format called Bech32. It is in use for segregated witness outputs of version 0 (P2WPKH and P2WSH, see BIP141), and other applications.
Bech32 has an unexpected weakness: whenever the final character is a 'p', inserting or deleting any number of 'q' characters immediately preceding it does not invalidate the checksum. This does not affect existing uses of witness version 0 BIP173 addresses due to their restriction to two specific lengths, but may affect future uses and/or other applications using the Bech32 encoding.
This document addresses that by specifying Bech32m, a variant of Bech32 that mitigates this insertion weakness and related issues.
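The insertion weakness can be reproduced in a few lines. The sketch below is illustrative and not part of the specification: the helper names (`encode`, `is_valid`, `find_ending_in_p`, `insert_q`) and the HRP `test` are assumptions made for this example. It searches for a short checksummed string whose final character is 'p', inserts 'q' characters immediately before that 'p', and re-checks the result, once with the Bech32 constant 1 and once with the Bech32m constant 0x2bc830a3 specified later in this document.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GEN = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
BECH32_CONST = 1
BECH32M_CONST = 0x2bc830a3

def polymod(values):
    chk = 1
    for v in values:
        b = chk >> 25
        chk = (chk & 0x1ffffff) << 5 ^ v
        for i in range(5):
            chk ^= GEN[i] if ((b >> i) & 1) else 0
    return chk

def hrp_expand(s):
    return [ord(x) >> 5 for x in s] + [0] + [ord(x) & 31 for x in s]

def encode(hrp, data, const):
    """Append a checksum for the given final constant and render as text."""
    pm = polymod(hrp_expand(hrp) + data + [0] * 6) ^ const
    checksum = [(pm >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)

def is_valid(s, const):
    """Checksum check only; assumes a lowercase, well-formed string."""
    hrp, _, rest = s.rpartition("1")
    data = [CHARSET.index(c) for c in rest]
    return polymod(hrp_expand(hrp) + data) == const

def find_ending_in_p(hrp, const):
    """Try every one-symbol payload until the encoded string ends in 'p'."""
    for d in range(32):
        s = encode(hrp, [d], const)
        if s.endswith("p"):
            return s
    return None

def insert_q(s, count):
    """Insert `count` 'q' characters immediately before the final character."""
    return s[:-1] + "q" * count + s[-1]

bech32_str = find_ending_in_p("test", BECH32_CONST)
bech32m_str = find_ending_in_p("test", BECH32M_CONST)

for count in (1, 2, 3):
    mutated = insert_q(bech32_str, count)
    print("Bech32 ", mutated, is_valid(mutated, BECH32_CONST))

mutated = insert_q(bech32m_str, 1)
print("Bech32m", mutated, is_valid(mutated, BECH32M_CONST))
```

In this sketch every mutated Bech32 string still passes the checksum check, while the Bech32m string with a single inserted 'q' is rejected.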
Specification
We first specify the new checksum algorithm, and then document how it should be used for future Bitcoin addresses.
Bech32m
Bech32m modifies the checksum of the Bech32 specification, replacing the constant 1 that is xored into the checksum at the end with 0x2bc830a3. The resulting checksum verification and creation algorithm (in Python, cf. the code in the Bech32 section of BIP173) is:
BECH32M_CONST = 0x2bc830a3

def bech32m_polymod(values):
    GEN = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
    chk = 1
    for v in values:
        b = (chk >> 25)
        chk = (chk & 0x1ffffff) << 5 ^ v
        for i in range(5):
            chk ^= GEN[i] if ((b >> i) & 1) else 0
    return chk

def bech32m_hrp_expand(s):
    return [ord(x) >> 5 for x in s] + [0] + [ord(x) & 31 for x in s]

def bech32m_verify_checksum(hrp, data):
    return bech32m_polymod(bech32m_hrp_expand(hrp) + data) == BECH32M_CONST

def bech32m_create_checksum(hrp, data):
    values = bech32m_hrp_expand(hrp) + data
    polymod = bech32m_polymod(values + [0,0,0,0,0,0]) ^ BECH32M_CONST
    return [(polymod >> 5 * (5 - i)) & 31 for i in range(6)]
All other aspects of Bech32 remain unchanged, including its human-readable parts (HRPs).
A combined function to decode both Bech32 and Bech32m simultaneously could be written using:
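from enum import Enum

class Encoding(Enum):
    BECH32 = 1
    BECH32M = 2

# bech32_polymod and bech32_hrp_expand are the BIP173 functions; they are
# identical to bech32m_polymod and bech32m_hrp_expand above.
def bech32_bech32m_verify_checksum(hrp, data):
    check = bech32_polymod(bech32_hrp_expand(hrp) + data)
    if check == 1:
        return Encoding.BECH32
    elif check == BECH32M_CONST:
        return Encoding.BECH32M
    else:
        return None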
which returns either None for failure, or one of the BECH32 / BECH32M enumeration values to indicate successful decoding according to the respective standard.
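To show where the combined checksum check fits, here is a minimal sketch of a string-level decoder that applies the BIP173 string rules (printable ASCII only, no mixed case, at most 90 characters, a non-empty HRP, and at least six checksum characters) before testing both constants. The function name `bech32_decode` and its three-value return match the call used in the address decoding code of this document, but the body is an illustration, not the reference implementation.

```python
from enum import Enum

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32M_CONST = 0x2bc830a3

class Encoding(Enum):
    BECH32 = 1
    BECH32M = 2

def bech32_polymod(values):
    GEN = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
    chk = 1
    for v in values:
        b = chk >> 25
        chk = (chk & 0x1ffffff) << 5 ^ v
        for i in range(5):
            chk ^= GEN[i] if ((b >> i) & 1) else 0
    return chk

def bech32_hrp_expand(s):
    return [ord(x) >> 5 for x in s] + [0] + [ord(x) & 31 for x in s]

def bech32_bech32m_verify_checksum(hrp, data):
    check = bech32_polymod(bech32_hrp_expand(hrp) + data)
    if check == 1:
        return Encoding.BECH32
    if check == BECH32M_CONST:
        return Encoding.BECH32M
    return None

def bech32_decode(bech):
    """Return (hrp, data_without_checksum, encoding) or (None, None, None)."""
    # Only printable ASCII is allowed, and mixed case is rejected.
    if any(ord(x) < 33 or ord(x) > 126 for x in bech):
        return (None, None, None)
    if bech.lower() != bech and bech.upper() != bech:
        return (None, None, None)
    bech = bech.lower()
    # The last '1' separates the HRP from the data part.
    pos = bech.rfind("1")
    if pos < 1 or pos + 7 > len(bech) or len(bech) > 90:
        return (None, None, None)
    symbols = bech[pos + 1:]
    if any(c not in CHARSET for c in symbols):
        return (None, None, None)
    hrp = bech[:pos]
    data = [CHARSET.find(c) for c in symbols]
    spec = bech32_bech32m_verify_checksum(hrp, data)
    if spec is None:
        return (None, None, None)
    # Strip the six checksum symbols from the returned data.
    return (hrp, data[:-6], spec)

print(bech32_decode("A1LQFN3A"))
print(bech32_decode("in1muywd"))
```

The two calls use strings from the test vectors in this document: the first is listed as valid Bech32m, the second as having a checksum that is too short.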
Addresses for segregated witness outputs
Version 0 outputs (specifically, P2WPKH and P2WSH addresses) continue to use Bech32 as specified in BIP173. Addresses for segregated witness outputs version 1 through 16 use Bech32m. Again, all other aspects of the encoding remain the same, including the 'bc' HRP.
To generate an address for a segregated witness output:
- If its witness version is 0, encode it using Bech32.
- If its witness version is 1 or higher, encode it using Bech32m.
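The two rules above could translate into an encoder along the following lines. This is a sketch under stated assumptions: `encode_segwit_address`, `create_checksum`, and `convertbits` are names chosen for this example (the last mirrors the helper called in the decoding code of this document), and the caller is assumed to have already validated the witness version and program length.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32_CONST = 1
BECH32M_CONST = 0x2bc830a3

def polymod(values):
    GEN = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
    chk = 1
    for v in values:
        b = chk >> 25
        chk = (chk & 0x1ffffff) << 5 ^ v
        for i in range(5):
            chk ^= GEN[i] if ((b >> i) & 1) else 0
    return chk

def hrp_expand(s):
    return [ord(x) >> 5 for x in s] + [0] + [ord(x) & 31 for x in s]

def create_checksum(hrp, data, const):
    pm = polymod(hrp_expand(hrp) + data + [0] * 6) ^ const
    return [(pm >> 5 * (5 - i)) & 31 for i in range(6)]

def convertbits(data, frombits, tobits, pad=True):
    """Regroup a sequence of frombits-wide values into tobits-wide values."""
    acc = 0
    bits = 0
    ret = []
    maxv = (1 << tobits) - 1
    for value in data:
        if value < 0 or (value >> frombits):
            return None
        acc = (acc << frombits) | value
        bits += frombits
        while bits >= tobits:
            bits -= tobits
            ret.append((acc >> bits) & maxv)
    if pad:
        if bits:
            ret.append((acc << (tobits - bits)) & maxv)
    elif bits >= frombits or ((acc << (tobits - bits)) & maxv):
        return None
    return ret

def encode_segwit_address(hrp, witver, witprog):
    """Version 0 uses the Bech32 constant; versions 1..16 use Bech32m."""
    const = BECH32_CONST if witver == 0 else BECH32M_CONST
    data = [witver] + convertbits(witprog, 8, 5)
    checksum = create_checksum(hrp, data, const)
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)

program = bytes.fromhex("751e76e8199196d454941c45d1b3a323f1433bd6")
print(encode_segwit_address("bc", 0, program))
print(encode_segwit_address("bc", 16, bytes.fromhex("751e")))
```

The inputs are witness programs taken from the test vectors in this document, so the output can be compared against the listed addresses after lowercasing them.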
The following code demonstrates the checks that need to be performed. Refer to the Python code linked in the reference implementation section below for full details of the called functions.
def decode(hrp, addr):
    hrpgot, data, spec = bech32_decode(addr)
    if hrpgot != hrp:
        return (None, None)
    decoded = convertbits(data[1:], 5, 8, False)
    # Witness programs are between 2 and 40 bytes in length.
    if decoded is None or len(decoded) < 2 or len(decoded) > 40:
        return (None, None)
    # Witness versions are in range 0..16.
    if data[0] > 16:
        return (None, None)
    # Witness v0 programs must be exactly length 20 or 32.
    if data[0] == 0 and len(decoded) != 20 and len(decoded) != 32:
        return (None, None)
    # Witness v0 uses Bech32; v1 through v16 use Bech32m.
    if (data[0] == 0 and spec != Encoding.BECH32) or (data[0] != 0 and spec != Encoding.BECH32M):
        return (None, None)
    # Success.
    return (data[0], decoded)
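The pair returned on success (witness version and witness program) maps directly onto a scriptPubKey as defined in BIP141: one version opcode followed by a single push of the program. A minimal sketch of that mapping follows; the function name `segwit_scriptpubkey` is an assumption made for this example.

```python
def segwit_scriptpubkey(witver, witprog):
    """Build the scriptPubKey bytes for a witness version and program."""
    if not 0 <= witver <= 16:
        raise ValueError("witness version must be in 0..16")
    if not 2 <= len(witprog) <= 40:
        raise ValueError("witness program must be 2 to 40 bytes")
    # OP_0 is 0x00; OP_1 through OP_16 are 0x51 through 0x60.
    version_opcode = 0x00 if witver == 0 else 0x50 + witver
    # A program of 2..40 bytes is pushed with an opcode equal to its length.
    return bytes([version_opcode, len(witprog)]) + bytes(witprog)

# Witness version 16 with the two-byte program 751e.
print(segwit_scriptpubkey(16, [0x75, 0x1e]).hex())  # 6002751e
```

The printed value matches the scriptPubKey listed for BC1SW50QGDZ25J in the test vectors of this document.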
Error locating
Bech32m, like Bech32, does support locating the positions of a few substitution errors. To combine this functionality with the segregated witness addresses proposed by this document, simply try locating errors for both Bech32 and Bech32m. If only one finds error locations, report that one. If both do (which should be very rare), there are a number of options:
- Report the one that needs fewer corrections (if they differ).
- Eliminate the response(s) that are inconsistent. Any symbol that isn't on an error location can be checked. For example, if the witness version symbol is not an error location, and it doesn't correspond to the specification used (0 for Bech32, 1+ for Bech32m), that response can be eliminated.
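One possible way to combine these options is sketched below. It does not implement error location itself: the two `*_errors` arguments are assumed to come from separate Bech32 and Bech32m locators (hypothetical here), each giving a list of suspected error positions within the data part, or None if that locator found nothing. The function name `choose_error_report` and the ordering of the checks are choices made for this example only.

```python
def choose_error_report(data, bech32_errors, bech32m_errors):
    """Pick which locator result to report, or None if undecided.

    data: 5-bit symbols after the separator (witness version symbol first).
    bech32_errors / bech32m_errors: error positions in data, or None.
    """
    found = {}
    if bech32_errors is not None:
        found["bech32"] = bech32_errors
    if bech32m_errors is not None:
        found["bech32m"] = bech32m_errors

    # Zero or one locator succeeded: report that one (or nothing).
    if len(found) <= 1:
        return next(iter(found.items()), None)

    def consistent(spec, errors):
        # If the version symbol is itself a suspected error, it cannot be checked.
        if 0 in errors:
            return True
        # Version 0 belongs to Bech32, versions 1+ belong to Bech32m.
        return (data[0] == 0) == (spec == "bech32")

    remaining = {s: e for s, e in found.items() if consistent(s, e)}
    if len(remaining) == 1:
        return next(iter(remaining.items()))
    if len(remaining) == 2:
        a, b = remaining["bech32"], remaining["bech32m"]
        if len(a) != len(b):
            # Prefer the response that needs fewer corrections.
            return ("bech32", a) if len(a) < len(b) else ("bech32m", b)
    return None

# Version symbol 0 is not a suspected error, so only Bech32 is consistent.
print(choose_error_report([0, 14, 20], [2], [1]))  # ('bech32', [2])
```

Returning None for a tie leaves the decision to the caller, for example by showing both candidates to the user.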
Compatibility
This document introduces a new encoding for v1 segregated witness outputs and higher versions. There should not be any compatibility issues on the receiver side; no wallets are creating v1 segregated witness addresses yet, as the output type is not usable on mainnet.
On the other hand, the Bech32m proposal breaks forward-compatibility for sending to v1 and higher version segregated witness addresses. This incompatibility is intentional. An alternative design was considered where Bech32 remained in use for certain subsets of future addresses, but ultimately discarded. By introducing a clean break, we protect not only new software but also existing senders from the mutation issue, as new addresses will be incompatible with the existing Bech32 address validation. Experiments by Taproot proponents had shown that hardly any wallets and services supported sending to higher segregated witness output versions, so little is lost by breaking forward-compatibility. Furthermore, those experiments identified cases in which segregated witness implementations would have caused wallets to burn funds when sending to version 1 addresses. In case it is still in use, the chosen approach will prevent such software from destroying funds when attempting to send to a Bech32m address.
Reference implementations
- Reference encoder and decoder:
- Fancy decoder that localizes errors:
Test vectors
Implementation advice
Experiments testing BIP173 implementations found that many wallets and services did not support sending to higher version segregated witness outputs. In anticipation of the proposed Taproot soft fork introducing v1 segregated witness outputs on the network, we emphatically recommend employing the complete set of test vectors provided below as well as ensuring that your implementation supports sending to v1 and higher versions. All higher versions of native segregated witness outputs should be recognized as valid recipients. As higher versions are not defined on the network, no wallet should ever create them and no recipient should ever provide them to a sender. Nor should a recipient ever want to falsely provide them, as the recipient would simply see a payment intended for themselves burned instead. However, by defining higher versions as valid recipients now, future soft forks introducing higher versions of native segwit outputs will be forward-compatible with all wallets correctly implementing the Bech32m specification.
Test vectors for Bech32m
The following strings are valid Bech32m:
- A1LQFN3A
- a1lqfn3a
- an83characterlonghumanreadablepartthatcontainsthetheexcludedcharactersbioandnumber11sg7hg6
- abcdef1l7aum6echk45nj3s0wdvt2fg8x9yrzpqzd3ryx
- 11llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllludsr8
- split1checkupstagehandshakeupstreamerranterredcaperredlc445v
- ?1v759aa
The following strings are not valid Bech32m (with reason for invalidity):
- 0x20 + 1xj0phk: HRP character out of range
- 0x7F + 1g6xzxy: HRP character out of range
- 0x80 + 1vctc34: HRP character out of range
- an84characterslonghumanreadablepartthatcontainsthetheexcludedcharactersbioandnumber11d6pts4: overall max length exceeded
- qyrz8wqd2c9m: No separator character
- 1qyrz8wqd2c9m: Empty HRP
- y1b0jsk6g: Invalid data character
- lt1igcx5c0: Invalid data character
- in1muywd: Too short checksum
- mm1crxm3i: Invalid character in checksum
- au1s5cgom: Invalid character in checksum
- M1VUXWEZ: checksum calculated with uppercase form of HRP
- 16plkw9: empty HRP
- 1p2gdwpf: empty HRP
Test vectors for v0-v16 native segregated witness addresses
The following list gives valid segwit addresses and the scriptPubKey that they translate to in hex.
- BC1QW508D6QEJXTDG4Y5R3ZARVARY0C5XW7KV8F3T4: 0014751e76e8199196d454941c45d1b3a323f1433bd6
- tb1qrp33g0q5c5txsp9arysrx4k6zdkfs4nce4xj0gdcccefvpysxf3q0sl5k7: 00201863143c14c5166804bd19203356da136c985678cd4d27a1b8c6329604903262
- bc1pw508d6qejxtdg4y5r3zarvary0c5xw7kw508d6qejxtdg4y5r3zarvary0c5xw7kt5nd6y: 5128751e76e8199196d454941c45d1b3a323f1433bd6751e76e8199196d454941c45d1b3a323f1433bd6
- BC1SW50QGDZ25J: 6002751e
- bc1zw508d6qejxtdg4y5r3zarvaryvaxxpcs: 5210751e76e8199196d454941c45d1b3a323
- tb1qqqqqp399et2xygdj5xreqhjjvcmzhxw4aywxecjdzew6hylgvsesrxh6hy: 0020000000c4a5cad46221b2a187905e5266362b99d5e91c6ce24d165dab93e86433
- tb1pqqqqp399et2xygdj5xreqhjjvcmzhxw4aywxecjdzew6hylgvsesf3hn0c: 5120000000c4a5cad46221b2a187905e5266362b99d5e91c6ce24d165dab93e86433
- bc1p0xlxvlhemja6c4dqv22uapctqupfhlxm9h8z3k2e72q4k9hcz7vqzk5jj0: 512079be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798
The following list gives invalid segwit addresses and the reason for their invalidity.
- tc1p0xlxvlhemja6c4dqv22uapctqupfhlxm9h8z3k2e72q4k9hcz7vq5zuyut: Invalid human-readable part
- bc1p0xlxvlhemja6c4dqv22uapctqupfhlxm9h8z3k2e72q4k9hcz7vqh2y7hd: Invalid checksum (Bech32 instead of Bech32m)
- tb1z0xlxvlhemja6c4dqv22uapctqupfhlxm9h8z3k2e72q4k9hcz7vqglt7rf: Invalid checksum (Bech32 instead of Bech32m)
- BC1S0XLXVLHEMJA6C4DQV22UAPCTQUPFHLXM9H8Z3K2E72Q4K9HCZ7VQ54WELL: Invalid checksum (Bech32 instead of Bech32m)
- bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kemeawh: Invalid checksum (Bech32m instead of Bech32)
- tb1q0xlxvlhemja6c4dqv22uapctqupfhlxm9h8z3k2e72q4k9hcz7vq24jc47: Invalid checksum (Bech32m instead of Bech32)
- bc1p38j9r5y49hruaue7wxjce0updqjuyyx0kh56v8s25huc6995vvpql3jow4: Invalid character in checksum
- BC130XLXVLHEMJA6C4DQV22UAPCTQUPFHLXM9H8Z3K2E72Q4K9HCZ7VQ7ZWS8R: Invalid witness version
- bc1pw5dgrnzv: Invalid program length (1 byte)
- bc1p0xlxvlhemja6c4dqv22uapctqupfhlxm9h8z3k2e72q4k9hcz7v8n0nx0muaewav253zgeav: Invalid program length (41 bytes)
- BC1QR508D6QEJXTDG4Y5R3ZARVARYV98GJ9P: Invalid program length for witness version 0 (per BIP141)
- tb1p0xlxvlhemja6c4dqv22uapctqupfhlxm9h8z3k2e72q4k9hcz7vq47Zagq: Mixed case
- bc1p0xlxvlhemja6c4dqv22uapctqupfhlxm9h8z3k2e72q4k9hcz7v07qwwzcrf: zero padding of more than 4 bits
- tb1p0xlxvlhemja6c4dqv22uapctqupfhlxm9h8z3k2e72q4k9hcz7vpggkg4j: Non-zero padding in 8-to-5 conversion
- bc1gmk9yu: Empty data section
Appendix: checksum design & properties
Checksums are used to detect errors introduced into data during transfer. A hash function-based checksum such as Base58Check detects any type of error uniformly, but not all classes of errors are equally likely to occur in practice. Bech32 prioritizes detection of substitution errors, but improving detection of one error class inevitably wor
[Content truncated — view full spec at source]