
Base32 address format for native v0-16 witness outputs

This document proposes a checksummed base32 format, "Bech32", and a standard for native segregated witness output addresses using it.

Pieter Wuille · Updated Mar 29, 2026

Specification

  BIP: 173
  Layer: Applications
  Title: Base32 address format for native v0-16 witness outputs
  Authors: Pieter Wuille 
           Greg Maxwell 
  Comments-Summary: No comments yet.
  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0173
  Status: Deployed
  Type: Informational
  Assigned: 2017-03-20
  License: BSD-2-Clause
  Replaces: 142
  Proposed-Replacement: 350

Introduction

Abstract

This document proposes a checksummed base32 format, "Bech32", and a standard for native segregated witness output addresses using it.

Copyright

This BIP is licensed under the 2-clause BSD license.

Motivation

For most of its history, Bitcoin has relied on base58 addresses with a truncated double-SHA256 checksum. They were part of the original software and their scope was extended in BIP13 for Pay-to-script-hash (P2SH). However, both the character set and the checksum algorithm have limitations:

  • Base58 needs a lot of space in QR codes, as it cannot use the alphanumeric mode.
  • The mixed case in base58 makes it inconvenient to reliably write down, type on mobile keyboards, or read out loud.
  • The double SHA256 checksum is slow and has no error-detection guarantees.
  • Most of the research on error-detecting codes only applies to character-set sizes that are a prime power, which 58 is not.
  • Base58 decoding is complicated and relatively slow.

Included in the Segregated Witness proposal is a new class of outputs (witness programs, see BIP141), along with two instances of it ("P2WPKH" and "P2WSH", see BIP143). Their functionality is available indirectly to older clients by embedding in P2SH outputs, but for optimal efficiency and security it is best to use it directly. In this document we propose a new address format for native witness outputs (current and future versions).

This replaces BIP142, and was previously discussed here (summarized here).

Examples

All examples use public key 0279BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798. The P2WSH examples use "key OP_CHECKSIG" as the script, where key is the public key above.

  • Mainnet P2WPKH: bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4
  • Testnet P2WPKH: tb1qw508d6qejxtdg4y5r3zarvary0c5xw7kxpjzsx
  • Mainnet P2WSH: bc1qrp33g0q5c5txsp9arysrx4k6zdkfs4nce4xj0gdcccefvpysxf3qccfmv3
  • Testnet P2WSH: tb1qrp33g0q5c5txsp9arysrx4k6zdkfs4nce4xj0gdcccefvpysxf3q0sl5k7

Specification

We first describe the general checksummed base32 format called Bech32 and then define Segregated Witness addresses using it.

Bech32

A Bech32 string is at most 90 characters long and consists of:

  • The human-readable part, which is intended to convey the type of data, or anything else that is relevant to the reader. This part MUST contain 1 to 83 US-ASCII characters, with each character having a value in the range [33-126]. HRP validity may be further restricted by specific applications.
  • The separator, which is always "1". In case "1" is allowed inside the human-readable part, the last one in the string is the separator.
  • The data part, which is at least 6 characters long and only consists of alphanumeric characters excluding "1", "b", "i", and "o".

             0  1  2  3  4  5  6  7
      +0     q  p  z  r  y  9  x  8
      +8     g  f  2  t  v  d  w  0
      +16    s  3  j  n  5  4  k  h
      +24    c  e  6  m  u  a  7  l
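
As a minimal sketch of how the table can be used, the 32 characters can be read row by row into a single string, so that a character's value is its index. The helper names chars_to_values and values_to_chars are hypothetical and only serve this illustration.

```python
# The 32-character alphabet, read row by row from the table above.
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def chars_to_values(data_part):
    """Map each lowercase data-part character to its 5-bit value.

    Hypothetical helper; raises ValueError for characters outside the table.
    """
    return [CHARSET.index(c) for c in data_part]


def values_to_chars(values):
    """Inverse mapping: 5-bit values back to characters (hypothetical helper)."""
    return "".join(CHARSET[v] for v in values)


print(chars_to_values("qpzry9x8"))  # [0, 1, 2, 3, 4, 5, 6, 7]
```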

      Checksum

      The last six characters of the data part form a checksum and contain no information. Valid strings MUST pass the criteria for validity specified by the Python3 code snippet below. The function bech32_verify_checksum must return true when its arguments are:

      • hrp: the human-readable part as a string
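      • data: the data part as a list of integers representing the characters after conversion using the table above

      def bech32_polymod(values):
        # Generator constants; GEN[i] is XORed in when bit i of the shifted-out top group is set.
        GEN = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
        chk = 1
        for v in values:
          b = (chk >> 25)  # top 5 bits of the 30-bit checksum state
          chk = (chk & 0x1ffffff) << 5 ^ v  # shift the state left by 5 bits and mix in the next value
          for i in range(5):
            chk ^= GEN[i] if ((b >> i) & 1) else 0
        return chk

      def bech32_hrp_expand(s):
        # High bits of each character, a zero, then the low 5 bits of each character.
        return [ord(x) >> 5 for x in s] + [0] + [ord(x) & 31 for x in s]

      def bech32_verify_checksum(hrp, data):
        # A valid string leaves the checksum state equal to 1.
        return bech32_polymod(bech32_hrp_expand(hrp) + data) == 1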

      This implements a BCH code that guarantees detection of any error affecting at most 4 characters and has less than a 1 in 10^9 chance of failing to detect more errors. More details about the properties can be found in the Checksum Design appendix. The human-readable part is processed by first feeding the higher bits of each character's US-ASCII value into the checksum calculation, followed by a zero, and then the lower bits of each.
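
The following sketch applies the verification rule to the test vector "a12uel5l" and shows the expansion of the human-readable part "bc". It restates the functions from this section so that it runs on its own; the CHARSET string is an assumption of this example, read row by row from the character table.

```python
# Alphabet read row by row from the character table (assumed layout).
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def bech32_polymod(values):
    gen = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
    chk = 1
    for v in values:
        b = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ v
        for i in range(5):
            if (b >> i) & 1:
                chk ^= gen[i]
    return chk


def bech32_hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def bech32_verify_checksum(hrp, data):
    return bech32_polymod(bech32_hrp_expand(hrp) + data) == 1


# High bits of 'b' and 'c', a zero, then their low bits.
print(bech32_hrp_expand("bc"))  # [3, 3, 0, 2, 3]

good = [CHARSET.index(c) for c in "2uel5l"]  # data part of "a12uel5l"
bad = [CHARSET.index(c) for c in "2uel5q"]   # last character altered
print(bech32_verify_checksum("a", good))  # True
print(bech32_verify_checksum("a", bad))   # False
```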

      To construct a valid checksum given the human-readable part and (non-checksum) values of the data-part characters, the code below can be used:

      def bech32_create_checksum(hrp, data):
        values = bech32_hrp_expand(hrp) + data
        polymod = bech32_polymod(values + [0,0,0,0,0,0]) ^ 1
        return [(polymod >> 5 * (5 - i)) & 31 for i in range(6)]
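
A complete encoder can be sketched by appending the checksum to the data values and mapping everything through the character table. The bech32_encode name matches the call shown in the test vectors; the body below is an illustrative sketch rather than a quotation of any reference implementation, and CHARSET is an assumed row-by-row reading of the table.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def bech32_polymod(values):
    gen = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
    chk = 1
    for v in values:
        b = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ v
        for i in range(5):
            if (b >> i) & 1:
                chk ^= gen[i]
    return chk


def bech32_hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def bech32_create_checksum(hrp, data):
    values = bech32_hrp_expand(hrp) + data
    polymod = bech32_polymod(values + [0] * 6) ^ 1
    return [(polymod >> 5 * (5 - i)) & 31 for i in range(6)]


def bech32_encode(hrp, data):
    """Join HRP, separator, data values and checksum into one lowercase string."""
    combined = data + bech32_create_checksum(hrp, data)
    return hrp + "1" + "".join(CHARSET[d] for d in combined)


print(bech32_encode("a", []))  # a12uel5l
```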
      

      Error correction

      One of the properties of these BCH codes is that they can be used for error correction. An unfortunate side effect of error correction is that it erodes error detection: correction changes invalid inputs into valid inputs, but if more than a few errors were made then the valid input may not be the correct input. Use of an incorrect but valid input can cause funds to be lost irrecoverably. Because of this, implementations SHOULD NOT implement correction beyond potentially suggesting to the user where in the string an error might be found, without suggesting the correction to make.

      Uppercase/lowercase

      The lowercase form is used when determining a character's value for checksum purposes.

      Encoders MUST always output an all-lowercase Bech32 string. If an uppercase version of the encoding result is desired (e.g., for presentation purposes or QR code use), then an uppercasing procedure can be performed external to the encoding process.

      Decoders MUST NOT accept strings where some characters are uppercase and some are lowercase (such strings are referred to as mixed case strings).

      For presentation, lowercase is usually preferable, but inside QR codes uppercase SHOULD be used, as those permit the use of alphanumeric mode, which is 45% more compact than the normal byte mode.
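
The case rules can be illustrated with a decoder sketch that rejects mixed-case input, lowercases the rest, and then applies the length, character-set and checksum checks described earlier. The function name bech32_decode and its (None, None) failure value are assumptions of this sketch.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def bech32_polymod(values):
    gen = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
    chk = 1
    for v in values:
        b = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ v
        for i in range(5):
            if (b >> i) & 1:
                chk ^= gen[i]
    return chk


def bech32_hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def bech32_decode(bech):
    """Return (hrp, data without checksum), or (None, None) if invalid."""
    if any(ord(c) < 33 or ord(c) > 126 for c in bech):
        return None, None  # character outside the allowed US-ASCII range
    if bech.lower() != bech and bech.upper() != bech:
        return None, None  # mixed case is not accepted
    bech = bech.lower()    # the lowercase form is used for the checksum
    pos = bech.rfind("1")  # the last "1" is the separator
    if pos < 1 or pos + 7 > len(bech) or len(bech) > 90:
        return None, None  # empty HRP, data part too short, or string too long
    hrp, data_part = bech[:pos], bech[pos + 1:]
    if any(c not in CHARSET for c in data_part):
        return None, None
    data = [CHARSET.index(c) for c in data_part]
    if bech32_polymod(bech32_hrp_expand(hrp) + data) != 1:
        return None, None
    return hrp, data[:-6]


print(bech32_decode("A12UEL5L"))  # ('a', [])
print(bech32_decode("a12UEL5L"))  # (None, None)
```

For QR codes, the uppercase form can be produced outside the encoder, for example with str.upper() on the encoder's lowercase output.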

      Segwit address format

      A segwit address is a Bech32 encoding of:

      • The human-readable part "bc" for mainnet, and "tb" for testnet.
      • The data-part values:
        • 1 character (representing 5 bits of data): the witness version
        • A conversion of the 2-to-40-byte witness program (as defined by BIP141) to base32:
          • Start with the bits of the witness program, most significant bit per byte first.
          • Re-arrange those bits into groups of 5, and pad with zeroes at the end if needed.
          • Translate those bits to characters using the table above.
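
The encoding steps above can be sketched as follows. The helper names bytes_to_5bit and segwit_encode are hypothetical, and the example reproduces the mainnet P2WPKH address from the Examples section using the 20-byte witness program given in the test vectors.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def bech32_polymod(values):
    gen = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
    chk = 1
    for v in values:
        b = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ v
        for i in range(5):
            if (b >> i) & 1:
                chk ^= gen[i]
    return chk


def bech32_hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def bech32_create_checksum(hrp, data):
    polymod = bech32_polymod(bech32_hrp_expand(hrp) + data + [0] * 6) ^ 1
    return [(polymod >> 5 * (5 - i)) & 31 for i in range(6)]


def bytes_to_5bit(program):
    """Regroup bytes into 5-bit values, MSB first, zero-padding the last group."""
    acc, bits, out = 0, 0, []
    for byte in program:
        acc = (acc << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
        acc &= (1 << bits) - 1  # keep only the bits not yet emitted
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out


def segwit_encode(hrp, witver, program):
    """Hypothetical helper: witness version, then the program in base32."""
    data = [witver] + bytes_to_5bit(program)
    combined = data + bech32_create_checksum(hrp, data)
    return hrp + "1" + "".join(CHARSET[d] for d in combined)


program = bytes.fromhex("751e76e8199196d454941c45d1b3a323f1433bd6")
print(segwit_encode("bc", 0, program))
# bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4
```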

        Decoding

        Software interpreting a segwit address:

        • MUST verify that the human-readable part is "bc" for mainnet and "tb" for testnet.
        • MUST verify that the first decoded data value (the witness version) is between 0 and 16, inclusive.
        • Convert the rest of the data to bytes:
          • Translate the values to 5 bits, most significant bit first.
          • Re-arrange those bits into groups of 8 bits. Any incomplete group at the end MUST be 4 bits or less, MUST be all zeroes, and is discarded.
          • There MUST be between 2 and 40 groups, which are interpreted as the bytes of the witness program.

        Decoders SHOULD enforce known-length restrictions on witness programs. For example, BIP141 specifies "If the version byte is 0, but the witness program is neither 20 nor 32 bytes, the script must fail."

        As a result of the previous rules, addresses are always between 14 and 74 characters long, and their length modulo 8 cannot be 0, 3, or 5. Version 0 witness addresses are always 42 or 62 characters, but implementations MUST allow the use of any version.
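
The length claims can be checked with a little arithmetic: an address consists of the human-readable part, the separator, one version character, the witness program in 5-bit groups, and six checksum characters. The sketch below assumes a two-character human-readable part such as "bc" or "tb"; the function name is hypothetical.

```python
def segwit_address_length(program_len, hrp_len=2):
    """Length of a segwit address for a witness program of program_len bytes."""
    base32_chars = (program_len * 8 + 4) // 5  # ceil(8 * n / 5)
    return hrp_len + 1 + 1 + base32_chars + 6  # hrp, "1", version, program, checksum


lengths = [segwit_address_length(n) for n in range(2, 41)]
print(min(lengths), max(lengths))                  # 14 74
print(sorted({length % 8 for length in lengths}))  # [1, 2, 4, 6, 7]
print(segwit_address_length(20), segwit_address_length(32))  # 42 62
```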

        Implementations should take special care when converting the address to a scriptPubKey, where witness version n is stored as OP_n. OP_0 is encoded as 0x00, but OP_1 through OP_16 are encoded as 0x51 through 0x60 (81 to 96 in decimal). If a Bech32 address is converted to an incorrect scriptPubKey, the result will likely be either unspendable or insecure.
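
The opcode mapping described above can be sketched as below. The function name witness_script_pubkey is hypothetical; the sketch relies on the fact that a data push of 2 to 40 bytes is encoded as a single length byte followed by the data.

```python
def witness_script_pubkey(witver, program):
    """Build the scriptPubKey bytes: OP_n, then a direct push of the program."""
    if not 0 <= witver <= 16:
        raise ValueError("witness version must be between 0 and 16")
    if not 2 <= len(program) <= 40:
        raise ValueError("witness program must be 2 to 40 bytes")
    # OP_0 is 0x00; OP_1 through OP_16 are 0x51 through 0x60.
    opcode = 0x00 if witver == 0 else 0x50 + witver
    return bytes([opcode, len(program)]) + bytes(program)


program = bytes.fromhex("751e76e8199196d454941c45d1b3a323f1433bd6")
print(witness_script_pubkey(0, program).hex())
# 0014751e76e8199196d454941c45d1b3a323f1433bd6
```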

        Compatibility

        Only new software will be able to use these addresses, and only for receivers with segwit-enabled new software. In all other cases, P2SH or P2PKH addresses can be used.

        Rationale

        Reference implementations

        • Reference encoder and decoder:
          • For C
          • For C++
          • For JavaScript
          • For Go
          • For Python
          • For Haskell
          • For Ruby
          • For Rust
        • Fancy decoder that localizes errors:
          • For JavaScript (demo website)

        Registered Human-readable Prefixes

        SatoshiLabs maintains a full list of registered human-readable parts for other cryptocurrencies:

        SLIP-0173 : Registered human-readable parts for BIP-0173

        Appendices

        Test vectors

        The following strings are valid Bech32:

        • A12UEL5L
        • a12uel5l
        • an83characterlonghumanreadablepartthatcontainsthenumber1andtheexcludedcharactersbio1tt5tgs
        • abcdef1qpzry9x8gf2tvdw0s3jn54khce6mua7lmqqqxw
        • 11qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqc8247j
        • split1checkupstagehandshakeupstreamerranterredcaperred2y9e3w
        • ?1ezyfcl

        WARNING: During conversion to US-ASCII some encoders may set unmappable characters to a valid US-ASCII character, such as '?'. For example:

        >>> bech32_encode('\x80'.encode('ascii', 'replace').decode('ascii'), [])
        '?1ezyfcl'
        

        The following strings are not valid Bech32 (with reason for invalidity):

        • 0x20 + 1nwldj5: HRP character out of range
        • 0x7F + 1axkwrx: HRP character out of range
        • 0x80 + 1eym55h: HRP character out of range
        • an84characterslonghumanreadablepartthatcontainsthenumber1andtheexcludedcharactersbio1569pvx: overall max length exceeded
        • pzry9x0s0muk: No separator character
        • 1pzry9x0s0muk: Empty HRP
        • x1b4n0q5v: Invalid data character
        • li1dgmt3: Too short checksum
        • de1lg7wt + 0xFF: Invalid character in checksum
        • A1G7SGD8: checksum calculated with uppercase form of HRP
        • 10a06t8: empty HRP
        • 1qzzfhee: empty HRP

        The following list gives valid segwit addresses and the scriptPubKey that they translate to in hex.

        • BC1QW508D6QEJXTDG4Y5R3ZARVARY0C5XW7KV8F3T4: 0014751e76e8199196d454941c45d1b3a323f1433bd6
        • tb1qrp33g0q5c5txs

[Content truncated; view full spec at source]
