BEP 52 · specification · Draft · p2p

The BitTorrent Protocol Specification v2


Bram Cohen · Updated Mar 29, 2026

Specification

BitTorrent is a protocol for distributing files. It identifies content
by URL and is designed to integrate seamlessly with the web. Its
advantage over plain HTTP is that when multiple downloads of the same
file happen concurrently, the downloaders upload to each other, making
it possible for the file source to support very large numbers of
downloaders with only a modest increase in its load.

----------------------------------------------------------

A BitTorrent file distribution consists of these entities:

  • An ordinary web server
  • A static 'metainfo' file
  • A BitTorrent tracker
  • An 'original' downloader
  • The end user web browsers
  • The end user downloaders

There are ideally many end users for a single file.

    ----------------------------------------------------------

    To start serving, a host goes through the following steps:

    1. Start running a tracker (or, more likely, have one running already).
    2. Start running an ordinary web server, such as Apache, or have one already.
    3. Associate the extension .torrent with mimetype application/x-bittorrent on their web server (or have done so already).
    4. Generate a metainfo (.torrent) file using the complete file to be served and the URL of the tracker.
    5. Put the metainfo file on the web server.
    6. Link to the metainfo (.torrent) file from some other web page.
    7. Start a downloader which already has the complete file (the 'origin').

    ------------------------------------------------

    To start downloading, a user does the following:

    1. Install BitTorrent (or have done so already).
    2. Surf the web.
    3. Click on a link to a .torrent file.
    4. Select where to save the file locally, or select a partial download to resume.
    5. Wait for download to complete.
    6. Tell downloader to exit (it keeps uploading until this happens).

    ---------

    bencoding

  • Strings are length-prefixed base ten followed by a colon and the string.
    For example 4:spam corresponds to 'spam'.

  • Integers are represented by an 'i' followed by the number in base 10
    followed by an 'e'. For example i3e corresponds to 3 and
    i-3e corresponds to -3. Integers have no size limitation.
    i-0e is invalid. All encodings with a leading zero, such as i03e,
    are invalid, other than i0e, which of course corresponds to 0.

  • Lists are encoded as an 'l' followed by their elements (also
    bencoded) followed by an 'e'. For example l4:spam4:eggse
    corresponds to ['spam', 'eggs'].

  • Dictionaries are encoded as a 'd' followed by a list of alternating
    keys and their corresponding values followed by an 'e'. For example,
    d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo',
    'spam': 'eggs'} and d4:spaml1:a1:bee corresponds to
    {'spam': ['a', 'b']}. Keys must be strings and appear in sorted order
    (sorted as raw strings, not alphanumerics).

    Note that in the context of bencoding, strings, including dictionary keys,
    are arbitrary byte sequences (uint8_t[]).

    BEP authors are encouraged to use ASCII-compatible strings for dictionary keys
    and UTF-8 for human-readable data. Implementations must not rely on this.
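
The bencoding rules above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `bencode` and `bdecode` are hypothetical, strings decode to raw `bytes` (since keys and values are arbitrary byte sequences), and rejecting leading zeros in string length prefixes is an assumption that goes slightly beyond what the text above states for integers.

```python
import re

_INT_RE = re.compile(rb"0|-?[1-9][0-9]*")  # rejects i-0e and leading zeros
_LEN_RE = re.compile(rb"0|[1-9][0-9]*")    # assumption: same strictness for lengths


def bencode(value):
    """Encode int, bytes/str, list or dict values as bencoded bytes."""
    if isinstance(value, bool):
        raise TypeError("bool has no bencoded form")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        if not all(isinstance(k, bytes) for k, _ in items):
            raise TypeError("dictionary keys must be strings")
        items.sort(key=lambda kv: kv[0])  # sorted as raw byte strings
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def bdecode(data):
    """Strictly decode bencoded bytes; raises ValueError on invalid input."""
    value, end = _decode(data, 0)
    if end != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def _decode(data, i):
    c = data[i:i + 1]
    if c == b"i":
        end = data.index(b"e", i)
        digits = data[i + 1:end]
        if not _INT_RE.fullmatch(digits):
            raise ValueError("invalid integer encoding")
        return int(digits), end + 1
    if c == b"l":
        i += 1
        out = []
        while data[i:i + 1] != b"e":
            item, i = _decode(data, i)
            out.append(item)
        return out, i + 1
    if c == b"d":
        i += 1
        out, last = {}, None
        while data[i:i + 1] != b"e":
            key, i = _decode(data, i)
            if not isinstance(key, bytes):
                raise ValueError("dictionary keys must be strings")
            if last is not None and key <= last:
                raise ValueError("dictionary keys must be sorted and unique")
            last = key
            val, i = _decode(data, i)
            out[key] = val
        return out, i + 1
    if c.isdigit():
        colon = data.index(b":", i)
        if not _LEN_RE.fullmatch(data[i:colon]):
            raise ValueError("invalid string length")
        end = colon + 1 + int(data[i:colon])
        if end > len(data):
            raise ValueError("string runs past end of input")
        return data[colon + 1:end], end
    raise ValueError("unexpected byte at offset %d" % i)
```

For instance, `bencode({'cow': 'moo', 'spam': 'eggs'})` produces the dictionary encoding quoted above, and `bdecode(b"i03e")` raises `ValueError`.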

    --------------

    metainfo files

    Metainfo files (also known as .torrent files) are bencoded dictionaries
    with the following keys:

    announce

    The URL of the tracker.

    info

    This maps to a dictionary, with keys described below.

    piece layers

    A dictionary of strings. For each file in the file tree that is larger than the piece size
    it contains one string value.
    The keys are the merkle roots while the values consist of concatenated hashes
    of one layer within that merkle tree.
    The layer is chosen so that one hash covers piece length bytes.
    For example if the piece size is 16KiB then the leaf hashes are used.
    If a piece size of 128KiB is used then the 3rd layer up from the leaf hashes is used.
    Layer hashes which exclusively cover data beyond the end of file,
    i.e. are only needed to balance the tree, are omitted.
    All hashes are stored in their binary format.

    A torrent is not valid if this field is absent, the contained hashes do not match
    the merkle roots or are not from the correct layer.
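
The layer selection rule can be worked through numerically. The sketch below assumes 16KiB leaf blocks and 32-byte SHA2-256 digests, as described elsewhere in this document; the helper names are hypothetical.

```python
BLOCK_SIZE = 16 * 1024  # bytes covered by one leaf hash
HASH_SIZE = 32          # bytes per SHA2-256 digest


def piece_layer_height(piece_length):
    """Number of layers above the leaf hashes at which one hash covers a piece."""
    if piece_length < BLOCK_SIZE or piece_length & (piece_length - 1):
        raise ValueError("piece length must be a power of two and at least 16KiB")
    return (piece_length // BLOCK_SIZE).bit_length() - 1


def piece_layer_size(file_length, piece_length):
    """Expected byte length of a file's piece layers string.

    Returns 0 for files that fit in a single piece, which have no entry.
    Hashes that only cover data beyond the end of the file are omitted,
    so the count is the number of pieces the file actually spans.
    """
    if file_length <= piece_length:
        return 0
    pieces = -(-file_length // piece_length)  # ceiling division
    return pieces * HASH_SIZE
```

With these definitions a 16KiB piece size gives height 0 (the leaf hashes) and 128KiB gives height 3, matching the examples above.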

    All strings in a .torrent file defined by this BEP that contain human-readable text
    are UTF-8 encoded.

    An example torrent creator implementation can be found here_.

    info dictionary

    name

    A display name for the torrent. It is purely advisory.

    piece length

    The number of bytes that each logical piece in the peer protocol refers to.
    I.e. it sets the granularity of piece, request, bitfield and have
    messages. It must be a power of two and at least 16KiB.

    Files are mapped into this piece address space so that each non-empty file is
    aligned to a piece boundary and occurs in the same order as in the file tree.
    The last piece of each file may be shorter than the specified piece length, resulting
    in an alignment gap.
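
As an illustration of the piece address space, the following sketch assigns piece indices to a list of file lengths given in file tree order. The function name and the tuple layout are hypothetical; it assumes empty files occupy no pieces, since only non-empty files are aligned.

```python
def assign_pieces(file_lengths, piece_length):
    """Return (first_piece, piece_count, last_piece_size) for each file.

    Every non-empty file starts on a piece boundary, so its last piece
    may be shorter than piece_length, leaving an alignment gap.
    """
    if piece_length < 16 * 1024 or piece_length & (piece_length - 1):
        raise ValueError("piece length must be a power of two and at least 16KiB")
    layout = []
    next_piece = 0
    for length in file_lengths:
        count = -(-length // piece_length)  # ceiling division; 0 for empty files
        last_size = length - (count - 1) * piece_length if count else 0
        layout.append((next_piece, count, last_size))
        next_piece += count
    return layout
```

For example, files of 40000, 0 and 16384 bytes with a 16KiB piece length occupy pieces 0 to 2, no pieces, and piece 3 respectively; the first file's last piece holds only 7232 bytes.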

    meta version

    An integer value, set to 2 to indicate compatibility with the current revision of this
    specification. Version 1 is not assigned to avoid confusion with BEP3.
    Future revisions will only increment this value to indicate an incompatible
    change has been made, for example that hash algorithms were changed due to newly discovered
    vulnerabilities. Implementations must check this field first and indicate that a torrent
    is of a newer version than they can handle before performing other validations which may
    result in more general messages about invalid files.

    file tree

    A tree of dictionaries where dictionary keys represent UTF-8 encoded path elements.
    Entries with zero-length keys describe the properties of the composed path at that point.
    'UTF-8 encoded' in this context only means that if the native encoding is known at creation
    time it must be converted to UTF-8.
    Keys may contain invalid UTF-8 sequences or characters and names that are reserved on
    specific filesystems. Implementations must be prepared to sanitize them.
    On most platforms path components exactly matching '.' and '..' must be sanitized
    since they could lead to directory traversal attacks and conflicting path descriptions.
    On platforms that require valid UTF-8 path components this sanitizing step must happen
    after normalizing overlong UTF-8 encodings.

    The file tree root dictionary itself must not be a file, i.e. it must not contain
    a zero-length key with a dictionary containing a length key.

    File tree layout

    Example:

    .. parsed-literal

        {
          info: {
            file tree: {
              dir1: {
                dir2: {
                  fileA.txt: {
                    "": {
                      length: *<length of file in bytes (integer)>*,
                      pieces root: *<optional, merkle tree root (string)>*,
                      ...
                    }
                  },
                  fileB.txt: {
                    "": {
                      ...
                    }
                  }
                },
                dir3: {
                  ...
                }
              }
            }
          }
        }
    

    Bencoded for fileA only

        d4:infod9:file treed4:dir1d4:dir2d9:fileA.txtd0:d5:lengthi1024e11:pieces root32:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaeeeeeee
    
    

    length

    Length of the file in bytes. Presence of this field indicates
    that the dictionary describes a file, not a directory, which means
    it must not have any sibling entries.
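
A decoded file tree can be flattened into a list of paths with a short recursive walk. The sketch below is one possible approach, assuming the tree has already been bdecoded into nested Python dicts with `bytes` keys; `iter_files` is a hypothetical name. It rejects '.' and '..' components outright, which is only one way of meeting the sanitizing requirement.

```python
def iter_files(tree, path=()):
    """Yield (path_components, properties) for every file in a file tree."""
    if b"" in tree:
        # A zero-length key marks this node as a file.
        if not path:
            raise ValueError("the file tree root must not be a file")
        if len(tree) != 1:
            raise ValueError("a file entry must not have sibling entries")
        yield path, tree[b""]
        return
    for name, child in tree.items():
        if name in (b".", b".."):
            # Assumption: reject rather than rewrite unsafe components.
            raise ValueError("unsafe path component %r" % name)
        yield from iter_files(child, path + (name,))
```

Applied to the fileA example above, this yields a single entry whose path is `(b"dir1", b"dir2", b"fileA.txt")`.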

    pieces root

    For non-empty files this is the root hash of a merkle tree
    with a branching factor of 2, constructed from 16KiB blocks of the file.
    The last block may be shorter than 16KiB.
    The remaining leaf hashes beyond the end of the file required
    to construct upper layers of the merkle tree are set to zero.
    As of meta version 2 SHA2-256 is used as digest function for the merkle tree.
    The hash is stored in its binary form, not as human-readable string.

    Note that identical files always result in the same root hash.
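
The construction of the root hash can be sketched as follows. This is a minimal in-memory illustration, assuming the leaf layer is padded with all-zero 32-byte hashes up to the next power of two; `pieces_root` is a hypothetical helper name, and a real implementation would stream the file rather than hold it in memory.

```python
import hashlib

BLOCK_SIZE = 16 * 1024
ZERO_HASH = bytes(32)  # padding leaf for blocks beyond the end of the file


def pieces_root(data):
    """Merkle root (binary SHA2-256 digest) over the 16KiB blocks of data."""
    if not data:
        raise ValueError("pieces root is only defined for non-empty files")
    layer = [hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
             for i in range(0, len(data), BLOCK_SIZE)]
    # Pad the leaf layer so the tree is balanced.
    width = 1
    while width < len(layer):
        width *= 2
    layer += [ZERO_HASH] * (width - len(layer))
    # Hash pairs of nodes until a single root remains.
    while len(layer) > 1:
        layer = [hashlib.sha256(layer[i] + layer[i + 1]).digest()
                 for i in range(0, len(layer), 2)]
    return layer[0]
```

Because the result depends only on the file contents, two files with identical bytes produce the same root regardless of their names or positions in the file tree.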

    Interpreting paths:

    file tree: {name.ext: {"": {length: ...}}}

    a single-file torrent

    file tree: {nameA.ext: {"": {length: ...}}, nameB.ext: {"": {length: ...}}, dir: {...}}

    a rootless multifile torrent, i.e. a list of files and directories without a named common directory containing them.
    Implementations may offer users the option to prepend the torrent name as a root directory to avoid file name collisions.

    file tree: {dir: {nameA.ext: {"": {length: ...}}, nameB.ext: {"": {length: ...}}}}

    multiple files rooted in a single directory

    --------

    infohash

    The infohash is calculated by applying a hash function to the bencoded form of the info dictionary,
    which is a substring of the metainfo file. For meta version 2 SHA2-256 is used.

    The info-hash must be the hash of the encoded form as found
    in the .torrent file, which is identical to bdecoding the metainfo file,
    extracting the info dictionary and encoding it if and only if the
    bdecoder fully validated the input (e.g. key ordering, absence of leading zeros).
    Conversely that means implementations must either reject invalid metainfo files
    or extract the substring directly.
    They must not perform a decode-encode roundtrip on invalid data.

    For some uses as a torrent identifier it is truncated to 20 bytes.

    When verifying an infohash implementations must also check that the piece layers
    hashes outside the info dictionary match the pieces root fields.
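
Extracting the substring directly avoids the decode-encode roundtrip altogether. The sketch below shows one way to do so, under the assumption that the metainfo is held in memory as `bytes`; the names `info_span` and `infohash_v2` are hypothetical. It only skips over values to find the info dictionary's byte range and does not validate key ordering or check the piece layers.

```python
import hashlib


def _skip(data, i):
    """Return the index just past the bencoded value starting at data[i]."""
    c = data[i:i + 1]
    if c == b"i":
        return data.index(b"e", i) + 1
    if c in (b"l", b"d"):
        i += 1
        while data[i:i + 1] != b"e":
            i = _skip(data, i)
        return i + 1
    if c.isdigit():
        colon = data.index(b":", i)
        end = colon + 1 + int(data[i:colon])
        if end > len(data):
            raise ValueError("string runs past end of input")
        return end
    raise ValueError("malformed bencoding at offset %d" % i)


def info_span(metainfo):
    """Locate the raw bytes of the top-level 'info' value as (start, end)."""
    if metainfo[:1] != b"d":
        raise ValueError("metainfo must be a bencoded dictionary")
    i = 1
    while metainfo[i:i + 1] != b"e":
        colon = metainfo.index(b":", i)
        key_end = colon + 1 + int(metainfo[i:colon])
        key = metainfo[colon + 1:key_end]
        value_end = _skip(metainfo, key_end)
        if key == b"info":
            return key_end, value_end
        i = value_end
    raise ValueError("no info dictionary found")


def infohash_v2(metainfo):
    """SHA2-256 over the info dictionary exactly as it appears in the file."""
    start, end = info_span(metainfo)
    return hashlib.sha256(metainfo[start:end]).digest()
```

The 20-byte form used as a torrent identifier would then be `infohash_v2(metainfo)[:20]`.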

    --------

    trackers

    Tracker GET requests have the following keys:

    info_hash

    The 20-byte truncated infohash as described above.
    This value will almost certainly have to be escaped.

    peer_id

    A string of length 20 which this downloader uses as its id. Each
    downloader generates its own id at random at the start of a new
    download. This value will also almost certainly have to be escaped.

    ip

    An optional parameter giving the IP (or DNS name) which this peer is
    at. Generally used for the origin if it's on the same machine as the
    tracker.

    port

    The port number this peer is listening on. Common behavior is for a
    downloader to try to listen on port 6881 and if that port is taken try
    6882, then 6883, etc. and give up after 6889.

    uploaded

    The total amount uploaded so far, encoded in base ten ASCII.

    downloaded

    The total amount downloaded so far, encoded in base ten ASCII.

    left

    The number of bytes this peer still has to download, encoded in
    base ten ASCII. Note that this can't be computed from downloaded and
    the file length since it might be a resume, and there's a chance that
    some of the downloaded data failed an integrity check and had to be
    re-downloaded.

    event

    This is an optional key which maps to started,
    completed, or stopped (or
    empty, which is the same as not being present). If not
    present, this is one of the announcements done at regular
    intervals. An announcement using started is sent when a
    download first begins, and one using completed is sent
    when the download is complete. No completed is sent if
    the file was complete when started. Downloaders send an announcement
    using stopped when they cease downloading.
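
Putting the request keys together, an announce URL might be assembled as in the sketch below. The function name, its parameter names and the tracker URL in the usage note are illustrative assumptions. Percent-encoding is delegated to the standard library's `urllib.parse`, which escapes the raw bytes of `info_hash` and `peer_id`.

```python
from urllib.parse import quote, urlencode


def build_announce_url(announce, info_hash, peer_id, port,
                       uploaded, downloaded, left, event=None, ip=None):
    """Build a tracker GET URL from the keys described above."""
    if len(info_hash) != 20:
        raise ValueError("info_hash must be the 20-byte truncated infohash")
    if len(peer_id) != 20:
        raise ValueError("peer_id must be 20 bytes long")
    if event not in (None, "", "started", "completed", "stopped"):
        raise ValueError("unknown event %r" % event)
    params = {
        "info_hash": info_hash,  # raw bytes, percent-escaped below
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,    # integers are rendered in base ten ASCII
        "downloaded": downloaded,
        "left": left,
    }
    if ip:
        params["ip"] = ip
    if event:
        params["event"] = event  # empty is the same as not being present
    separator = "&" if "?" in announce else "?"
    # quote (not quote_plus) so that a 0x20 byte becomes %20 rather than '+'.
    return announce + separator + urlencode(params, quote_via=quote)
```

A first announce for a new download could then look like `build_announce_url("http://tracker.example/announce", truncated_hash, my_peer_id, 6881, 0, 0, total_size, event="started")`, where the three variables are placeholders supplied by the caller.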

    Tracker responses are bencoded dictionaries. If a tracker response
    has a key failure reason, then that maps to a human
    readable string which explains why the query failed, and no other keys
    are required. Otherwise, it must have two keys: interval,
    which maps to the number of seconds the downloader should wait between
    regular rerequests, and peers. peers maps to
    a list of dictionaries corresponding to peers, each of
    which contains the keys peer id, ip, and
    port, which map to the peer's self-selected ID, IP
    address or DNS name as a string, and port number, respectively. Note
    that downloaders may rerequest on nonscheduled times if an event
    happens or they need more peers.

    More commonly, trackers return a compact representation of
    the peer list; see BEP 23 and BEP 7.

    If you want to make any extensions to metainfo files or tracker
    queries, please coordinate with Bram Cohen to make sure that all
    extensions are done compatibly.

    It is common to announce over a UDP tracker protocol as well.

    -------------

    peer protocol

    BitTorrent's peer protocol operates over TCP or uTP.

    Peer connections are symmetrical. Messages sent in both directions
    look the same, and data can flow in either direction.

    The peer protocol refers to pieces of the file by index as
    described in the metainfo file, starting at zero. When a peer finishes
    downloading a piece and checks that the hash matches, it announces
    that it has that piece to all of its peers.

    Connections contain two bits of state on either end: choked or not,
    and interested or not. Choking is a notification that no data will be
    sent until unchoking happens. The reasoning and common techniques
    behind choking are explained later in this document.

    Data transfer takes place whenever one side is interested and the
    other side is not choking. Interest state must be kept up to date at
    all times - whenever a downloader doesn't have something they
    would currently ask a peer for if unchoked, they must express lack of
    interest, despite being choked. Implementing this properly is tricky,
    but makes it possible for downloaders to know which peers will start
    downloading immediately if unchoked.

    Connections start out choked and not interested.
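
The per-connection state described above fits in four booleans. The following sketch uses hypothetical field and method names and models only the rule for when data may flow; it does not model the messages that change the state.

```python
from dataclasses import dataclass


@dataclass
class ConnectionState:
    """Choke and interest bits for both ends of one peer connection."""
    am_choking: bool = True        # we are choking the peer
    am_interested: bool = False    # we want something the peer has
    peer_choking: bool = True      # the peer is choking us
    peer_interested: bool = False  # the peer wants something we have

    def can_download(self):
        """Data flows towards us when we are interested and not choked."""
        return self.am_interested and not self.peer_choking

    def can_upload(self):
        """Data flows towards the peer when it is interested and we unchoke it."""
        return self.peer_interested and not self.am_choking
```

The defaults reflect that connections start out choked and not interested, so no data flows in either direction on a fresh connection.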

    When data is being transferred, downloaders should keep several
    piece requests queued up at once in order to get good TCP performance
    (this is called 'pipelining'). On the other side, requests which can't
    be written out to the TCP buffer immediately should be queued up in
    memory rather than kept in an application-level network buffer, so
    they can all be thrown out when a choke happens.

    The peer wire protocol consists of a handshake followed by a
    never-ending stream of length-prefixed messages. The handshake starts
    with character nineteen (decimal) followed by the string 'BitTorrent
    protocol'. The leading character is a length prefix, put there in the
    hope that other new protocols may do the same and thus be trivially
    distinguishable from each other.

    All later integers sent in the protocol are encoded as four bytes
    big-endian.

    After the fixed headers come eight reserved bytes, which are all
    zero in all current implementations. If you wish to extend the
    protocol using these bytes, please coordinate with Bram Cohen to make
    sure all extensions are done compatibly.
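
The start of the handshake and the integer encoding can be illustrated with a short sketch. It covers only the fixed header and the reserved bytes described so far; the helper names are hypothetical.

```python
import struct

PROTOCOL = b"BitTorrent protocol"


def handshake_prefix(reserved=bytes(8)):
    """Length-prefixed protocol string followed by the eight reserved bytes."""
    if len(reserved) != 8:
        raise ValueError("exactly eight reserved bytes are required")
    return bytes([len(PROTOCOL)]) + PROTOCOL + reserved


def encode_u32(value):
    """Encode an integer as four bytes big-endian, as used after the handshake."""
    return struct.pack(">I", value)


def decode_u32(data):
    """Decode four big-endian bytes back into an integer."""
    return struct.unpack(">I", data)[0]
```

The prefix is 28 bytes long: one length byte with value 19, the 19-byte protocol string, and eight zero bytes.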

    Next comes the 20-byte truncated infohash. If both sides don't send the same value,
    they sever the connection. The one possible exception is if a downl

    [Content truncated; view full spec at source]
