BitTorrent's UDP Tracker Protocol

Introduction#

To find other peers in a BitTorrent swarm, a client announces itself to a tracker. With an HTTP tracker, the request carries parameters such as info_hash, key, peer_id, port, downloaded, left, uploaded, and compact, and the tracker responds with a list of peers (hosts and ports) along with some other information. Both the request and the response are very short. Because HTTP runs over TCP, a connection has to be opened before the request is sent and closed afterwards, which adds overhead.

System Overhead#

Using HTTP incurs significant overhead at every layer: Ethernet, IP, TCP, and HTTP. A request plus a response containing 50 peers takes about 10 packets and 1206 bytes. A UDP-based protocol reduces this considerably: the protocol described here needs only 4 packets and about 618 bytes, cutting traffic by roughly 50%. Saving 1 KB per hour is not significant for a client, but halving the traffic matters a great deal for a tracker serving millions of peers. In addition, a UDP-based binary protocol needs no complex parser and no connection handling, which reduces the complexity of the tracker code and improves its performance.

UDP Connection / Spoofing Attack#

Ideally, an exchange would need only two packets. However, UDP is connectionless, so the source address of a packet can be spoofed. The tracker therefore has to verify that a request really comes from the address it claims to come from.

When a client sends a request to the tracker, the tracker generates a random number (connection_id) and sends it back to the client. When the client sends another request to the tracker, it must include this connection_id so that the tracker can verify the legitimacy of the request's source. If the client spoofed the source address, it would not receive the connection_id sent back by the tracker, and thus the request would be rejected by the tracker.

The connection_id must not be guessable by the client. The exchange is comparable to a TCP handshake, and a SYN-cookie-like approach can be used so that the tracker does not have to store the connection IDs it has issued. A connection_id can be used for multiple requests. The client may use it until one minute after receiving it, and the tracker should accept it until two minutes after issuing it.
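One way to picture the SYN-cookie-like approach is the sketch below. It is only an illustration under stated assumptions: the names `SECRET`, `WINDOW`, `make_connection_id`, and `is_valid_connection_id` are hypothetical, and the choice of HMAC-SHA256 over the client IP and a 60-second time bucket is not taken from the protocol description. The tracker recomputes the ID instead of storing it.

```python
import hashlib
import hmac
import os
import struct

SECRET = os.urandom(32)  # hypothetical per-process tracker secret
WINDOW = 60              # seconds per time bucket (an assumption)


def make_connection_id(ip: str, now: float, secret: bytes = SECRET) -> int:
    """Derive a 64-bit connection ID from the client IP and a time bucket."""
    bucket = int(now) // WINDOW
    msg = f"{ip}|{bucket}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return struct.unpack("!Q", digest[:8])[0]


def is_valid_connection_id(conn_id: int, ip: str, now: float,
                           secret: bytes = SECRET) -> bool:
    """Accept IDs issued in the current or the previous time bucket."""
    received = struct.pack("!Q", conn_id)
    for age in (0, 1):
        expected = make_connection_id(ip, now - age * WINDOW, secret)
        if hmac.compare_digest(struct.pack("!Q", expected), received):
            return True
    return False
```

With these parameters an ID stays valid for between one and two minutes after it was issued, which covers the one minute during which the client is allowed to use it.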

Timeout#

UDP is a protocol that does not guarantee reliable packet delivery, meaning that if packets are lost during transmission, UDP will not automatically retransmit them. Instead, applications using UDP need to handle lost packets and retransmit them as necessary.

To handle lost packets, the client waits for the tracker's response after sending a request. If no response arrives within 15 * 2^n seconds, the client retransmits the request, where n starts at 0 and is increased by one after every retransmission, up to 8. The wait therefore starts at 15 seconds and doubles each time, up to a maximum of 15 * 2^8 = 3840 seconds.

It is important to note that if the connection ID has expired, a new connection ID must be requested before resending the request. This ensures that the request will be correctly authenticated and processed by the server.
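The retransmission schedule can be sketched in a few lines. The function names `retransmit_delay` and `send_times` are hypothetical, and the sketch only computes times; it does not send anything or track connection ID expiry.

```python
from typing import List


def retransmit_delay(n: int) -> int:
    """Seconds to wait for a response after the n-th transmission (n from 0)."""
    return 15 * 2 ** min(n, 8)


def send_times(attempts: int, start: int = 0) -> List[int]:
    """Times at which a request is sent if no response ever arrives."""
    times = []
    t = start
    for n in range(attempts):
        times.append(t)
        t += retransmit_delay(n)
    return times


print(send_times(4))        # [0, 15, 45, 105]
print(retransmit_delay(8))  # 3840
```

The first call reproduces the send times of an unanswered connect request: 0, 15, 45, and 105 seconds.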

Example#

Normal announce:

t = 0: connect request
t = 1: connect response
t = 2: announce request
t = 3: announce response

Connection timeout:

t = 0: connect request
t = 15: connect request
t = 45: connect request
t = 105: connect request
etc

Announce timeout:

t = 0: connect request
t = 1: connect response
t = 2: announce request
t = 17: announce request
t = 47: announce request
t = 107: connect request (because connection ID expired)
t = 227: connect request
etc

Multiple requests:

t = 0: connect request
t = 1: connect response
t = 2: announce request
t = 3: announce response
t = 4: announce request
t = 5: announce response
t = 60: announce request
t = 61: announce response
t = 62: connect request
t = 63: connect response
t = 64: announce request
t = 64: scrape request
t = 64: scrape request
t = 64: announce request
t = 65: announce response
t = 66: announce response
t = 67: scrape response
t = 68: scrape response

UDP Tracker Protocol#

All values are sent in network byte order (big-endian). Do not expect packets to have exactly a certain size, because future extensions may increase the size of packets.

Connection#

Before announcing or scraping, you have to obtain a connection ID.

Choose a random transaction ID.
Fill the connect request structure.
Send the packet.

Connect request:

Offset  Size            Name            Value
0       64-bit integer  protocol_id     0x41727101980 // magic constant
8       32-bit integer  action          0 // connect
12      32-bit integer  transaction_id
16
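As a minimal sketch, the connect request can be packed with Python's `struct` module. The helper names `random_transaction_id` and `build_connect_request` are assumptions made for illustration; the field layout and the magic constant come from the table above.

```python
import os
import struct

PROTOCOL_ID = 0x41727101980  # magic constant from the table above
ACTION_CONNECT = 0


def random_transaction_id() -> int:
    """Return a random unsigned 32-bit transaction ID."""
    return struct.unpack("!I", os.urandom(4))[0]


def build_connect_request(transaction_id: int) -> bytes:
    """Pack protocol_id (64 bit), action (32 bit), transaction_id (32 bit)."""
    return struct.pack("!QII", PROTOCOL_ID, ACTION_CONNECT, transaction_id)


packet = build_connect_request(random_transaction_id())
print(len(packet))  # 16
```

The `!` prefix selects network byte order without padding, so the packet is exactly 16 bytes long.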

Receive the packet.
Check if its length is at least 16 bytes.
Check if the Transaction ID in the received packet matches the ID chosen when the request was sent.
Check if the action in the packet is "connect".
Store the Connection ID from the packet locally for subsequent operations.

Connect response:

Offset  Size            Name            Value
0       32-bit integer  action          0 // connect
4       32-bit integer  transaction_id
8       64-bit integer  connection_id
16
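The checks listed for the connect response could be implemented roughly as follows. The function name `parse_connect_response` and the use of `ValueError` for every failure are illustrative choices, not part of the protocol.

```python
import struct

ACTION_CONNECT = 0


def parse_connect_response(packet: bytes, expected_transaction_id: int) -> int:
    """Validate a connect response and return its connection ID."""
    if len(packet) < 16:
        raise ValueError("connect response is shorter than 16 bytes")
    # unpack_from reads only the first 16 bytes, so longer packets are accepted.
    action, transaction_id, connection_id = struct.unpack_from("!IIQ", packet)
    if transaction_id != expected_transaction_id:
        raise ValueError("transaction ID mismatch")
    if action != ACTION_CONNECT:
        raise ValueError(f"unexpected action {action}")
    return connection_id
```

Using `unpack_from` rather than `unpack` keeps the parser tolerant of packets that grow in future extensions.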

Announce#

Choose a random transaction ID.
Fill in an announce request structure.
Send the packet.

IPv4 announce request:

Offset  Size            Name            Value
0       64-bit integer  connection_id
8       32-bit integer  action          1 // announce
12      32-bit integer  transaction_id
16      20-byte string  info_hash
36      20-byte string  peer_id
56      64-bit integer  downloaded
64      64-bit integer  left
72      64-bit integer  uploaded
80      32-bit integer  event           0 // 0: none; 1: completed; 2: started; 3: stopped
84      32-bit integer  IP address      0 // default
88      32-bit integer  key
92      32-bit integer  num_want        -1 // default
96      16-bit integer  port
98
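The 98-byte announce request maps directly onto a single `struct` format string. This is a sketch under assumptions: the function name `build_announce_request`, its parameter names, and the `EVENT_*` constants are hypothetical, while the field order, sizes, and default values follow the table above.

```python
import struct

ACTION_ANNOUNCE = 1
EVENT_NONE, EVENT_COMPLETED, EVENT_STARTED, EVENT_STOPPED = 0, 1, 2, 3


def build_announce_request(connection_id: int, transaction_id: int,
                           info_hash: bytes, peer_id: bytes,
                           downloaded: int, left: int, uploaded: int,
                           port: int, event: int = EVENT_NONE,
                           ip: int = 0, key: int = 0,
                           num_want: int = -1) -> bytes:
    """Pack an IPv4 announce request (98 bytes, network byte order)."""
    # struct would silently pad or truncate "20s" fields, so check lengths first.
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be exactly 20 bytes")
    return struct.pack(
        "!QII20s20sQQQIIIiH",
        connection_id, ACTION_ANNOUNCE, transaction_id,
        info_hash, peer_id,
        downloaded, left, uploaded,
        event, ip, key,
        num_want,  # signed, because the default is -1
        port,
    )
```

For example, `build_announce_request(conn_id, tid, info_hash, peer_id, 0, total_size, 0, 6881, event=EVENT_STARTED)` would announce the start of a download, where all argument values are placeholders.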

Receive the packet.
Check if the length of the packet is at least 20 bytes.
Check if the transaction ID in the packet matches the previously chosen transaction ID to ensure that the response received is for the previously sent announce request.
Check if the action in the packet is "announce".
Do not announce again until interval seconds have passed or an event has occurred.

Note that most trackers honor the IP address field of the request only under limited circumstances.

IPv4 announce response:

Offset      Size            Name            Value
0           32-bit integer  action          1 // announce
4           32-bit integer  transaction_id
8           32-bit integer  interval
12          32-bit integer  leechers
16          32-bit integer  seeders
20 + 6 * n  32-bit integer  IP address
24 + 6 * n  16-bit integer  TCP port
20 + 6 * N
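A possible parser for this layout is sketched below: a 20-byte header followed by 6-byte <IP address, TCP port> pairs. The function name `parse_announce_response` and the returned tuple shape are assumptions for illustration.

```python
import socket
import struct
from typing import List, Tuple

ACTION_ANNOUNCE = 1


def parse_announce_response(packet: bytes, expected_transaction_id: int
                            ) -> Tuple[int, int, int, List[Tuple[str, int]]]:
    """Return (interval, leechers, seeders, peers) from an IPv4 response."""
    if len(packet) < 20:
        raise ValueError("announce response is shorter than 20 bytes")
    action, transaction_id, interval, leechers, seeders = struct.unpack_from(
        "!IIIII", packet)
    if transaction_id != expected_transaction_id:
        raise ValueError("transaction ID mismatch")
    if action != ACTION_ANNOUNCE:
        raise ValueError(f"unexpected action {action}")
    peers = []
    # Each peer occupies 6 bytes; an incomplete trailing entry is ignored.
    for offset in range(20, len(packet) - 5, 6):
        ip_raw, port = struct.unpack_from("!4sH", packet, offset)
        peers.append((socket.inet_ntoa(ip_raw), port))
    return interval, leechers, seeders, peers
```

The number of peers N is not sent explicitly; it follows from the packet length as (length - 20) / 6.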

IPv6#

IPv6 announces have the same structure as IPv4 announces, including the action number, except that the stride of the <IP address, TCP port> pairs in the response grows from 6 bytes to 18 bytes. The IP address field in the request remains 32 bits wide, so it cannot be used for IPv6 and should always be set to 0.

Which format is used is determined by the address family of the underlying UDP packet: packets from an IPv4 address use the IPv4 format, and packets from an IPv6 address use the IPv6 format. Clients that resolve a hostname to both IPv4 and IPv6 addresses and announce to both should use the same key for both announces, so that trackers that care about accurate statistics can match the two.
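The change in stride can be illustrated with a peer-list parser that handles both address families. The function name `parse_peers` and its `ipv6` flag are hypothetical; in a real client the flag would presumably be derived from the address family of the socket used to reach the tracker.

```python
import ipaddress
import struct
from typing import List, Tuple


def parse_peers(data: bytes, ipv6: bool) -> List[Tuple[str, int]]:
    """Parse <IP address, TCP port> pairs: 6-byte stride for IPv4, 18 for IPv6."""
    addr_len = 16 if ipv6 else 4
    stride = addr_len + 2
    peers = []
    for offset in range(0, len(data) - stride + 1, stride):
        raw = data[offset:offset + addr_len]
        (port,) = struct.unpack_from("!H", data, offset + addr_len)
        if ipv6:
            address = ipaddress.IPv6Address(raw)
        else:
            address = ipaddress.IPv4Address(raw)
        peers.append((str(address), port))
    return peers
```

Here `data` is the part of the announce response that follows the 20-byte header.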

Scrape#

Up to about 74 torrents can be scraped at once. A full scrape cannot be done with this protocol.

Randomly select a Transaction ID.
Fill in the scrape request structure.
Send the packet.

Scrape request:

Offset          Size            Name            Value
0               64-bit integer  connection_id
8               32-bit integer  action          2 // scrape
12              32-bit integer  transaction_id
16 + 20 * n     20-byte string  info_hash
16 + 20 * N
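Building a scrape request amounts to a 16-byte header followed by the concatenated info hashes. In this sketch the name `build_scrape_request` is hypothetical, and `MAX_SCRAPE = 74` simply encodes the "about 74" limit mentioned above as a hard cap, which is an assumption.

```python
import struct
from typing import Sequence

ACTION_SCRAPE = 2
MAX_SCRAPE = 74  # "about 74" torrents per request, treated here as a hard cap


def build_scrape_request(connection_id: int, transaction_id: int,
                         info_hashes: Sequence[bytes]) -> bytes:
    """Pack the header followed by N 20-byte info hashes."""
    if not 1 <= len(info_hashes) <= MAX_SCRAPE:
        raise ValueError(f"expected between 1 and {MAX_SCRAPE} info hashes")
    if any(len(h) != 20 for h in info_hashes):
        raise ValueError("every info_hash must be exactly 20 bytes")
    header = struct.pack("!QII", connection_id, ACTION_SCRAPE, transaction_id)
    return header + b"".join(info_hashes)
```

The resulting packet is 16 + 20 * N bytes long, matching the last row of the table.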

Receive the packet.
Check if the length of the packet is at least 8 bytes.
Check if the Transaction ID in the packet matches the one you previously selected.
Check if the action in the packet is "scrape".

Scrape response:

Offset      Size            Name            Value
0           32-bit integer  action          2 // scrape
4           32-bit integer  transaction_id
8 + 12 * n  32-bit integer  seeders
12 + 12 * n 32-bit integer  completed
16 + 12 * n 32-bit integer  leechers
8 + 12 * N
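The scrape response can be parsed in the same style as the other responses. The function name `parse_scrape_response` and the tuple order of the result are illustrative assumptions; the entries come back in the order of the info hashes in the request.

```python
import struct
from typing import List, Tuple

ACTION_SCRAPE = 2


def parse_scrape_response(packet: bytes, expected_transaction_id: int
                          ) -> List[Tuple[int, int, int]]:
    """Return one (seeders, completed, leechers) tuple per scraped torrent."""
    if len(packet) < 8:
        raise ValueError("scrape response is shorter than 8 bytes")
    action, transaction_id = struct.unpack_from("!II", packet)
    if transaction_id != expected_transaction_id:
        raise ValueError("transaction ID mismatch")
    if action != ACTION_SCRAPE:
        raise ValueError(f"unexpected action {action}")
    stats = []
    # Each torrent occupies 12 bytes; an incomplete trailing entry is ignored.
    for offset in range(8, len(packet) - 11, 12):
        stats.append(struct.unpack_from("!III", packet, offset))
    return stats
```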

Error#

If the tracker encounters an error, it may send an error packet.

Receive the packet.
Check if the length of the packet is at least 8 bytes.
Check if the Transaction ID in the packet matches the one you previously selected.

Error response:

Offset  Size            Name            Value
0       32-bit integer  action          3 // error
4       32-bit integer  transaction_id
8       string          message
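An error packet can arrive in place of any expected response, so a client would typically check for action 3 before parsing. The sketch below uses the hypothetical name `parse_error_response` and assumes that the message is text that can be decoded as UTF-8, which the table does not specify; undecodable bytes are replaced rather than rejected.

```python
import struct

ACTION_ERROR = 3


def parse_error_response(packet: bytes, expected_transaction_id: int) -> str:
    """Return the human-readable message carried by an error response."""
    if len(packet) < 8:
        raise ValueError("error response is shorter than 8 bytes")
    action, transaction_id = struct.unpack_from("!II", packet)
    if transaction_id != expected_transaction_id:
        raise ValueError("transaction ID mismatch")
    if action != ACTION_ERROR:
        raise ValueError(f"unexpected action {action}")
    # The message fills the rest of the packet; its encoding is an assumption.
    return packet[8:].decode("utf-8", errors="replace")
```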

Existing Implementations#

IMFile, Azureus, libtorrent, opentracker, XBT Client, and XBT Tracker support this protocol.

Extensions#

The protocol includes no extension bits and no version field. Instead, clients and trackers should not assume that packets have a certain size, so that additional fields can be appended later without breaking compatibility.

Summary#

The UDP Tracker Protocol is the protocol a BitTorrent client uses to communicate with a tracker over UDP. Compared with the HTTP tracker protocol, it needs fewer packets and fewer bytes per exchange, which makes it more efficient and lets a tracker scale to more peers.

Through the UDP Tracker Protocol, a client announces itself to the tracker and obtains a list of other peers in the swarm, from which it can then download pieces. In addition, a scrape request returns statistics about specific torrents: the number of seeders, the number of leechers, and the number of completed downloads.

Despite these advantages, UDP is connectionless and does not guarantee delivery, so requests or responses are occasionally lost and have to be retransmitted. Many BitTorrent clients also support the HTTP tracker protocol and use both to improve reliability.