Building A Better BitTorrent Client In Go

When it comes to peer-to-peer file sharing protocols, BitTorrent is probably one of the best known. It requires a client implementing the program and a tracker to list files available to transfer and to find peer users to transfer those files. Developed in 2001, BitTorrent has since acquired more than a quarter billion users according to some estimates.

While most users choose to use existing clients, [Jesse Li] wanted to build one from scratch in Go, a programming language commonly used for its built-in concurrency features and simplicity compared to C.

Client-server versus peer-to-peer architecture
Client-server versus peer-to-peer architecture

The first step for a client is finding peers to download files from. Trackers, web servers running over HTTP, serve as centralized locations for introducing peers to one another. Due to the centralization, the servers are at risk of being discovered and shut down if they facilitate illegal content exchange. Thus, making peer discovery a distributed process is a necessity for preventing trackers from following in the footsteps of the now-defunct TorrentSpy, Popcorn Time, and KickassTorrents.

The client starts off by reading a .torrent file, which describes the contents of the desired file and how to connect to a tracker. The information in the file includes the URL of the tracker, the creation time, and SHA-1 hashes of each piece, or a chunk of the file. One file can be made up of thousands of pieces – the client will need to download the pieces from peers, check the hashes against the torrent file, and finally assemble the pieces together to finally retrieve the file. For the implementation, [Jesse] chose to keep the structures in the Go program reasonably flat, separating application structs from serialization structs. Pieces are also separated into slices of hashes to more easily access individual hashes.

The bitfield explained as a coffee shop loyalty card.
The bitfield explained as a coffee shop loyalty card.

Next, a GET request to an `announce` URL in the torrent file announces the presence of the client to peers and retrieves a response from the tracker with the list of peers. To start downloading pieces, the client starts a TCP connection with a peer, completes a two-way BitTorrent handshake, and exchanges messages to download pieces.

One interesting data structure exchanged in the messages is a bitfield, which acts as a byte array that checks which pieces a peer has. Bits are flipped when their respective piece’s status changes, acting somewhat like a loyalty card with stamps.

While talking to one peer may be straightforward, managing the concurrency of talking to multiple peers at once requires solving a classically Hard problem. [Jesse] implements this in Go by using channels as thread-safe queues, setting up two channels to assign work and collect downloaded pieces. Requests are later pipelined to increase throughput since network round-trips are expensive and sending blocks individually inefficient.

The full implementation is available on GitHub, and is easy enough to use as an alternative client or as a walkthrough if you’d prefer to build your own.

Should You Get A Seedbox For Your Bittorrent Needs?


Torrentfreak offers up a few reasons why you should get a seedbox if you’re a bittorrent user who likes to share a lot of files. A seedbox is a dedicated private server used exclusively for torrent transfers. [sharky] discusses a few pros and makes a few claims that we think might be a little overblown. Although the seedbox will speed up your downloads and allow you to bypass ISP limits on your bandwith, we’re a little leery of the claims that the seedbox is completely safe and secure, or that it’ll protect you from getting sued by the RIAA or MPAA. As pointed out in the comments, paying for a dedicated hosting service and paying for cable is no different. Of course, the seedbox also costs money, so you’ll have to weigh whether you’d rather have speed or risk getting throttled by your ISP. Torrentfreak does list a few hosting solutions that may be reasonably priced.

[photo: nrkbeta]