David Bakin’s programming blog.

A Readable Production Quality Bitcoin Service

Introducing an ongoing project to reimplement Bitcoin Core as an educational resource, providing:

  • A code base that is easy to read: so the architecture, design, and code of each element that makes up a working Bitcoin node can be understood separately and together,
  • Blockchain data (blocks, transactions, script, etc.) that is as transparent as possible to the human reader (and easy to work with using separate tools),
  • Explicit incorporation of the various BIPs that have changed/improved/grown Bitcoin over the years, showing how they’re different from what came before, and how they interact with the existing Bitcoin system (and each other),
  • A showcase for the architecture of such a production system,
  • Alternative implementations of the key components, so that design choices can be compared and evaluated, and
  • A working Bitcoin node that can be a testbed for different clients, and for different ways of looking at Bitcoin’s data, and a breadboard for experimenting with alternate policies of how to run Bitcoin.

The state of Bitcoin Core

The first, the canonical, the reference, and the must-use-for-miners implementation of Bitcoin is Bitcoin Core (see also bitcoin.org) (source repository).

This codebase was first shown to the world, by Satoshi Nakamoto, in November 2008, when he sent a few of the files to some crypto-savvy engineers for their evaluation. Two months later he sent out a complete working first release, labeled 0.1.0, with instructions on how to bring it up as a node on a fledgling network, and start mining coins.

As a reward for supporting the network, you receive coins when you successfully generate a block.

This, he warned, “may take days, or months, depending on the speed of your computer and the competition on the network”.

That first release was already recognizably bitcoin:

  • a blockchain,
  • made of blocks, which were
  • made of transactions,
  • those transactions described by a small machine language (“Script”),
  • ability to make valid transactions,
  • keeping a wallet,
  • generating valid blocks from pending transactions,
  • validating the blocks and the blockchain,
  • and the peer-to-peer network to tie all the nodes together.
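
The nesting in the list above (a chain of blocks, each holding transactions, each carrying a script) can be sketched in a few lines of C++. This is only an illustration: the names `Transaction`, `Block`, `toy_hash`, and `chain_is_linked` are hypothetical and come from neither Bitcoin Core nor this project, and the hash is a toy FNV-1a stand-in rather than Bitcoin's double SHA-256.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, heavily simplified types; not Bitcoin Core's actual classes.
struct Transaction {
    std::string script;  // stand-in for the Script program(s) of a real transaction
};

struct Block {
    std::uint64_t prev_hash = 0;            // hash of the preceding block (0 for the first)
    std::vector<Transaction> transactions;  // the transactions this block contains
};

// Toy FNV-1a hash over the block's contents. ASSUMPTION: a placeholder only.
// Real Bitcoin hashes a serialized block header with double SHA-256.
std::uint64_t toy_hash(const Block& b) {
    std::uint64_t h = 14695981039346656037ULL;
    auto mix = [&h](unsigned char byte) {
        h ^= byte;
        h *= 1099511628211ULL;
    };
    for (int i = 0; i < 8; ++i) {
        mix(static_cast<unsigned char>(b.prev_hash >> (8 * i)));
    }
    for (const auto& tx : b.transactions) {
        for (unsigned char c : tx.script) mix(c);
        mix(0xFF);  // separator between transactions
    }
    return h;
}

// A chain is "linked" if every block names the hash of the block before it.
bool chain_is_linked(const std::vector<Block>& chain) {
    for (std::size_t i = 1; i < chain.size(); ++i) {
        if (chain[i].prev_hash != toy_hash(chain[i - 1])) return false;
    }
    return true;
}
```

In this sketch, changing any transaction in an earlier block changes that block's hash, so the `prev_hash` stored in the following block no longer matches and `chain_is_linked` returns false. That is the basic property that makes the structure a chain rather than a list.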

The 0.1.3 release came a few days later. (Source code for these releases is all at the [Satoshi Nakamoto Institute](https://satoshi.nakamotoinstitute.org/code/) site.)

Since then, this C++ foundation has been modified to account for a changing understanding of how bitcoin should work, greatly expanded with features, and, of course, patched with bug fixes. The source repository - which starts at version 0.1.5 - now has a history of over 30,000 commits and shows code growth of more than an order of magnitude:

| Bitcoin Core version | Files | Lines of code | LOC outside of headers |
|---|---|---|---|
| 0.1.3 (2009, first release) | 26 | 20000 | 12000 |
| 0.21.1 (current) | 1080 | 254000 | 175000 |
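
As a rough check on those growth figures, each pair of numbers in the table can be turned into orders of magnitude by taking the base-10 logarithm of the ratio. A minimal sketch follows; the helper name `orders_of_magnitude` is made up for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Growth between two measurements, expressed in orders of magnitude,
// i.e. log10 of the ratio after/before. Both inputs must be positive.
double orders_of_magnitude(double before, double after) {
    assert(before > 0.0 && after > 0.0);
    return std::log10(after / before);
}
```

With the table's figures, `orders_of_magnitude(20000, 254000)` comes to about 1.1 for lines of code, `orders_of_magnitude(12000, 175000)` to about 1.2 for code outside of headers, and `orders_of_magnitude(26, 1080)` to about 1.6 for the file count.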

12 years of work. 30000 commits by a varying team of engineers. Continuously in production yet continuously growing, and continuously evolving. This is a legacy codebase.

And like all legacy codebases, no matter the skill and care of the engineers working on it (and these engineers have been and are skilled and careful), it shows its age in … rough spots. Rough spots of code, rough spots of design, even rough spots of architecture. Of course it has areas which could do with some attention/rework/rewriting … if only the developers didn’t have more urgent things to do (usually new features) and if only it weren’t continuously in production, producing an ever-growing, immutable-once-created database storing and tracking over $600 billion in value, such that any regression released to production will lead to the immediate, painful, and embarrassing question: Why on earth did you change that code? It wasn’t necessary to touch it, it was working fine!

And, also, like all legacy codebases, no matter the skill and care of the engineers working on it, there are areas which the engineers wouldn’t touch even if they had the time because … it would be too dangerous. Too much complexity and too many interactions with other parts of the codebase - or the bitcoin system as a whole - to understand what a code change would do.

(Legacy and difficult to work with: yes. But by no means terrible. The code is working fine, the rough spots aren’t that bad, and the engineers are very good. Features can be added and bugs fixed. But, still, things are tricky in there.)

Blocksifter: A reimplementation

There are other reimplementations of Bitcoin that are available, open, and working. (And probably several that are working, but not open.) Two that I know of are:

  • bcoin - “Enterprise-level Bitcoin and Blockchain libraries. Built for businesses, miners, wallets, and hobbyists” - JavaScript
  • BTCSuite - “A suite of packages and tools for working with Bitcoin in Go (golang) including btcd, a full node, mining capable, Bitcoin implementation” - Go

There are others, and there are many libraries (for various languages) that provide some or nearly all Bitcoin node functionality.

This project is not to replace them. It’s for educational purposes: to make the Bitcoin node an easier system to understand - and for testbed/breadboard uses.

And it will be in modern C++. Ultramodern, actually: C++17 with C++20 features, as compiler support for C++20 is now becoming widespread (and reliable).

In addition to the goals listed above it will also be “production-ready”, including features and coding styles necessary for production systems:

  • modularity,
  • reliability,
  • troubleshooting aids,
  • performance metrics,
  • API versioning,
  • etc.

Coming soon: the repository will move off of my local machines and into a public repository. The last thing I need to do before that is decide on a license (MIT, 3-clause BSD, or maybe LGPLv3).