A high-level view of low-level code

What is Parity Substrate?

If you’ve followed any of Polkadot’s development, you will probably have seen “Substrate” mentioned many times. It’s an important component of the Polkadot project but information on it is very thin on the ground. It’s not in the whitepaper or the yellow paper - or at least, not under the name “Substrate” - and the specification for it is still in heavy flux. At a high level, it’s a framework for creating cryptocurrencies and other decentralised systems using the latest research in blockchain technology. That’s not very helpful, though. At least, it’s not very helpful for me.

I think the most important part of understanding Parity Substrate is that it is not part of Polkadot at all. Although Polkadot is built with Substrate and projects built with Substrate can run natively on Polkadot, you can use Substrate to build new blockchains right now. You don’t need to wait for Polkadot to be finished or even for a proof-of-concept to be released to start working on a blockchain using this framework.

So what is Substrate? You can think of it as being like Express or another web application framework, but for building distributed or decentralised systems such as cryptocurrencies or a message bus. Just as most web applications shouldn’t need to reimplement their own version of HTTP, we believe that it’s wasted effort for every team creating a new blockchain to have to implement all the networking and consensus code from scratch. That’s before you count the cryptographers, security researchers, networking engineers, devops personnel (to coordinate updates) and so on who would need to be hired and paid, when really your business logic is your product. If you want to build a new project using Substrate, all you have to do is implement a very small number of hooks in your code and then for free you get:

So what don’t you get for free? Essentially it’s just your state machine, which includes things like transactions. To make Substrate as generic as possible, it has no transactions. Instead, it has what we call “extrinsics”, which are just binary blobs that you can use to store any data that you want. For most chains these extrinsics will include transactions, but of course you don’t need to implement them that way! You could remove the concept of currency from the network entirely and use Substrate to create a decentralised Erlang-style actor-model concurrent system with a set of trusted authorities to verify the correct behaviour of the network. Assuming you do want currency and transactions, however, implementing the transaction format will likely be trivial - just an interchange format and a library to access that data from your chosen language. It’s even easier than other distributed architectures like microservices - since the code and the data it operates on are stored in the same place, you don’t need to enforce backwards-compatibility guarantees for transactions[1], just for storage. For chains with private transactions the implementation may be more complex. The names of everything are not finalised and so you’ll see different language used in different places, but here’s a simple explanation of what you’d need to implement in order to get a full blockchain up and running:

One downside of this design is that you have to manually make sure that the state transitions applied while creating a block and the state transitions applied while executing an existing block are kept in sync. If they diverge, nodes will disagree about the resulting state and you could get consensus issues! This may change in the future, but for now this shouldn’t be much of a problem in practice, as you will likely delegate the execution of extrinsics to a common function.
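To make the “common function” idea concrete, here is a minimal sketch in Rust. Nothing in it is Substrate’s actual API: the `State` map, the three-byte transfer format and the names `apply_extrinsic`, `build_block` and `execute_block` are all invented for illustration. The only point is that both the authoring path and the import path go through the same function, so they cannot drift apart.

```rust
use std::collections::BTreeMap;

/// An extrinsic is an opaque blob; only the chain's own logic can decode it.
type Extrinsic = Vec<u8>;

/// Hypothetical chain state: account id -> balance.
type State = BTreeMap<u8, u64>;

/// The single place where an extrinsic changes state.
/// Assumed toy format: exactly three bytes, [from, to, amount].
/// On error the state is left untouched.
fn apply_extrinsic(state: &mut State, xt: &Extrinsic) -> Result<(), &'static str> {
    let (from, to, amount) = match xt.as_slice() {
        [from, to, amount] => (*from, *to, u64::from(*amount)),
        _ => return Err("malformed extrinsic"),
    };
    let from_balance = state.get(&from).copied().unwrap_or(0);
    if from_balance < amount {
        return Err("insufficient balance");
    }
    state.insert(from, from_balance - amount);
    *state.entry(to).or_insert(0) += amount;
    Ok(())
}

/// Block creation: include every pending extrinsic that applies cleanly
/// and silently skip the ones that do not.
fn build_block(state: &mut State, pending: Vec<Extrinsic>) -> Vec<Extrinsic> {
    let mut included = Vec::new();
    for xt in pending {
        if apply_extrinsic(state, &xt).is_ok() {
            included.push(xt);
        }
    }
    included
}

/// Block execution: replay every extrinsic through the same function.
/// A single failure rejects the whole block and leaves the state as it was.
fn execute_block(state: &mut State, block: &[Extrinsic]) -> Result<(), &'static str> {
    let mut next = state.clone();
    for xt in block {
        apply_extrinsic(&mut next, xt)?;
    }
    *state = next;
    Ok(())
}
```

If an author starts from the same state as an importing node, the block returned by `build_block` should replay through `execute_block` without error and leave both nodes with identical state.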

Additionally, you need to provide a validator set. This covers both proof-of-authority and proof-of-stake/delegated proof-of-stake chains, although we have no intention to support proof-of-work chains in Substrate as of now. The validator set is a list of public keys whose corresponding private keys should be considered valid to sign a given block. The set can change, but each block is validated by the set that was chosen at the time of the block’s creation. You don’t have to solve the difficult problem of collecting the validators’ votes or even their “vouching” for individual blocks; Substrate handles that automatically. The validator set can be as large as you like, but there’s a tradeoff to be made here. The fewer validators you have, the easier it would be for them to collude, but the more validators you have, the more of them will be needed to validate any given block before it is considered “finalised” (i.e. unrevertable)[2].
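The tradeoff is easier to see with numbers. The sketch below assumes a rule that is common in Byzantine-fault-tolerant protocols, namely that strictly more than two thirds of the validator set must sign a block before it counts as finalised. That threshold is my assumption for illustration, not something this article specifies for Substrate, and the four-byte `PublicKey` type and both function names are made up. Signature verification itself is assumed to have happened already.

```rust
use std::collections::BTreeSet;

/// Stand-in for a validator's public key (hypothetical; real keys are larger).
type PublicKey = [u8; 4];

/// Assumed rule: strictly more than two thirds of the set must sign.
fn signatures_needed(set_size: usize) -> usize {
    set_size * 2 / 3 + 1
}

/// Returns true once enough distinct members of the validator set have signed.
/// Keys outside the set are ignored and duplicates are counted once.
fn is_finalised(validators: &BTreeSet<PublicKey>, signers: &[PublicKey]) -> bool {
    let distinct: BTreeSet<&PublicKey> = signers
        .iter()
        .filter(|key| validators.contains(*key))
        .collect();
    distinct.len() >= signatures_needed(validators.len())
}
```

Under this assumed rule a set of 4 validators needs 3 signatures, while a set of 100 needs 67: harder to collude against, but more parties have to respond before a block is final.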

We can’t have Substrate automatically handle proof-of-stake for you, since proof-of-stake relies on your project including value-bearing tokens and not all projects will. Testnets may deliberately have tokens without value, and projects using Substrate to implement a message bus may not have tokens at all. However, it would be easy to write a library on top of Substrate that enforces the use of tokens and gives you transactions and proof-of-stake consensus automatically[3]. In general, it’s relatively easy to build higher-level libraries on top of Substrate. Even though you get a lot for free when building a new blockchain with Substrate, it’s still a relatively minimal set of primitives and it’s not really intended to be used directly. Instead, it should be taken as a building block, with other common functionality factored into helper libraries. Although details haven’t been confirmed yet, Polkadot is not the only chain slated to be built on Substrate; as the platform matures, more libraries can be built to make building new chains as easy as writing a modern web app.
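As a rough idea of how small such a library’s core could be, here is a sketch in Rust of the scheme described in footnote 3: pick the accounts with the largest stake, and remove the stake of anyone proven to have misbehaved. The `AccountId` type, the function names and the tie-breaking rule are assumptions of mine, and none of this is an existing Substrate interface.

```rust
use std::collections::BTreeMap;

/// Hypothetical account identifier.
type AccountId = u32;

/// Pick up to `max_validators` accounts, largest stake first.
/// Ties are broken by account id so that every node picks the same set
/// (an assumption; the footnote does not say how ties are resolved).
fn select_validators(stakes: &BTreeMap<AccountId, u64>, max_validators: usize) -> Vec<AccountId> {
    let mut ranked: Vec<(AccountId, u64)> = stakes
        .iter()
        .filter(|(_, stake)| **stake > 0) // accounts without stake cannot validate
        .map(|(account, stake)| (*account, *stake))
        .collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked
        .into_iter()
        .take(max_validators)
        .map(|(account, _)| account)
        .collect()
}

/// Called once a proof of misbehaviour has been accepted:
/// the offender loses its entire stake.
fn slash(stakes: &mut BTreeMap<AccountId, u64>, offender: AccountId) {
    stakes.remove(&offender);
}
```

A chain following the footnote would call `select_validators(&stakes, 100)` once per block and hand the result to Substrate as the validator set; how proofs of misbehaviour are produced and checked is left out entirely here.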

I know that “coming soon”s in tech articles are about as trustworthy as a politician’s promise, but I’m going to end with one anyway. Although building on Substrate is already possible, we’re currently missing learning materials. Right now, there’s really no way that you could learn how to do any of what I just told you without already being part of the Polkadot team. We’re working on that, though, so if any of this excites you then keep your eye out for Substrate tutorials and documentation coming soon.

  1. Of course, realistically you would probably want to enforce backwards compatibility eventually so that external tools can easily interact with your chain, but while you’re developing you can play as fast-and-loose with compatibility as you like. Even when you do need to be backwards-compatible it doesn’t have to be hard; you can use a format such as protobuf to get efficient, backwards-compatible storage for free. ↩︎

  2. As an example of the degenerate case of this effect, pure proof-of-work chains like Bitcoin or Ethereum cannot have finality at all because the set of possible validators is infinite. ↩︎

  3. For example, a simple proof-of-stake chain might choose the validator set once every block by selecting the 100 accounts with the largest stake, and remove a validator’s stake if you get a proof of its misbehaviour. ↩︎