Blockchain is an example of a distributed ledger system and, as such, shares the same performance concerns as any other distributed system. To measure the performance of a distributed system with an acceptable degree of accuracy, it’s best to simplify as many of the variables under our control as possible. The size of the data being moved around the network is one of the most fundamental factors in performance and scalability, and typically one of the most difficult to understand and manage. The original Bitcoin paper adopted an elegant solution to this problem: the Merkle tree.
Size
In blockchain systems, the chain as a whole holds the complete history of all valid transactions, and each block holds the transactions confirmed since the previous block. The size of a block therefore varies with the number of transactions and the content stored. A hash is a common computer science technique for reducing a variable-length input to a fixed-length digest that is, for practical purposes, unique. Blockchain applies hashing through a Merkle tree. The tree is built from the bottom up: each leaf is the hash of a single transaction in the block. A parent is the result of concatenating two adjacent hashes and hashing them into a new fixed-length hash. This continues up to the last parent, called the Merkle root. In Bitcoin, each block header is 80 bytes, regardless of the number of transactions, because it stores only the 32-byte Merkle root rather than the transactions themselves. A Merkle proof allows a client to verify that a transaction is included in a block using the header and a short list of sibling hashes, without downloading the whole block. The block size in Bitcoin is limited to 1MB, so a block can hold on the order of a few thousand transactions of typical size. Ethereum uses three Merkle roots in each block: one for transactions, one for state, and one for transaction receipts.
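The construction described above can be sketched in a few lines of Python. This is a simplified illustration, not Bitcoin's exact implementation: it assumes double SHA-256 and duplicates the last hash when a level has an odd number of nodes, as Bitcoin does, but it ignores Bitcoin's byte-order conventions and transaction serialization. The function names (`sha256d`, `merkle_root`, `merkle_proof`, `verify_proof`) are hypothetical and chosen only for this example.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin applies to transactions and tree nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level: list[bytes]) -> list[bytes]:
    """Hash adjacent pairs into parents; assumes the level has an even length."""
    return [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(transactions: list[bytes]) -> bytes:
    """Reduce any number of transactions to a single 32-byte root."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256d(tx) for tx in transactions]  # leaves: one hash per transaction
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = _next_level(level)
    return level[0]


def merkle_proof(transactions: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Collect the sibling hashes on the path from leaf `index` to the root.

    Each entry is (sibling_hash, sibling_is_left).
    """
    level = [sha256d(tx) for tx in transactions]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1  # the other node in the same pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_proof(tx: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one transaction and its sibling hashes."""
    node = sha256d(tx)
    for sibling, sibling_is_left in proof:
        node = sha256d(sibling + node) if sibling_is_left else sha256d(node + sibling)
    return node == root


if __name__ == "__main__":
    txs = [b"alice->bob:1", b"bob->carol:2", b"carol->dave:3"]
    root = merkle_root(txs)
    proof = merkle_proof(txs, 2)
    print(len(root), len(proof), verify_proof(txs[2], proof, root))
```

The root is always 32 bytes no matter how many transactions go in, and the proof grows with the logarithm of the transaction count, which is why a client holding only block headers can still check that a transaction was included.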
Speed
Speed, or throughput, is the next variable that needs to be well understood to properly architect a distributed network. Speed refers to the time it takes the network to process a single transaction, which is why relative performance is measured in transactions per second (tps). The Bitcoin network is constrained by the block size limit of 1MB and the average block creation time of 10 minutes. As a result, the Bitcoin network can process ~5-7 tps while Visa can process ~1500 tps. A combination of solutions, including an off-chain payment layer (such as the Lightning Network) and a transaction format that fits more transactions into each block (such as SegWit), will likely be necessary for Bitcoin’s transaction rate to be feasible as a network payment processing protocol.
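The tps ceiling follows directly from the two constraints named above. The sketch below works through the arithmetic; the average transaction size of 250 bytes is an assumption made for illustration (real transactions vary widely), and `max_tps` is a hypothetical helper name.

```python
def max_tps(block_size_bytes: int, avg_tx_size_bytes: int, block_interval_seconds: int) -> float:
    """Upper bound on throughput: transactions per block divided by seconds per block."""
    txs_per_block = block_size_bytes // avg_tx_size_bytes
    return txs_per_block / block_interval_seconds


# 1MB blocks, one block every 10 minutes, assumed 250-byte transactions
print(round(max_tps(1_000_000, 250, 600), 1))  # 6.7
```

Under these assumptions, raising throughput means changing one of the three inputs: larger blocks, smaller transactions, or shorter block intervals.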
Ethereum’s protocol is faster than Bitcoin’s protocol on a public network, particularly after the Istanbul fork. Performance on permissioned networks with Ethereum and Hyperledger approaches 3,500 tps, which is still about 1/10th the performance of a relational database running similar workloads. The difference in performance between permissioned and public networks is related to establishing the trustworthiness of the peers in the network. In permissioned networks, like some corporate implementations of Hyperledger Fabric where all of the participants are known, the process is relatively straightforward.
Summary
Blockchain is not the most performant distributed system, but performance is not its strength. You can get amazing performance from an in-memory key-value data store, but you do not get Byzantine fault tolerance or immutable, cryptographically linked records. Performance is a relative term that is framed in the context of a particular use case. The performance characteristics of the Bitcoin network compared with the Visa network may make it impractical for certain types of payment processing, but this has certainly not been a constraint for Ripple.
Selecting any distributed system requires careful analysis because there are a lot of knobs to tune. Performance, scalability, resiliency, availability, consistency, integrity, and cost are all factors. Understanding the tradeoffs and impacts will allow you to move past statements like “blockchain is slower than a relational database” and think more strategically about problems you need to solve.