Presentation | Video | Definitions |
---|---|---|
Validity Rollups | StarkNet 101 | Perama Blog |
Code examples will be named by the programming language in which they are implemented, for example, Bitcoin block verification in Golang (if you can implement this example in other languages we would love a PR):
The topics covered in this primer have been dissected in hundreds of ways by thousands of people, so wherever possible I will be linking to those resources.
In this course, you will learn how StarkWare attempts to tackle the Blockchain Trilemma and provide a system that is inclusively accountable, decentralized, scalable, and secure through the use of zero-knowledge STARK proofs.
🎯 Goals: secure, inclusively accountable, decentralized, scalable, expressive 🎯
For a more concrete example of the trilemma, we can move outside of the blockchain context entirely. Say Alice has an important piece of data she needs access to. To start we will represent this data as ASCII characters in YAML format:
alice_account: 5.00
Let's write it to a file on our computers disk and measure performance:
time echo "alice_account: 5.00" >> bank.yaml
Let's read that information:
time cat bank.yaml
It's obviously very fast to read and write this data from our local disk, and powerful database mechanisms can be applied to optimize access to the data. BUT if you drop your computer or get too close to a large ACME magnet Alice loses her valuable bank account information.
🎯
Goals:
secure,
inclusively accountable,
decentralized,
scalable,
expressive
🎯
💡 Let's replicate Alice's account on another computer 💡
If we replicate Alice's bank account YAML file on multiple computers, when one fails we haven't lost the data!Sender Questions:
- How do I locate a receiving host to send to?
- How do I know the receiving host successfully wrote Alice's account data?
- If I change Alice's account value how will the receiving host know to update the same value?
Receiver Questions:
- Who will I receive data from?
- If I change Alice's account value how will the sending host know to update the same value?
These questions form the basis of distributed systems and distributed computing across a network and have been studied since the inception of the internet.
Let's look briefly at how one of the more popular distributed databases CassandraDB handles these issues.
You can see when configuring the system you are required to whitelist the seed node
IP Addresses that will form our trusted cluster that partakes in a limited peer-to-peer gossip. Although this is suitable for many traditional systems we strive to build inclusive and permissionless systems.
Once the distributed database is set up we gain "Fault Tolerance" for Alice's valuable bank data. If someone accidentally brings their large ACME magnet into one data centre, the data is easily accessible on redundant hosts. Similar to blockchains these distributed systems made tradeoffs to the simple I/O example above. So what did we give up for this fault tolerance?
Banks Perspective:
- Network overhead impacts performance
- Redundancy and replication impacts performance
- Infrastructure maintenance ($$$$)
Alice's Perspective:
- Delegates trust to the bank:
- database is configured correctly
- operational security can handle attackers or intruders
- is not doing anything duplicitous
- etc.
- Costs typically get passed to Alice
🎯
Goals:
secure,
inclusively accountable,
decentralized,
scalable,
expressive
🎯
💡 Let's replicate Alice's account on ANY computer 💡
Bitcoin brings various computer science concepts together with game theory to create a truly peer-to-peer network and negates the need to delegate our trust to a central part.
The nodes trust the block producer based on its valid proof of work and the network collectively agrees on a set of canonical updates to the state of the Bitcoin ledger and the state of Alice's account.
In GO
:
# proof of work example
cd bitcoin/proof_of_work/go
go run main.go
In Python
:
# proof of work example
cd bitcoin/proof_of_work/python
python main.py
The Bitcoin nodes themselves listen for and validate blocks of transactions that are broadcast to the network by the miner of that block. They form a data structure called a Merkle Tree to obtain a root hash corresponding to all the transactions (and their order) in that block. If one tx changes by even a single bit the Merkle root will be completely different.
# block verification example
cd bitcoin/block_verification/go && go mod tidy
go run main.go utils.go
Alice's information gets formatted as a UTXO and is replicated on all of the nodes on the Bitcoin network. She can even validate that everything is accurate herself by rehashing the Merkle tree of every block of transactions from genesis to now.
🎉 NO DELEGATION OF TRUST 🎉
Let's revisit the trilemma. What did we give up to get this trustless data security?- Miners expend energy as they attempt to get the nonce
- Full trustless verification requires EACH node to replicate the canonical state:
- hash the Merkle tree of transactions
- hash the block header
Full Node Size: ~405GB
For a naive demonstration of "The Evolution of Data Security" run the following:
cd bitcoin/block_verification/go && go mod tidy
go test ./... -bench=. -count 5
🎯
Goals:
secure,
inclusively accountable,
decentralized,
scalable,
expressive
🎯
💡 Let's let Alice use her data 💡
Smart contracts were first proposed by Nick Szabo as a transaction protocol that executes the terms of a contract, giving all parties transparency into the rule set and execution. Bitcoin facilitates a limited version of smart contracts, but the expressive smart contract model of Ethereum has been more widely adopted.
Ethereum provides a platform to implement these smart contracts with the use of the Ethereum Virtual Machine. In the Ethereum paradigm, Alice's bank account information is stored in a 20-byte address called an account. Her account balance along with a few more fields (nonce, storageRoot, codeHash) becomes a "node" in a data structure called a Patricia Trie where PATRICIA stands for "Practical Algorithm to Retrieve Information Coded in Alphanumeric".
This Trie
is a specific type of tree that encodes a key
as a path of common prefixes to its corresponding value
. So Alice's Bank Account can be found at an address("key") that points to an account ("value") in Ethereum's World State (trie). The tree structure of the trie allows us to obtain a cryptographic hash of each node all the way up to a single hash corresponding to the root
similar to the Merkle tree we saw in the Bitcoin block verification.
For an example of the MPT data structure you can use this diagram for reference:
and run the following:
cd ethereum/block_verification/go && go mod tidy
go run *.go
Ethereum then propagates its state by verifying transactions are well-formed and applying then to accounts. Alice has a public/private key pair to manage her "externally owned account" and can sign transactions that involve her balance or involve interacting with other contracts in the state.
In addition to EOAs Ethereum has "contract accounts" which are controlled by the contract code associated with them. Every time the contract account receives a message the bytecode that is stored as an RLP encoded value in the account storage trie begins to execute according to the rules of the EVM.
Trilemma visit: what did we give up to add expressivity?
- Every transaction still needs to be processed by every node in the network.
- With the addition of world state storage the blockchain can "bloat" leading to centralization risk
- Alice may pay $100 to use the money in her account
Full Node Size: ~700 GB
Archive Node Size: ~10 TB
🎯
Goals:
secure,
inclusively accountable,
decentralized,
scalable,
expressive
🎯
💡 Let's optimize Alice's data utility 💡
As demand for block space increases the cost to execute on Layer 1
(full consensus protocols e.g. Bitcoin, Ethereum) will become increasingly expensive, and until certain state expiry mechanisms are implemented
we can expect the state of the L1 to continue to bloat over time. This will require an increasingly robust machine to maintain the state
and subsequently verify the blocks.
Rollups are one solution in which business logic is executed and stored in a protocol outside the Ethereum context and then proves its successful execution in the Ethereum context.
Typically this involves compressing a larger number of transactions at this Layer 2
and committing the state diffs to a smart contract deployed on L1.
For full interoperability with the L1 rollups also typically implement a messaging component for deposits and withdrawals.
There are currently two types of rollups that are being widely adopted:
- Optimistic Rollups
- Zero-Knowledge Rollups
Vitalik provides a good comparison of the two here, and touches on the final pieces of our long trilemma journey:
No matter how large the computation, the proof can be very quickly verified on-chain.
This allows Alice to move her money freely between L1 and L2 (...soon to be L3) and operate on a low-cost, expressive blockchain layer. All while inheriting the highest form of data security evolution from the L1 and not having to delegate trust to any centralized party!
🎯 Goals: secure, inclusively accountable, decentralized, scalable, expressive 🎯
🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉