Skip to content

Latest commit

 

History

History
350 lines (224 loc) · 15.2 KB

state-network.md

File metadata and controls

350 lines (224 loc) · 15.2 KB

Execution State Network

This document is the specification for the sub-protocol that supports on-demand availability of state data from the execution chain.

Overview

The execution state network is a Kademlia DHT that uses the Portal Wire Protocol to establish an overlay network on top of the Discovery v5 protocol.

State data from the execution chain consists of all account data from the main storage trie, all contract storage data from all of the individual contract storage tries, and the individul bytecodes for all contracts across all historical state roots. This is traditionally referred to as an "archive node".

Data

The network stores the full execution layer state which emcompases the following:

  • Account trie nodes
  • Contract storage trie nodes
  • Contract bytecode

The network is implemented as an "archive" node meaning that it stores all tries for all historical blocks.

Retrieval

  • Account trie nodes by their node hash.
  • Contract storage trie nodes by their node hash.
  • Contract bytecode by code hash.

Specification

Distance Function

The state network uses the stock XOR distance metric defined in the portal wire protocol specification.

Content ID Derivation Function

The state network uses the SHA256 Content ID derivation function from the portal wire protocol specification.

Wire Protocol

Protocol Identifier

As specified in the Protocol identifiers section of the Portal wire protocol, the protocol field in the TALKREQ message MUST contain the value of 0x500A.

Supported Message Types

The execution state network supports the following protocol messages:

  • Ping - Pong
  • Find Nodes - Nodes
  • Find Content - Found Content
  • Offer - Accept

Ping.custom_data & Pong.custom_data

In the execution state network the custom_payload field of the Ping and Pong messages is the serialization of an SSZ Container specified as custom_data:

custom_data = Container(data_radius: uint256)
custom_payload = SSZ.serialize(custom_data)

POKE Mechanism

The POKE Mechanism MUST be disabled for the state network. As content_for_retrieval is different from content_for_offer the POKE mechanism cannot offer content that is verifiable.

Routing Table

The execution state network uses the standard routing table structure from the Portal Wire Protocol.

Node State

Data Radius

The execution state network includes one additional piece of node state that should be tracked. Nodes must track the data_radius from the Ping and Pong messages for other nodes in the network. This value is a 256 bit integer and represents the data that a node is "interested" in. We define the following function to determine whether node in the network should be interested in a piece of content.

interested(node, content) = distance(node.id, content.id) <= node.radius

A node is expected to maintain radius information for each node in its local node table. A node's radius value may fluctuate as the contents of its local key-value store change.

A node should track their own radius value and provide this value in all Ping or Pong messages it sends to other nodes.

Data Types

OFFER/ACCEPT vs FINDCONTENT/FOUNDCONTENT payloads

The data payloads for many content types in the state network differ between OFFER/ACCEPT and FINDCONTENT/FOUNDCONTENT.

The OFFER/ACCEPT payloads need to be provable by their recipients. These proofs are useful during OFFER/ACCEPT because they verify that the offered data is indeed part of the canonical chain.

The FINDCONTENT/FOUNDCONTENT payloads do not contain proofs because a piece of state can exist under many different state roots. All payloads can still be proved to be the correct requested data, however, it is the responsibility of the requesting party to anchor the returned data as canonical chain state data.

Helper Data Types

Paths (Nibbles)

A naive approach to storage of trie nodes would be to simply use the node_hash value of the trie node for storage. This scheme however results in stored data not being tied in any direct way to it's location in the trie. In a situation where a participant in the DHT wished to re-gossip data that they have stored, they would need to reconstruct a valid trie proof for that data in order to construct the appropriate OFFER/ACCEPT payload. We include the path metadata in state network content keys so that it is possible to reconstruct this proof.

We define path as a sequences of "nibbles" which represent the path through the merkle patritia trie (MPT) to reach the trie node.

nibble     := {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f}
Nibbles    := ByteList(33)

Because each nibble can be expressed using 4 bits, we pack two nibbles in one byte. Leading nibble will occupy higher bits and following nibble will occupy lower bits.

This encoding can introduce ambiguity (e.g. it's not possible to distinguish between single nibble [1] and nibbles [0, 1], because both are expressed as 0x01). To prevent this, we are going to use highest 4 bits of the first byte to specify whether length is even or odd:

  • bits 0000 - Number of nibbles is even. The 4 lowest bits of the first byte MUST be set to 0 (resulting that the value of the first byte is 0x00).
  • bits 0001 - Number of nibbles is odd. The first nibble MUST be stored in the 4 lowest bits of the first byte.

All remaining nibbles are packed in pairs of two and added to the first byte.

Examples:

[]              -> [0x00]
[0]             -> [0x10]
[1]             -> [0x11]
[0, 1]          -> [0x00, 0x01]
[1, 2, a, b]    -> [0x00, 0x12, 0xab]
[1, 2, a, b, c] -> [0x11, 0x2a, 0xbc]
Merkle Patricia Trie (MPT) Proofs

Merkle Patricia Trie (MPT) proofs consist of a list of TrieNode objects that correspond to individual trie nodes from the MPT. Each node can be one of the different node types from the MPT (e.g. blank, branch, extension, leaf). When serialized, each TrieNode is represented as an RLP serialized list of the component elements. The largest possible node type is the branch node, which should be up to 532 bytes (16 child nodes of Bytes32 with extra encoding info), but we specify 1024 bytes to be on the safe side.

TrieNode   := ByteList(1024)
TrieProof  := List(TrieNode, max_length=65)

The TrieProof type is used for both Account Trie and Contract Storage trie.

All TrieProof objects are subject to the following validity requirements.

  • A: Lexical Ordering
  • B: No Extraneous Nodes
A: Lexical Ordering

The sequence of nodes in the proof MUST represent the unbroken path in a trie, starting from the root node and each node being the child of its predecesor. This results in the state root node always occuring first and node being proven last in the list of trie nodes.

This validity condition is to ensure that verifcation of the proof can be done in a single pass.

B: No Extraneous Nodes

A proof MUST NOT contain any nodes that are not part of the set needed for proving.

This validity condition is to protect against malicious or erroneous bloating of proof payloads.

Account Trie Node

These data types represent a node from the main state trie.

account_trie_node_key  := Container(path: Nibbles, node_hash: Bytes32)
selector               := 0x20

content_key            := selector + SSZ.serialize(account_trie_node_key)

If node is an extension or leaf node (not branch node), than the path field MUST NOT include the same nibbles that are stored inside that node.

Account Trie Node: OFFER/ACCEPT

This type MUST be used when content offered via OFFER/ACCEPT.

content_for_offer      := Container(proof: TrieProof, block_hash: Bytes32)

The proof field MUST contain the proof for the trie node whose position in the trie and hash are specified in the content-key. The proof MUST be anchored to the block specified by the block_hash field.

Account Trie Node: FINDCONTENT/FOUNDCONTENT

This type MUST be used when content retrieved from another node via FINDCONTENT/FOUNDCONTENT.

content_for_retrieval := Container(node: TrieNode)

Contract Trie Node

These data types represent a node from an individual contract storage trie.

storage_trie_node_key  := Container(address: Address, path: Nibbles, node_hash: Bytes32)
selector               := 0x21

content_key            := selector + SSZ.serialize(storage_trie_node_key)

If node is an extension or leaf node (not branch node), than the path field MUST NOT include the same nibbles that are stored inside that node.

Contract Trie Node: OFFER/ACCEPT

This type MUST be used when content offered via OFFER/ACCEPT.

content_for_offer      := Container(storage_proof: TrieProof, account_proof: TrieProof, block_hash: Bytes32)

The account_proof field MUST contain the proof for the account state specified by the address field in the key and it MUST be anchored to the block specified by the block_hash field.

The storage_proof field MUST contain the proof for the trie node whose position in the trie and hash are specified in the key and it MUST be anchored to the trie specified by the account state.

Contract Trie Node: FINDCONTENT/FOUNDCONTENT

This type MUST be used when content retrieved from another node via FINDCONTENT/FOUNDCONTENT.

content_for_retrieval  := Container(node: TrieNode)

Contract Code

These data types represent the bytecode for a contract.

NOTE: Because CREATE2 opcode allows for redeployment of new code at an existing address, we MUST randomly distribute contract code storage across the DHT keyspace to avoid hotspots developing in the network for any contract that has had many different code deployments. Were we to use the path based high-bits approach for computing the content-id, it would be possible for a single location in the network to accumulate a large number of contract code objects that all live in roughly the same space. Problematic!

contract_code_key      := Container(address: Address, code_hash: Bytes32)
selector               := 0x22

content_key            := selector + SSZ.serialize(contract_code_key)
Contract Code: OFFER/ACCEPT

This types MUST be used when content offered via OFFER/ACCEPT.

content_for_offer      := Container(code: ByteList(32768), account_proof: TrieProof, block_hash: Bytes32)

The account_proof field MUST contain the proof for the account state specified by the address field in the key and it MUST be anchored to the block specified by the block_hash field.

Contract Code: FINDCONTENT/FOUNDCONTENT

This type MUST be used when content retrieved from another node via FINDCONTENT/FOUNDCONTENT.

content_for_retrieval  := Container(code: ByteList(32768))

Gossip

As each new block is added to the chain, the updated state from that block must be gossiped into the network. The state network defines a specific gossip algorithm which is referred to as "Recursive Gossip". This section of the specification defines how this gossip mechanism works.

Terminology

The Merkle Patricia Trie (MPT) has three types of nodes: "branch", "extension" and "leaf". The MPT also specifies the nil node, but it will never be sent or stored over network, so we will ignore it for this spec.

Similarly to other tree structure, the leaf node is the lowest node on a certain path and it's where value is stored in the tree (strictly speaking, MPT allows value to be stored in branch nodes as well, but Ethreum storage doesn't use this functionality). The branch and extension nodes can be called intermediate nodes because there will always be a node that can only be reached by passing through them.

A "merkle proof" or "proof" is a collection of nodes from the trie sufficient to recompute the state root and prove that "target" node is part of the trie defined by that state root. A proof is:

  • "ordered"
    • the order of the nodes in the proof have to represent the path from root node to the target node
      • first node must be the root node, followed by zero or more intermediate nodes, ending with a target node
      • it should be provable that any non-first node is part of the preceding node
    • if root node is the target node, then the proof will only contain the root node
    • the target node can be of any type (branch, extension or leaf)
  • "minimal"
    • it contains only the nodes from on a path from the root node to the target node

Overview

We will give the overview of the process for the account trie. The same process should be applied for the storage trie as well.

The goal of the "recursive gossip" mechanism is to reduce the burden of responsibility placed on bridge nodes for injecting new state data into the network while simultaniously spreading the responsibility for gossiping new state data across the nodes in the network.

At each block we construct a proof for each new or modified value in the trie, which is stored in the leaf nodes. The bridge node would search the DHT for nodes that are interested in storing each leaf node and gossip the proof to those nodes.

For example, let this be the proof: [A, B, C, D, E, F], then A is the root note and F is the target node. This proof should be gossiped to nodes that are interested in storing node F.

The recipients of this gossip are then responsible for gossiping the penultimate node of the proof (parent of the target node, E in our example). To do so, they would strip the last (target) node from the proof, create content-key with E being the target node, search the DHT for nodes that are interested in newly created content-key and gossip the content with proof to them.

It should be highlighted that creation of the content-key also requires updating the path field by removing one or more (in case of an extension node) nibbles from the end.

This process repeats until it terminates at the state root, with the final round of gossip only containing the [A] which is the state root node of the trie.

Bridge Node Responsibilities

The bridge is responsible for creating and gossiping all following data and their proofs:

  • account trie data:
    • all of the new and modified account nodes from the state trie
  • contract storage trie data:
    • all of the new and modified storage slots from each modified contract storage trie
    • proof has to include the proof for the account trie that corresponds to the same contract
    • recursive gossip stops at the root of the storage trie
  • all contract bytecode for newly created contracts
    • proof has to include the proof for the account trie that corresponds to the same contract
    • recursive gossip doesn't happen in this case

A bridge should compute the content-id values for all proofs that are part of the initial round of recursive gossip. These proofs should be sorted by proximity to its own node-id. Beginning with the content that is closest to its own node-id it should proceed to GOSSIP each individual proof to nodes interested in that content.