Skip to content

Commit

Permalink
lint
Browse files Browse the repository at this point in the history
  • Loading branch information
shreyan-gupta committed Nov 14, 2024
1 parent a7bbee7 commit 65ece27
Showing 1 changed file with 9 additions and 2 deletions.
11 changes: 9 additions & 2 deletions neps/nep-0568.md
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,7 @@ MemTrie is the in-memory representation of the trie that the runtime uses for al
As of today it isn't mandatory for nodes to have MemTrie feature enabled but going forward, with ReshardingV3, all nodes would require to have MemTrie enabled for resharding to happen successfully.

For the purposes of resharding, we need an efficient way to split the MemTrie into two child tries based on the boundary account. This splitting happens at the epoch boundary when the new epoch is expected to have the two child shards. The set of requirements around MemTrie splitting are:

* MemTrie splitting needs to be "instant", i.e. happen efficiently within the span of one block. The child tries need to be available for the processing of the next block in the new epoch.
* MemTrie splitting needs to be compatible with stateless validation, i.e. we need to generate a proof that the memtrie split proposed by the chunk producer is correct.
* The proof generated for splitting the MemTrie needs to be compatible with the limits of the size of state witness that we send to all chunk validators. This prevents us from doing things like iterating through all trie keys for delayed receipts etc.
Expand All @@ -102,8 +103,10 @@ This category includes the trie keys `TrieKey::DelayedReceiptIndices`, `TrieKey:
This category includes the trie keys `TrieKey::DelayedReceipt`, `TrieKey::PromiseYieldTimeout` and `TrieKey::BufferedReceipt`. The number of entries for these keys can potentially be arbitrarily large and it's not feasible to iterate through all the entries. In pre-stateless validation world, where we didn't care about state witness size limits, for ReshardingV2 we could just iterate over all delayed receipts and split them into the respective child shards.

For ReshardingV3, these are handled by either of the two strategies
- `TrieKey::DelayedReceipt` and `TrieKey::PromiseYieldTimeout` are handled by duplicating entries across both child shards as each entry could belong to either of the child shards. More details in the [Delayed Receipts](#delayed-receipt-handling) and [Promise Yield](#promiseyield-receipt-handling) sections below.
- `TrieKey::BufferedReceipt` are independent of the account_id and therefore can be sent to either of the child shards, but not both. We copy the buffered receipts and the associated metadata to the child shard with the lower index. More details in the [Buffered Receipts](#buffered-receipt-handling) section below.

* `TrieKey::DelayedReceipt` and `TrieKey::PromiseYieldTimeout` are handled by duplicating entries across both child shards as each entry could belong to either of the child shards. More details in the [Delayed Receipts](#delayed-receipt-handling) and [Promise Yield](#promiseyield-receipt-handling) sections below.

* `TrieKey::BufferedReceipt` are independent of the account_id and therefore can be sent to either of the child shards, but not both. We copy the buffered receipts and the associated metadata to the child shard with the lower index. More details in the [Buffered Receipts](#buffered-receipt-handling) section below.

### State Storage - Flat State

Expand Down Expand Up @@ -171,7 +174,9 @@ supporting smooth transitions without altering storage structures directly.
### Cross Shard Traffic

### Delayed Receipt Handling

### PromiseYield Receipt Handling

### Buffered Receipt Handling

### ShardId Semantics
Expand All @@ -193,6 +198,7 @@ In this NEP, we propose updating the ShardId semantics to allow for arbitrary id
The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work.]
```

### State Storage - MemTrie

The current implementation of MemTrie uses a pool of memory (`STArena`) to allocate and deallocate nodes and internal pointers in this pool to reference child nodes. MemTries, unlike the State representation of Trie, do not work with the hash of the nodes but internal memory pointers directly. Additionally, MemTries are not thread safe and one MemTrie exists per shard.
Expand All @@ -210,6 +216,7 @@ While Frozen MemTries provide the benefits of being compatible with instant resh
Additionally, a node would have to support 2x the memory footprint of a single trie as after resharding, we would have two copies of the trie in memory, one from the temporary Hybrid MemTrie in use for block production, and other from the background MemTrie that would be under construction. Once the background MemTrie is fully constructed and caught up with the latest block, we do an in-place swap of the Hybrid MemTrie with the new child MemTrie and deallocate the memory from the Hybrid MemTrie.

During a resharding event, at the boundary of the epoch, when we need to split the parent shard into the two child shards, we do the following steps:

1. Freeze the parent MemTrie arena to create a read-only frozen arena that represents a snapshot of the state as of the time of freezing, i.e. after postprocessing last block of epoch. Note that we no longer require the parent MemTrie in runtime going forward.
2. We cheaply clone the Frozen MemTrie for both the child MemTries to use. Note that this doesn't clone the parent arena memory, but just increases the refcount.
3. We then create a new MemTrie with HybridArena for each of the children. The base of the MemTrie is the read-only FrozenArena while all new node allocations happens on a dedicated STArena memory pool for each child MemTrie. This is the temporary MemTrie that we use while Flat Storage is being built in the background.
Expand Down

0 comments on commit 65ece27

Please sign in to comment.