[wip] Partial unfold during commitment #11350

Closed
wants to merge 39 commits into from


awskii
Member

@awskii awskii commented Jul 26, 2024

@awskii awskii changed the title Partial unfold during commitment [wip] Partial unfold during commitment Jul 26, 2024
@awskii
Member Author

awskii commented Jul 31, 2024

So I figured out account reducing, but have not fully figured out storage hash reuse yet.

@awskii
Member Author

awskii commented Jul 31, 2024

working version for account 141d862

awskii and others added 7 commits August 2, 2024 13:31
Added debug env `KV_READ_METRICS` (default `false`). When it is `true`, we gather
summaries of:
```
kv_get{level="L0"}
kv_get{level="L1"}
kv_get{level="L2"}
kv_get{level="L3"}
kv_get{level="L4"}
kv_get{level="recent"}
```

It's a first step toward #10691.
Per-domain metrics still have to be added.
I will cover the Grafana dashboard a bit later.
@awskii
Member Author

awskii commented Aug 5, 2024

Latest working commit is f226ae6; merging main breaks the existing node commitment.

@awskii awskii mentioned this pull request Aug 9, 2024
2 tasks
@awskii awskii closed this Aug 9, 2024
Giulio2002 added a commit that referenced this pull request Oct 21, 2024
take2 on #11536 #11350 

Partial unfold means that we skip unfolding nodes whenever it is
possible to do so.

Node state is represented by the fields [balance/nonce/codeHash] for an
account and by its value for storage.
By default, a node is loaded without its state. During folding, state
parts are loaded if needed.

A skip is possible when the following conditions hold (and a memoised
state hash already exists):
- no changes to node state (balance/nonce/code/storage was not changed)
- the node preserves its position in the trie: keeps depth, keeps nibble
- no subtree updates (e.g. no storage updates for an account or no extension
split due to insertion)
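
The skip conditions above can be read as a single predicate. The following is a minimal sketch, not the code from this PR: the `nodeState` type, its field names, and `canSkipUnfold` are hypothetical, and it assumes that all three conditions must hold together in addition to a memoised hash being present.

```go
package main

// nodeState is a hypothetical summary of what happened to a trie node
// since its hash was last memoised. Field names are illustrative only.
type nodeState struct {
	hasMemoisedHash bool // a state hash from a previous fold is available
	stateChanged    bool // balance/nonce/code (account) or value (storage) changed
	depthChanged    bool // node ended up at a different depth in the trie
	nibbleChanged   bool // node ended up under a different nibble
	subtreeUpdated  bool // e.g. storage updates under an account, or an extension split
}

// canSkipUnfold reports whether the memoised hash may be reused,
// assuming every listed condition has to hold at the same time.
func canSkipUnfold(n nodeState) bool {
	if !n.hasMemoisedHash {
		return false // nothing to reuse
	}
	if n.stateChanged {
		return false // node state differs, hash is stale
	}
	if n.depthChanged || n.nibbleChanged {
		return false // node moved within the trie
	}
	return !n.subtreeUpdated
}
```

Keeping the checks as separate early returns makes it easy to count which condition rejects the skip most often, which relates to the question of which hashes are worth keeping.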

It is still an open question which state hashes are worth keeping and which are
not. In bor experiments we've seen that top-level hashes of storages
change quite frequently, as do hashes of accounts with lots of
storage keys.

This PR completely reworks the mechanism for rebuilding commitment files when
all other domains exist and have the same ranges.

- processes big shards in smaller parts (e.g. 0-1024.kv is 8 shards of
128 steps each)
- shards are merged when the range is finished (in the example above, the
shards will be merged into 0-1024 once all 8 are ready, in a madv-friendly
way)
- does not write commitment into the db; dumps files directly to disk
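
The sharding arithmetic from the list above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `stepRange`, `splitShards`, and `readyToMerge` are hypothetical names, ranges are treated as half-open `[from, to)`, and finished shards are assumed to arrive sorted by their start step.

```go
package main

// stepRange is a half-open range of steps [from, to). Hypothetical type.
type stepRange struct{ from, to uint64 }

// splitShards cuts [from, to) into consecutive shards of at most shardSteps steps.
func splitShards(from, to, shardSteps uint64) []stepRange {
	if shardSteps == 0 || from >= to {
		return nil
	}
	var shards []stepRange
	for start := from; start < to; start += shardSteps {
		end := start + shardSteps
		if end > to {
			end = to // last shard may be shorter
		}
		shards = append(shards, stepRange{start, end})
	}
	return shards
}

// readyToMerge reports whether the finished shards (sorted by start)
// cover [from, to) contiguously, so they can be merged into one file.
func readyToMerge(done []stepRange, from, to uint64) bool {
	next := from
	for _, s := range done {
		if s.from != next {
			return false // gap or overlap
		}
		next = s.to
	}
	return next == to
}
```

With `splitShards(0, 1024, 128)` this yields eight ranges, from 0-128 through 896-1024, and `readyToMerge` only returns true once all eight are present.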

If the rebuild is cancelled, you have to remove all commitment shards whose
range does not match the other domains.
For example, if `accounts` are [0-1024, 1024-1536, ..] and `commitment` are
[0-1024, 1024-1152..], you have to **remove** the commitment files of
steps > 1024 (their range does not match account 1024-1536; the rebuild
will be incorrect, and there is no protection against that case yet).
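
Since there is no built-in protection yet, the check has to be done by hand. A rough sketch of that check is shown below. It is not part of the PR; `stepRange` and `mismatchedRanges` are hypothetical, and it assumes a commitment file is valid only if some reference-domain file (e.g. `accounts`) has exactly the same range.

```go
package main

// stepRange is a half-open range of steps [from, to). Hypothetical type.
type stepRange struct{ from, to uint64 }

// mismatchedRanges returns the commitment ranges that have no exactly
// matching range in the reference domain. These are the candidates for
// manual removal after a cancelled rebuild.
func mismatchedRanges(reference, commitment []stepRange) []stepRange {
	known := make(map[stepRange]struct{}, len(reference))
	for _, r := range reference {
		known[r] = struct{}{}
	}
	var stale []stepRange
	for _, c := range commitment {
		if _, ok := known[c]; !ok {
			stale = append(stale, c)
		}
	}
	return stale
}
```

For the example above, accounts `{0,1024}, {1024,1536}` and commitment `{0,1024}, {1024,1152}` give a single stale range, `{1024,1152}`.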

To rebuild commitment:
```
$ make integration && ./build/bin/integration commitment_rebuild --datadir <> --chain <>
```

To be done:
- [x] squeezing commitment
- [ ] move this subfunction into `integration commitment` along with
analysis tool for commitment files.

---------

Co-authored-by: alex.sharov <[email protected]>
Co-authored-by: Giulio <[email protected]>
Co-authored-by: Mark Holt <[email protected]>
@awskii awskii deleted the part-unfold branch November 25, 2024 17:20