Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(state): add cache to EvmBlockState #1443

Merged
merged 1 commit into from
Sep 15, 2024

Conversation

morph-dev
Copy link
Collaborator

This is built on top of #1442, which should be merged before this one.

What was wrong?

The EvmBlockState would traverse entire state trie for every piece of state data that it needs. This really affects the performance of the contract storage trie look ups (as we would have to traverse account trie first, in order to find storage root). Even account trie lookups could benefit from not looking the same trie nodes over and over again.

How was it fixed?

Added cash for every state network key-value lookup. This isn't a problem because EvmBlockState is short lived (just while serving specific eth_* json rpc).

Potential improvement in the future is to create global cache, that can be reused across multiple requests. It would have to be limited either by time or count of entries.

We also keep track of code_hash->address_hash mapping, so that we can recover bytecode on demand (current implementation would fetch it together with the rest of the account state, which might not be necessary).

To-Do

@morph-dev morph-dev added the state network Issue related to portal state network label Sep 12, 2024
@morph-dev morph-dev added this to the Devcon 2024 milestone Sep 12, 2024
@morph-dev morph-dev requested a review from KolbyML September 12, 2024 17:03
@morph-dev morph-dev self-assigned this Sep 12, 2024
Copy link
Member

@KolbyML KolbyML left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some early thoughts

Comment on lines +97 to +103
if let Some(account_state) = &account_state {
if account_state.code_hash != KECCAK_EMPTY {
self.code_hash_to_address_hash
.entry(account_state.code_hash)
.or_insert(address_hash);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this cache code be in async fn basic_async(&mut self, address: Address) -> Result<Option<AccountInfo>, Self::Error> {

Multiple contracts can have the same bytecode, but since eth_call is read only? We are probably good. I am assuming eth_call can't run self-destruct or create bytecode or anything

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this cache code be in async fn basic_async(&mut self, address: Address) -> Result<Option<AccountInfo>, Self::Error> {

Considering that basic_async calls this function, I would say that this is better place (since every path that reads account state has to go through here).
But I can see the argument of being there as well (since this cache is only used by code_by_hash_async). I would leave it here for now, but I can move it down if you have strong opinion on this.

Multiple contracts can have the same bytecode, but since eth_call is read only? We are probably good. I am assuming eth_call can't run self-destruct or create bytecode or anything

The way I see EvmBlockState: It's given a block header, and it fetches the state trie from portal network at that specific block. In the context of Evm and execution, it's the same as reading the state from a db.
So the entire evm execution is read-only from the db point of view (and state is only committed later).

With that being said, eth_call can both call self-destruct, create contracts, etc. but all of that is handled by evm. The EvmBlockState doesn't have to care about any of that.


Regarding multiple contracts having the same bytecode: in order to fetch it from the portal network, we only need to know one account that has it (regardless of which one it is, since they are identical). The reason we have address_hash in the content_key is to avoid hot-spot on portal network for the bytecode that is shared across many contracts (e.g. erc20 token, uniswap dex pool).

Copy link
Member

@KolbyML KolbyML left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@morph-dev morph-dev merged commit b08dcef into ethereum:master Sep 15, 2024
9 checks passed
@morph-dev morph-dev deleted the block_state_cache branch September 15, 2024 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
state network Issue related to portal state network
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants