Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(starknet_batcher): fill in ThinStateDiff fields #2997

Open
wants to merge 1 commit into
base: 12-31-chore_starknet_batcher_move_test_utils_to_top_of_file
Choose a base branch
from

Conversation

alonh5
Copy link
Collaborator

@alonh5 alonh5 commented Dec 29, 2024

No description provided.

@reviewable-StarkWare
Copy link

This change is Reviewable

Copy link
Collaborator Author

alonh5 commented Dec 29, 2024

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

Copy link
Contributor

@Yael-Starkware Yael-Starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @alonh5 and @dafnamatsry)


crates/starknet_batcher/src/block_builder.rs line 95 at r1 (raw file):

    pub fn state_diff(&self) -> ThinStateDiff {
        // TODO(Ayelet): Remove the clones.
        let deployed_contracts = self.commitment_state_diff.address_to_class_hash.clone();

@yair-starkware said this map should be sorted by address before writing.

Code quote:

self.commitment_state_diff.address_to_class_hash.clone();

crates/starknet_batcher/src/block_builder.rs line 97 at r1 (raw file):

        let deployed_contracts = self.commitment_state_diff.address_to_class_hash.clone();
        let storage_diffs = self.commitment_state_diff.storage_updates.clone();
        let declared_classes = self.commitment_state_diff.class_hash_to_compiled_class_hash.clone();

same, but sorted by class hash

Code quote:

self.commitment_state_diff.class_hash_to_compiled_class_hash.clone();

@yair-starkware
Copy link
Contributor

@ShahakShama please confirm that the p2p sync expects the data to be sorted

Copy link
Contributor

@ShahakShama ShahakShama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 3 unresolved discussions (waiting on @alonh5, @asmaastarkware, @dafnamatsry, @Itay-Tsabary-Starkware, and @matan-starkware)


a discussion (no related file):

Previously, yair-starkware (Yair) wrote…

@ShahakShama please confirm that the p2p sync expects the data to be sorted

No. We've added special logic to handle non sorted data so you can send it not sorted.
I don't know how consensus expects it though. @matan-starkware @asmaastarkware ?

We did this because @Itay-Tsabary-Starkware can't guarantee the messages will be received sorted even if you send them sorted

Copy link
Contributor

@ShahakShama ShahakShama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 3 unresolved discussions (waiting on @alonh5, @asmaastarkware, @dafnamatsry, @Itay-Tsabary-Starkware, and @matan-starkware)


a discussion (no related file):

Previously, ShahakShama wrote…

No. We've added special logic to handle non sorted data so you can send it not sorted.
I don't know how consensus expects it though. @matan-starkware @asmaastarkware ?

We did this because @Itay-Tsabary-Starkware can't guarantee the messages will be received sorted even if you send them sorted

Note that we buffer non sorted data until we receive the missing hole so if you missed a block the pod will be out of memory

@yair-starkware
Copy link
Contributor

Previously, ShahakShama wrote…

Note that we buffer non sorted data until we receive the missing hole so if you missed a block the pod will be out of memory

Thanks.
I still think we need to sort the data to calculate the commitments (and IIRC the storage also requires the data to be sorted).
@alonh5

@alonh5 alonh5 force-pushed the 12-29-feat_starknet_batcher_fill_in_thinstatediff_fields branch from 3e0f59d to 976a5b4 Compare December 29, 2024 14:47
Copy link
Collaborator Author

@alonh5 alonh5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 3 unresolved discussions (waiting on @asmaastarkware, @dafnamatsry, @Itay-Tsabary-Starkware, @matan-starkware, @Yael-Starkware, and @yair-starkware)


a discussion (no related file):

Previously, yair-starkware (Yair) wrote…

Thanks.
I still think we need to sort the data to calculate the commitments (and IIRC the storage also requires the data to be sorted).
@alonh5

I think you're talking about two different orders. @ShahakShama is talking about the messages sent being sorted by block number, and @yair-starkware about the internal fields deployed_contracts and declared_classes being sorted.

The mappings are sorted when calculating the hashes (as they should be), so no need to sort. Added a test for that.


crates/starknet_batcher/src/block_builder.rs line 95 at r1 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

@yair-starkware said this map should be sorted by address before writing.

See the comment above.


crates/starknet_batcher/src/block_builder.rs line 97 at r1 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

same, but sorted by class hash

Same.

@yair-starkware
Copy link
Contributor

Previously, alonh5 (Alon Haramati) wrote…

I think you're talking about two different orders. @ShahakShama is talking about the messages sent being sorted by block number, and @yair-starkware about the internal fields deployed_contracts and declared_classes being sorted.

The mappings are sorted when calculating the hashes (as they should be), so no need to sort. Added a test for that.

Need to check in the papyrus storage writers (state, classes & casm) if the data is expected to be sorted

@alonh5 alonh5 force-pushed the 12-29-feat_starknet_batcher_fill_in_thinstatediff_fields branch from 976a5b4 to 41b6702 Compare December 31, 2024 08:04
@alonh5 alonh5 marked this pull request as draft December 31, 2024 08:05
@alonh5 alonh5 force-pushed the 12-29-feat_starknet_batcher_fill_in_thinstatediff_fields branch from 41b6702 to 7ae40d0 Compare December 31, 2024 09:46
@alonh5 alonh5 changed the base branch from main to 12-31-chore_starknet_batcher_move_test_utils_to_top_of_file December 31, 2024 09:46
@alonh5 alonh5 force-pushed the 12-29-feat_starknet_batcher_fill_in_thinstatediff_fields branch from 7ae40d0 to 5d7e8e6 Compare December 31, 2024 10:09
@alonh5 alonh5 marked this pull request as ready for review December 31, 2024 10:09
@alonh5 alonh5 force-pushed the 12-29-feat_starknet_batcher_fill_in_thinstatediff_fields branch from 5d7e8e6 to dae16c1 Compare December 31, 2024 10:34
Copy link
Contributor

@Yael-Starkware Yael-Starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r3, all commit messages.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @alonh5, @asmaastarkware, @dafnamatsry, @Itay-Tsabary-Starkware, and @matan-starkware)


crates/starknet_batcher/src/batcher_test.rs line 62 at r3 (raw file):

            &BlockExecutionArtifacts::create_for_testing().state_diff,
        ),
    }

I think the commitment can just be a constant for testing purposes , no?

Code quote:

    ProposalCommitment {
        state_diff_commitment: calculate_state_diff_hash(
            &BlockExecutionArtifacts::create_for_testing().state_diff,
        ),
    }

crates/starknet_batcher/src/batcher_test.rs line 574 at r3 (raw file):

    let address_to_nonce = HashMap::from_iter(
        expected_artifacts.state_diff.nonces.iter().map(|(address, nonce)| (*address, *nonce)),
    );

I think a helper function indexmap_to_hashmap() could be useful here and in the batcher code.

Code quote:

    let address_to_nonce = HashMap::from_iter(
        expected_artifacts.state_diff.nonces.iter().map(|(address, nonce)| (*address, *nonce)),
    );

crates/starknet_batcher/src/batcher_test.rs line 612 at r3 (raw file):

        batcher.decision_reached(DecisionReachedInput { proposal_id: PROPOSAL_ID }).await.unwrap();
    assert_eq!(response.state_diff, expected_artifacts.state_diff);
    assert!(response.state_diff.storage_diffs.keys().collect::<Vec<_>>().is_sorted());

do we want to check that the other fields are sorted as well?

Code quote:

assert!(response.state_diff.storage_diffs.keys().collect::<Vec<_>>().is_sorted());

crates/starknet_batcher/src/batcher.rs line 422 at r3 (raw file):

        let BlockExecutionArtifacts { l2_gas_used, execution_infos, mut state_diff, .. } =
            block_execution_artifacts;
        self.finalize_state_diff(&mut state_diff)?;

why not run this logic in into_state_diff and get the final_state_diff to begin with?
is there any reason to store the intermediate presentation?

Code quote:

self.finalize_state_diff(&mut state_diff)?;

crates/starknet_batcher/src/batcher.rs line 426 at r3 (raw file):

            height,
            state_diff.clone(),
            HashMap::from_iter(state_diff.nonces.iter().map(|(address, nonce)| (*address, *nonce))),

if we already pass the state_diff here, why do we need the address_to_nonce which is calculated from the state_diff?

Code quote:

HashMap::from_iter(state_diff.nonces.iter().map(|(address, nonce)| (*address, *nonce))),

crates/starknet_batcher/src/batcher.rs line 440 at r3 (raw file):

        for (address, class_hash) in state_diff.deployed_contracts.clone() {
            if self.storage_reader.address_is_assigned(address).map_err(|err| {
                error!("Failed to get class hash from storage: {}", err);

I think it's better to emphasize that the function failed, and not that we failed to get the class hash - since this scenario is possible when the address is not deployed.

Suggestion:

error!("Get class hash from storage failed: {}", err);

crates/starknet_batcher/src/batcher.rs line 454 at r3 (raw file):

        state_diff.storage_diffs.sort_keys();
        state_diff.declared_classes.sort_keys();
        state_diff.nonces.sort_keys();

what was @yair-starkware 's conclusion regarding the sorting requirement?

Code quote:

        deployed_contracts.sort_keys();
        replaced_classes.sort_keys();
        state_diff.deployed_contracts = deployed_contracts;
        state_diff.replaced_classes = replaced_classes;
        state_diff.storage_diffs.sort_keys();
        state_diff.declared_classes.sort_keys();
        state_diff.nonces.sort_keys();

crates/starknet_batcher/src/batcher.rs line 463 at r3 (raw file):

        state_diff: ThinStateDiff,
        address_to_nonce: HashMap<ContractAddress, Nonce>,
        tx_hashes: HashSet<TransactionHash>,

not that the HashSet does not keep the insertion ordering . better to use Vec.
I'm not sure it's a problem here since the HashSet is only used for the mempool.
But the trace could be a bit confusing.

maybe better to pass a Vec as input and then convert it only for the mempool call.

Code quote:

tx_hashes: HashSet<TransactionHash>,

crates/starknet_batcher/src/block_builder.rs line 70 at r3 (raw file):

pub struct BlockExecutionArtifacts {
    pub execution_infos: IndexMap<TransactionHash, TransactionExecutionInfo>,
    pub state_diff: ThinStateDiff,

Sweet!

Code quote:

pub state_diff: ThinStateDiff,

Copy link
Contributor

@Yael-Starkware Yael-Starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @alonh5, @asmaastarkware, @dafnamatsry, @Itay-Tsabary-Starkware, and @matan-starkware)


crates/starknet_batcher/src/batcher.rs line 463 at r3 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

not that the HashSet does not keep the insertion ordering . better to use Vec.
I'm not sure it's a problem here since the HashSet is only used for the mempool.
But the trace could be a bit confusing.

maybe better to pass a Vec as input and then convert it only for the mempool call.

*note

@alonh5 alonh5 force-pushed the 12-31-chore_starknet_batcher_move_test_utils_to_top_of_file branch from 96ac2a6 to c14be6e Compare January 1, 2025 15:48
Copy link
Collaborator Author

@alonh5 alonh5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @dafnamatsry, @Yael-Starkware, and @yair-starkware)


crates/starknet_batcher/src/batcher.rs line 422 at r3 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

why not run this logic in into_state_diff and get the final_state_diff to begin with?
is there any reason to store the intermediate presentation?

Because I need the storage for this, and I prefered not to pass a reader into the the block builder.


crates/starknet_batcher/src/batcher.rs line 426 at r3 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

if we already pass the state_diff here, why do we need the address_to_nonce which is calculated from the state_diff?

Done. Thanks.


crates/starknet_batcher/src/batcher.rs line 440 at r3 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

I think it's better to emphasize that the function failed, and not that we failed to get the class hash - since this scenario is possible when the address is not deployed.

Done.


crates/starknet_batcher/src/batcher.rs line 454 at r3 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

what was @yair-starkware 's conclusion regarding the sorting requirement?

It looks like they are supposed to be sorted but I still need to verify for sure.


crates/starknet_batcher/src/batcher.rs line 463 at r3 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

*note

I started doing this but then noticed we are collection them in decision reached from the execution info so this adds another iteration over all the tx hashes so I don't think it's important enough to do that.


crates/starknet_batcher/src/batcher_test.rs line 62 at r3 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

I think the commitment can just be a constant for testing purposes , no?

It's tested against the one returned from the batcher so it needs to be the one that s received from the test BlockExecutionArtifacts.


crates/starknet_batcher/src/batcher_test.rs line 574 at r3 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

I think a helper function indexmap_to_hashmap() could be useful here and in the batcher code.

I thought about it, but it's just one line I don't want to overcomplicate. I did make it nicer with collect() in both locations.


crates/starknet_batcher/src/batcher_test.rs line 612 at r3 (raw file):

Previously, Yael-Starkware (YaelD) wrote…

do we want to check that the other fields are sorted as well?

That means filling all the fields in the state diff, wdyt? maybe in a separate test just for that with a full state diff?

@alonh5 alonh5 force-pushed the 12-29-feat_starknet_batcher_fill_in_thinstatediff_fields branch from dae16c1 to c4d7b49 Compare January 1, 2025 15:48
Copy link
Contributor

@Yael-Starkware Yael-Starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @alonh5, @dafnamatsry, and @yair-starkware)


crates/starknet_batcher/src/batcher.rs line 422 at r3 (raw file):

Previously, alonh5 (Alon Haramati) wrote…

Because I need the storage for this, and I prefered not to pass a reader into the the block builder.

got it,
To make it clear, so that others won't use it the wrong way, lets change the name in BlockExecutionArtifacts to reflect that it is not a final state_diff , maybe something like unsorted_state_diff or raw_state_diff.


crates/starknet_batcher/src/batcher_test.rs line 612 at r3 (raw file):

Previously, alonh5 (Alon Haramati) wrote…

That means filling all the fields in the state diff, wdyt? maybe in a separate test just for that with a full state diff?

Ok, makes sense.

Copy link
Contributor

@yair-starkware yair-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 6 files at r3, 2 of 2 files at r4, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @alonh5, @dafnamatsry, and @Yael-Starkware)


crates/starknet_batcher/src/batcher_test.rs line 63 at r4 (raw file):

        ),
    }
}

Can this be a constant?

Code quote:

fn proposal_commitment() -> ProposalCommitment {
    ProposalCommitment {
        state_diff_commitment: calculate_state_diff_hash(
            &BlockExecutionArtifacts::create_for_testing().state_diff,
        ),
    }
}

crates/starknet_batcher/src/batcher.rs line 445 at r4 (raw file):

                deployed_contracts.insert(address, ClassHash(class_hash.0));
            }
        }

Consider extracting to a function

Code quote:

        for (address, class_hash) in state_diff.deployed_contracts.clone() {
            if self.storage_reader.address_is_assigned(address).map_err(|err| {
                error!("Get class hash from storage failed: {}", err);
                BatcherError::InternalError
            })? {
                replaced_classes.insert(address, ClassHash(class_hash.0));
            } else {
                deployed_contracts.insert(address, ClassHash(class_hash.0));
            }
        }

crates/starknet_batcher/src/batcher.rs line 446 at r4 (raw file):

            }
        }
        deployed_contracts.sort_keys();

Consider extracting all the sorting to a function


crates/starknet_batcher/src/batcher.rs line 450 at r4 (raw file):

        state_diff.deployed_contracts = deployed_contracts;
        state_diff.replaced_classes = replaced_classes;
        state_diff.storage_diffs.sort_keys();

Also the storage diffs of each contract need to be sorted

Code quote:

state_diff.storage_diffs.sort_keys();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants