perf(continuations): Improve initializations and reduce redundant computations (#429)

* [WIP] add account to linked list

* Add account linked list and some unit tests

* Modify tests for insertions and fix bugs

* Fix deletion test

* [WIP] Getting trie data without leaves

* [WIP] changes during the flights

* Fix storage ll error

* Add missing file

* [WIP] checking consistency between linked list and state trie

* [WIP] addr 0x798c6047767c10f653ca157a7f66a592a1d6ca550cae352912be0b0745336afd not found in linked list

* [WIP] add missing nibbles to key

* Fix accounts insertions

* Check storage reads

* Add segments to preinitialization

* [WIP] Unreasonable offset

* [WIP] Erc721 error

* [WIP] uncomment ll reads

* [WIP] Fixing iterator

* Fix unit tests

* Fix erc721 error

* mend

* Remove debug info

* mend

* [WIP] hashing

* Provide constants non-deterministically

* Update journaling

* Update search_account and read_accounts_linked_lists

* Constraint state and storage leaves

* Update slot and account search methods in linked_lists and update revert_storage_change

* Fix asm

* Some fixes and debugging

* Fix hash mismatch

* Add functions for updating final trie

* [WIP] eliminating trie call

* [WIP] Fix kexit_data issue

* Merge with journaling branch

* [WIP] set final tries

* Mutate storage payload on insertions

* Fix deletions in final trie computation

* Remove double hashing of the initial state trie

* Misc

* trace_decoder/Cargo.toml

* Fix return from read_storage_linked_list_w_addr

* [WIP] debugging unit tests

* Fix set_payload_storage_extension and bring back final gas check

* Fix failing shanghai blocks

* Fix run_next_addresses_remove and some reverts

* Some journaling fixes

* Fix revert_storage_change and revert_account_created

* Fix delete account

* [WIP] fixing LoopCallsDepthThenRevert3_d0g0v0_Shanghai.json

* Add an overwrite version of account insertion and use it for mpt_insert_state_trie

* Fix final state trie hash mismatch

* Fix insert_new_slot_with_value

* Debugging variedContext_d10g0v0_Shanghai.json

* Fix variedContext_d10g0v0_Shanghai

* [WIP] Fixing vitalikTransactionTestParis_d0g0v0_Shanghai

* Delete all associated slots when an account is deleted

* Fix run_next_remove_address_slots

* Copy initial accounts and slots

* Fix find condition in run_next_remove_address_slots

* Deep copy of accounts and slots

* [WIP] Testing evm test suite

* Fix most unit tests and fix revert_account_created

* Fix linked list test and a bit of cleanup

* [WIP] Debugging /stExample/basefeeExample_d0g0v0_Shanghai

* Fix merge and add batch sizes to the benchmark

* Fix test_only in zero_bin

* Fix rlp pointers error

* Remove panic

* [WIP] Debugging InitCollisionParis_d2g0v0_Shanghai

* [WIP] Debugging stCallCodes/callcallcodecallcode_011_SuicideEnd_d0g0v0_Shanghai

* Fix remove_all_slots

* Clean code

* Fix account code initialize mpts

* Apply comments and fix CI.

* Improve remove_all_slots_loop

* Fix stack comment

* Misc, faster get_valid_slot_ptr

* [WIP] Debugging erc20

* Remove counter update and cold_access

* Fix log_opcode circuit sizes

* Fix stack comment and remove ctr update

* Fix preinitialization

* Fix unit tests

* Debugging stShift/shr01_d0g0v0_Shanghai

* Fixing shiftSignedCombinations_d0g0v0_Shanghai

* Remove counter assertions from linked_lists tests

* Fix preinitialization and start cleanup

* Fix test_process_receipt

* Fix MPT insert tests

* Add check in mpt_insert tests

* Additional fixes

* Pass preinitialized_segments when necessary

* Fix all unit tests

* Cleanup

* More comments cleanup

* Do not store memory content as vec

* Prevent needless conversion

* Remove needless copy

* Remove needless checks

* Use tuple_windows instead

* Typo

* Remove leftover

* Remove unrelated changes

* Fix clone_slot

* Remove assert 0 from clone_slot

* Fix next_node_ok_with_value

* Apply comments and cleanup

* Start

* Do not recompute LLs

* Remove unused preinit

* Remove useless function

* Rename for consistency

* Cleanup

* More cleanup

* Minor

* Lighten up initialization

* Cleaner

* Remove is_dummy

* Remove dummy run

* Fix next_node_ok_with_value and tiny cleanup

* Simplify iterator

* Tweak

* Start addressing comments

* Tweak

* Mighty clippy

* Revert num_procs

* Remove unnecessary linked_lists methods and payload_ptr

* Update .env file

* Address comment, bring back txn hashes

* Apply suggestion

* Update sizes

* Add explicit panic message

---------

Co-authored-by: 4l0n50 <[email protected]>
Co-authored-by: Hamy Ratoanina <[email protected]>
Co-authored-by: Alonso Gonzalez <[email protected]>
Co-authored-by: Linda Guiga <[email protected]>
5 people authored Aug 1, 2024
1 parent 5edb969 commit b0cda17
Showing 14 changed files with 243 additions and 366 deletions.
18 changes: 9 additions & 9 deletions .env
@@ -1,10 +1,10 @@
 AMQP_URI=amqp://localhost:5672
-ARITHMETIC_CIRCUIT_SIZE=16..23
-BYTE_PACKING_CIRCUIT_SIZE=9..21
-CPU_CIRCUIT_SIZE=12..25
-KECCAK_CIRCUIT_SIZE=14..20
-KECCAK_SPONGE_CIRCUIT_SIZE=9..15
-LOGIC_CIRCUIT_SIZE=12..18
-MEMORY_CIRCUIT_SIZE=17..28
-MEMORY_BEFORE_CIRCUIT_SIZE=7..23
-MEMORY_AFTER_CIRCUIT_SIZE=7..27
+ARITHMETIC_CIRCUIT_SIZE=16..21
+BYTE_PACKING_CIRCUIT_SIZE=8..21
+CPU_CIRCUIT_SIZE=8..21
+KECCAK_CIRCUIT_SIZE=4..20
+KECCAK_SPONGE_CIRCUIT_SIZE=8..17
+LOGIC_CIRCUIT_SIZE=4..21
+MEMORY_CIRCUIT_SIZE=17..24
+MEMORY_BEFORE_CIRCUIT_SIZE=16..23
+MEMORY_AFTER_CIRCUIT_SIZE=7..23
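The `.env` values above are log2 degree bounds for each STARK table, written as Rust-style `START..END` ranges. As a minimal sketch of how such a string could be turned into a `Range<usize>` (the helper name and parsing details are assumptions for illustration, not zero-bin's actual code):

```rust
use std::ops::Range;

/// Parses a circuit-size setting such as "16..21" into a `Range<usize>`
/// of log2 degree bounds. Returns `None` on malformed input.
/// Hypothetical helper; the real env-parsing code may differ.
fn parse_circuit_size(s: &str) -> Option<Range<usize>> {
    // Split "START..END" on the ".." separator, then parse both halves.
    let (start, end) = s.split_once("..")?;
    Some(start.trim().parse().ok()?..end.trim().parse().ok()?)
}

fn main() {
    let r = parse_circuit_size("16..21").expect("valid range");
    assert_eq!(r, 16..21);
    println!("{}..{}", r.start, r.end);
}
```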
28 changes: 4 additions & 24 deletions evm_arithmetization/src/cpu/kernel/interpreter.rs
@@ -67,8 +67,6 @@ pub(crate) struct Interpreter<F: Field> {
pub(crate) clock: usize,
/// Log of the maximal number of CPU cycles in one segment execution.
max_cpu_len_log: Option<usize>,
/// Indicates whethere this is a dummy run.
is_dummy: bool,
}

/// Simulates the CPU execution from `state` until the program counter reaches
@@ -168,22 +166,6 @@ impl<F: Field> Interpreter<F> {
result
}

/// Returns an instance of `Interpreter` given `GenerationInputs`, and
/// assuming we are initializing with the `KERNEL` code.
pub(crate) fn new_dummy_with_generation_inputs(
initial_offset: usize,
initial_stack: Vec<U256>,
inputs: &GenerationInputs,
) -> Self {
debug_inputs(inputs);

let max_cpu_len = Some(NUM_EXTRA_CYCLES_BEFORE + NUM_EXTRA_CYCLES_AFTER);
let mut result =
Self::new_with_generation_inputs(initial_offset, initial_stack, inputs, max_cpu_len);
result.is_dummy = true;
result
}

pub(crate) fn new(
initial_offset: usize,
initial_stack: Vec<U256>,
@@ -201,7 +183,6 @@ impl<F: Field> Interpreter<F> {
is_jumpdest_analysis: false,
clock: 0,
max_cpu_len_log,
is_dummy: false,
};
interpreter.generation_state.registers.program_counter = initial_offset;
let initial_stack_len = initial_stack.len();
@@ -233,7 +214,6 @@ impl<F: Field> Interpreter<F> {
is_jumpdest_analysis: true,
clock: 0,
max_cpu_len_log,
is_dummy: false,
}
}

@@ -641,10 +621,10 @@ impl<F: Field> State<F> for Interpreter<F> {
memory_state.contexts[ctx_idx] = ctx.clone();
}
}
if self.generation_state.set_preinit {
memory_state.preinitialized_segments =
self.generation_state.memory.preinitialized_segments.clone();
}

memory_state.preinitialized_segments =
self.generation_state.memory.preinitialized_segments.clone();

Some(memory_state)
}

22 changes: 8 additions & 14 deletions evm_arithmetization/src/fixed_recursive_verifier.rs
@@ -41,18 +41,16 @@ use starky::stark::Stark;

use crate::all_stark::{all_cross_table_lookups, AllStark, Table, NUM_TABLES};
use crate::cpu::kernel::aggregator::KERNEL;
use crate::generation::GenerationInputs;
use crate::generation::{GenerationInputs, TrimmedGenerationInputs};
use crate::get_challenges::observe_public_values_target;
use crate::memory::segments::Segment;
use crate::proof::{
AllProof, BlockHashesTarget, BlockMetadataTarget, ExtraBlockData, ExtraBlockDataTarget,
FinalPublicValues, MemCapTarget, PublicValues, PublicValuesTarget, RegistersDataTarget,
TrieRoots, TrieRootsTarget, TARGET_HASH_SIZE,
};
use crate::prover::{
check_abort_signal, generate_all_data_segments, prove, GenerationSegmentData,
SegmentDataIterator,
};
use crate::prover::testing::prove_all_segments;
use crate::prover::{check_abort_signal, prove, GenerationSegmentData, SegmentDataIterator};
use crate::recursive_verifier::{
add_common_recursion_gates, add_virtual_public_values, get_memory_extra_looking_sum_circuit,
recursive_stark_circuit, set_public_value_targets, PlonkWrapperCircuit, PublicInputs,
@@ -1485,7 +1483,7 @@ where
&self,
all_stark: &AllStark<F, D>,
config: &StarkConfig,
generation_inputs: GenerationInputs,
generation_inputs: TrimmedGenerationInputs,
segment_data: &mut GenerationSegmentData,
timing: &mut TimingTree,
abort_signal: Option<Arc<AtomicBool>>,
@@ -1563,20 +1561,16 @@
timing: &mut TimingTree,
abort_signal: Option<Arc<AtomicBool>>,
) -> anyhow::Result<Vec<ProverOutputData<F, C, D>>> {
println!("Entering prove all segments");
let mut it_segment_data = SegmentDataIterator {
inputs: &generation_inputs,
partial_next_data: None,
max_cpu_len_log: Some(max_cpu_len_log),
};
let mut segment_iterator =
SegmentDataIterator::<F>::new(&generation_inputs, Some(max_cpu_len_log));

let mut proofs = vec![];

for mut next_data in it_segment_data {
for mut next_data in segment_iterator {
let proof = self.prove_segment(
all_stark,
config,
generation_inputs.clone(),
generation_inputs.trim(),
&mut next_data.1,
timing,
abort_signal.clone(),
85 changes: 43 additions & 42 deletions evm_arithmetization/src/generation/mod.rs
@@ -2,6 +2,7 @@ use std::collections::HashMap;

use anyhow::anyhow;
use ethereum_types::{Address, BigEndianHash, H256, U256};
use keccak_hash::keccak;
use log::log_enabled;
use mpt_trie::partial_trie::{HashedPartialTrie, PartialTrie};
use plonky2::field::extension::Extendable;
@@ -94,40 +95,43 @@ pub struct GenerationInputs {
/// A lighter version of [`GenerationInputs`], which have been trimmed
/// post pre-initialization processing.
#[derive(Clone, Debug, Deserialize, Serialize, Default)]
pub(crate) struct TrimmedGenerationInputs {
pub(crate) trimmed_tries: TrimmedTrieInputs,
/// The index of the transaction being proven within its block.
pub(crate) txn_number_before: U256,
pub struct TrimmedGenerationInputs {
pub trimmed_tries: TrimmedTrieInputs,
/// The index of the first transaction in this payload being proven within
/// its block.
pub txn_number_before: U256,
/// The cumulative gas used through the execution of all transactions prior
/// the current one.
pub(crate) gas_used_before: U256,
/// The cumulative gas used after the execution of the current transaction.
/// The exact gas used by the current transaction is `gas_used_after` -
/// `gas_used_before`.
pub(crate) gas_used_after: U256,
/// the current ones.
pub gas_used_before: U256,
/// The cumulative gas used after the execution of the current batch of
/// transactions. The exact gas used by the current batch of transactions
/// is `gas_used_after` - `gas_used_before`.
pub gas_used_after: U256,

/// Indicates whether there is an actual transaction or a dummy payload.
pub(crate) txns_len: usize,
/// The list of txn hashes contained in this batch.
pub txn_hashes: Vec<H256>,

/// Expected trie roots after the transactions are executed.
pub(crate) trie_roots_after: TrieRoots,
/// Expected trie roots before these transactions are executed.
pub trie_roots_before: TrieRoots,
/// Expected trie roots after these transactions are executed.
pub trie_roots_after: TrieRoots,

/// State trie root of the checkpoint block.
/// This could always be the genesis block of the chain, but it allows a
/// prover to continue proving blocks from certain checkpoint heights
/// without requiring proofs for blocks past this checkpoint.
pub(crate) checkpoint_state_trie_root: H256,
pub checkpoint_state_trie_root: H256,

/// Mapping between smart contract code hashes and the contract byte code.
/// All account smart contracts that are invoked will have an entry present.
pub(crate) contract_code: HashMap<H256, Vec<u8>>,
pub contract_code: HashMap<H256, Vec<u8>>,

/// Information contained in the block header.
pub(crate) block_metadata: BlockMetadata,
pub block_metadata: BlockMetadata,

/// The hash of the current block, and a list of the 256 previous block
/// hashes.
pub(crate) block_hashes: BlockHashes,
pub block_hashes: BlockHashes,
}

#[derive(Clone, Debug, Deserialize, Serialize, Default)]
@@ -178,12 +182,23 @@ impl GenerationInputs {
/// the fields that have already been processed during pre-initialization,
/// namely: the input tries, the signed transaction, and the withdrawals.
pub(crate) fn trim(&self) -> TrimmedGenerationInputs {
let txn_hashes = self
.signed_txns
.iter()
.map(|tx_bytes| keccak(&tx_bytes[..]))
.collect();

TrimmedGenerationInputs {
trimmed_tries: self.tries.trim(),
txn_number_before: self.txn_number_before,
gas_used_before: self.gas_used_before,
gas_used_after: self.gas_used_after,
txns_len: self.signed_txns.len(),
txn_hashes,
trie_roots_before: TrieRoots {
state_root: self.tries.state_trie.hash(),
transactions_root: self.tries.transactions_trie.hash(),
receipts_root: self.tries.receipts_trie.hash(),
},
trie_roots_after: self.trie_roots_after.clone(),
checkpoint_state_trie_root: self.checkpoint_state_trie_root,
contract_code: self.contract_code.clone(),
@@ -195,12 +210,11 @@

fn apply_metadata_and_tries_memops<F: RichField + Extendable<D>, const D: usize>(
state: &mut GenerationState<F>,
inputs: &GenerationInputs,
inputs: &TrimmedGenerationInputs,
registers_before: &RegistersData,
registers_after: &RegistersData,
) {
let metadata = &inputs.block_metadata;
let tries = &inputs.tries;
let trie_roots_after = &inputs.trie_roots_after;
let fields = [
(
@@ -227,19 +241,19 @@ fn apply_metadata_and_tries_memops<F: RichField + Extendable<D>, const D: usize>
(GlobalMetadata::TxnNumberBefore, inputs.txn_number_before),
(
GlobalMetadata::TxnNumberAfter,
inputs.txn_number_before + inputs.signed_txns.len(),
inputs.txn_number_before + inputs.txn_hashes.len(),
),
(
GlobalMetadata::StateTrieRootDigestBefore,
h2u(tries.state_trie.hash()),
h2u(inputs.trie_roots_before.state_root),
),
(
GlobalMetadata::TransactionTrieRootDigestBefore,
h2u(tries.transactions_trie.hash()),
h2u(inputs.trie_roots_before.transactions_root),
),
(
GlobalMetadata::ReceiptTrieRootDigestBefore,
h2u(tries.receipts_trie.hash()),
h2u(inputs.trie_roots_before.receipts_root),
),
(
GlobalMetadata::StateTrieRootDigestAfter,
@@ -390,18 +404,14 @@ fn get_all_memory_address_and_values(memory_before: &MemoryState) -> Vec<(Memory
type TablesWithPVsAndFinalMem<F> = ([Vec<PolynomialValues<F>>; NUM_TABLES], PublicValues);
pub fn generate_traces<F: RichField + Extendable<D>, const D: usize>(
all_stark: &AllStark<F, D>,
inputs: &GenerationInputs,
inputs: &TrimmedGenerationInputs,
config: &StarkConfig,
segment_data: &mut GenerationSegmentData,
timing: &mut TimingTree,
) -> anyhow::Result<TablesWithPVsAndFinalMem<F>> {
debug_inputs(inputs);

let mut state = GenerationState::<F>::new(inputs, &KERNEL.code)
let mut state = GenerationState::<F>::new_with_segment_data(inputs, segment_data)
.map_err(|err| anyhow!("Failed to parse all the initial prover inputs: {:?}", err))?;

state.set_segment_data(segment_data);

initialize_kernel_code_and_shift_table(&mut segment_data.memory);

// Retrieve initial memory addresses and values.
@@ -410,20 +420,12 @@ pub fn generate_traces<F: RichField + Extendable<D>, const D: usize>(
// Initialize the state with the one at the end of the
// previous segment execution, if any.
let GenerationSegmentData {
is_dummy,
set_preinit,
segment_index,
max_cpu_len_log,
memory,
registers_before,
registers_after,
extra_data,
..
} = segment_data;

if segment_data.set_preinit {
state.memory.preinitialized_segments = segment_data.memory.preinitialized_segments.clone();
}

for &(address, val) in &actual_mem_before {
state.memory.set(address, val);
}
@@ -435,7 +437,7 @@ pub fn generate_traces<F: RichField + Extendable<D>, const D: usize>(
let cpu_res = timed!(
timing,
"simulate CPU",
simulate_cpu(&mut state, *max_cpu_len_log, *is_dummy)
simulate_cpu(&mut state, *max_cpu_len_log)
);
if cpu_res.is_err() {
output_debug_tries(&state)?;
@@ -500,7 +502,6 @@ pub fn generate_traces<F: RichField + Extendable<D>, const D: usize>(
fn simulate_cpu<F: Field>(
state: &mut GenerationState<F>,
max_cpu_len_log: Option<usize>,
is_dummy: bool,
) -> anyhow::Result<(RegistersState, Option<MemoryState>)> {
let (final_registers, mem_after) = state.run_cpu(max_cpu_len_log)?;

2 changes: 1 addition & 1 deletion evm_arithmetization/src/generation/prover_input.rs
@@ -74,7 +74,7 @@ impl<F: Field> GenerationState<F> {
fn run_end_of_txns(&mut self) -> Result<U256, ProgramError> {
// Reset the jumpdest table before the next transaction.
self.jumpdest_table = None;
let end = self.next_txn_index == self.inputs.txns_len;
let end = self.next_txn_index == self.inputs.txn_hashes.len();
if end {
Ok(U256::one())
} else {
20 changes: 18 additions & 2 deletions evm_arithmetization/src/generation/state.rs
@@ -28,8 +28,8 @@ use crate::prover::GenerationSegmentData;
use crate::util::u256_to_usize;
use crate::witness::errors::ProgramError;
use crate::witness::memory::MemoryChannel::GeneralPurpose;
use crate::witness::memory::MemoryOpKind;
use crate::witness::memory::{MemoryAddress, MemoryOp, MemoryState};
use crate::witness::memory::{MemoryContextState, MemoryOpKind};
use crate::witness::operation::{generate_exception, Operation};
use crate::witness::state::RegistersState;
use crate::witness::traces::{TraceCheckpoint, Traces};
@@ -331,7 +331,7 @@ pub(crate) trait State<F: Field> {
}
}

#[derive(Debug)]
#[derive(Debug, Default)]
pub struct GenerationState<F: Field> {
pub(crate) inputs: TrimmedGenerationInputs,
pub(crate) registers: RegistersState,
@@ -435,6 +435,22 @@ impl<F: Field> GenerationState<F> {
Ok(state)
}

pub(crate) fn new_with_segment_data(
trimmed_inputs: &TrimmedGenerationInputs,
segment_data: &GenerationSegmentData,
) -> Result<Self, ProgramError> {
let mut state = Self {
inputs: trimmed_inputs.clone(),
..Default::default()
};

state.memory.preinitialized_segments = segment_data.memory.preinitialized_segments.clone();

state.set_segment_data(segment_data);

Ok(state)
}

/// Updates `program_counter`, and potentially adds some extra handling if
/// we're jumping to a special location.
pub(crate) fn jump_to(&mut self, dst: usize) -> Result<(), ProgramError> {
