From 2458cc105a9fb99ec7e33df68e7d748d5ec68bb4 Mon Sep 17 00:00:00 2001 From: Eagle941 <8973725+Eagle941@users.noreply.github.com> Date: Wed, 18 Sep 2024 21:05:46 +0100 Subject: [PATCH] chore: added benchmarks documentation --- docs/BENCHMARK.md | 93 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 93 insertions(+) create mode 100644 docs/BENCHMARK.md diff --git a/docs/BENCHMARK.md b/docs/BENCHMARK.md new file mode 100644 index 0000000000..7fd20445f0 --- /dev/null +++ b/docs/BENCHMARK.md @@ -0,0 +1,93 @@ +# Addition of `VisitedPcs` trait + +Current blockifier doesn't store the complete vector of visited program counters +for each entry point call in an invoke transaction. Instead, visited program +counters are pushed in a `HashSet`. This is a limiting factor to perform +profiling operations which require record of the full trace returned by +`cairo-vm`. More flexibility is added to the blockifier with the introduction of +trait `VisitedPcs` which allows the user to process visited program counters in +the most suitable way for the task. + +## Existing code + +Visited program counters are kept in the `CachedState` structure as shown below: + +```rust +#[derive(Debug)] +pub struct CachedState { + pub state: S, + // Invariant: read/write access is managed by CachedState. + // Using interior mutability to update caches during `State`'s immutable getters. + pub(crate) cache: RefCell, + pub(crate) class_hash_to_class: RefCell, + /// A map from class hash to the set of PC values that were visited in the class. + pub visited_pcs: HashMap>, +} +``` + +## New code + +`VisitedPcs` is an additional generic parameter of `CachedState`. + +```rust +#[derive(Debug)] +pub struct CachedState { + pub state: S, + // Invariant: read/write access is managed by CachedState. + // Using interior mutability to update caches during `State`'s immutable getters. + pub(crate) cache: RefCell, + pub(crate) class_hash_to_class: RefCell, + /// A map from class hash to the set of PC values that were visited in the class. + pub visited_pcs: V, +} +``` + +An implementation of the trait `VisitedPcs` is included in the blockifier with +the name `VisitedPcsSet` and it mimics the existing `HashSet`. Also, for +test purposes, `CachedState` is instantiated using `VisitedPcsSet`. + +## Performance considerations + +Given the importance of the blockifier in the Starknet ecosystem, we want to +measure the performance impact of adding the trait `VisitedPcs`. The existing +bechmark `transfers` doesn't cover operations with `CachedState` therefore we +need to design new ones. We have created two new benchmarks: + +- `cached_state`: this benchmark tests the performance impact of populating + `visited_pcs` (implemented using `VisitedPcsSet`) with a realistic amount of + visited program counters. The size of the sets is taken from transaction + `0x0177C9365875CAA840EA8F03F97B0E3A8EE8851A8B952BF157B5DBD4FECCB060` in the + mainnet. This transaction has been chosen randomly, but there is no assurance + that it's representative of the most common Starknet invoke transaction. This + benchmark tests the write performance of visited program counters in the state + struct. +- `execution`: this benchmark simulates a whole invoke transaction using a dummy + contract. + +## Performance impact + +A script `bench.sh` has been added to benchmark the performance impact of these +changes: it is called as +`bash scripts/bench.sh 14e6a87722c1d0c757b1aa2756ffabe3f248fd7d e39ae0be4cec31938399199e0a1070279b4a78ed`. +The computer running the benchmark is: Debian VM over Windows 10 with VMWare +Workstation 17, i9-9900K, 64GB RAM, Samsung 990 Pro NVME SSD. + +The Rust toolchain used is: +``` +1.78-x86_64-unknown-linux-gnu (default) +rustc 1.78.0 (9b00956e5 2024-04-29) +``` + +Noise threshold and confidence intervals are kept as per default Criterion.rs +configuration. + +The results are shown in the following table: + +| Benchmark | Time (ms) | Time change (%) | Criterion.rs report | +| ------------ | --------- | --------------- | ----------------------------- | +| transfers | 94.448 | +0.1080 | No change in performance | +| execution | 1.2882 | -1.7216 | Change within noise threshold | +| cached_state | 5.2330 | -0.8703 | No change in performance | + +The analysis of Criterion.rs determines that there isn't statistically +significant performance decrese.