[move] Benchmarking historical transactions #15329

georgemitenkov · 2024-11-20T11:16:52Z

Description

This PR introduces a tool to benchmark past transactions (and to do so correctly, unlike the existing aptos-debugger move execute-past-transactions ...).

Summary

The Aptos debugger tool uses an RPC under the hood when running transactions, so state view access latencies can be huge. It also executes transactions only in blocks, without sharing caches across blocks like the real executor does.

This PR introduces a better tool to benchmark execution of past transactions. The user provides the first and last versions of the interval to benchmark. The tool partitions the transactions in this closed interval into blocks and runs all blocks end-to-end, measuring the overall time. During the run, the executor is shared across blocks (and so are the environment and module caches).
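To illustrate what partitioning a closed interval means here, below is a minimal sketch. It is not the tool's code: `partition_by_sizes` and `num_txns` are hypothetical helpers, and the sketch assumes the block sizes are already known, whereas the PR does not say how the tool picks block boundaries. The versions and sizes are taken from the example run further down.

```rust
/// Splits transactions starting at version `begin` into consecutive closed
/// version ranges, one per entry of `sizes`. Hypothetical helper: how the
/// real tool chooses block boundaries is not described in this PR.
fn partition_by_sizes(begin: u64, sizes: &[u64]) -> Vec<(u64, u64)> {
    let mut blocks = Vec::with_capacity(sizes.len());
    let mut first = begin;
    for &size in sizes {
        assert!(size > 0, "blocks must be non-empty");
        // Closed interval: both the first and the last version are included.
        let last = first + size - 1;
        blocks.push((first, last));
        first = last + 1;
    }
    blocks
}

/// Number of transactions in the closed interval [begin, end].
fn num_txns(begin: u64, end: u64) -> u64 {
    end - begin + 1
}

fn main() {
    // Sizes of the first three blocks from the example run.
    let blocks = partition_by_sizes(1944524532, &[10, 5, 6]);
    for (i, (first, last)) in blocks.iter().enumerate() {
        println!(
            "Block {}: versions [{}, {}] with {} transactions",
            i + 1,
            first,
            last,
            num_txns(*first, *last)
        );
    }
    println!("Total in interval: {}", num_txns(1944524532, 1944524714));
}
```

Because both ends are included, the interval [1944524532, 1944524714] holds 183 transactions, which matches the "Got 183/183 txns" line in the example output.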

There is no commit here; only execution is timed. For each block, we maintain the state on top of which it should run (its read-set, estimated from one run before the benchmarks). The benchmark therefore just runs the blocks in sequence, each on top of its initial state (outputs are only used for comparison against the on-chain data).
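The shape of such a benchmark pass can be sketched as follows. This is an illustration under stated assumptions, not the implementation in this PR: `Block`, `run_once`, and `toy_execute` are made-up names, and the "executor" is a toy function that sums numbers. The point is only that each block carries its own pre-computed initial state, blocks run in sequence, the whole pass is timed, and outputs are returned for comparison rather than committed.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a block together with the state it runs on.
struct Block {
    initial_state: Vec<u64>,
    txns: Vec<u64>,
}

/// Runs every block in sequence on top of its own initial state and returns
/// the wall-clock time of the whole pass plus the per-block outputs.
/// Nothing is committed; outputs are only handed back to the caller.
fn run_once<F>(blocks: &[Block], mut execute: F) -> (Duration, Vec<u64>)
where
    F: FnMut(&Block) -> u64,
{
    let start = Instant::now();
    let outputs: Vec<u64> = blocks.iter().map(|b| execute(b)).collect();
    (start.elapsed(), outputs)
}

/// Toy "executor" for the sketch: sums the state and the transactions.
fn toy_execute(block: &Block) -> u64 {
    block.initial_state.iter().sum::<u64>() + block.txns.iter().sum::<u64>()
}

fn main() {
    let blocks = vec![
        Block { initial_state: vec![1, 2], txns: vec![10, 20] },
        Block { initial_state: vec![3], txns: vec![30] },
    ];
    let num_repeats = 3;
    for i in 1..=num_repeats {
        let (elapsed, _outputs) = run_once(&blocks, toy_execute);
        println!("[{}/{}] Execution time is {}ms", i, num_repeats, elapsed.as_millis());
    }
}
```

Passing the executor in as a closure keeps one executor instance alive across all blocks of a pass, which mirrors the idea of sharing caches between blocks.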

The tool allows one to override configs to experiment with new features, or to see what execution would look like without some features. For now, we support:

enabling features
disabling features

In the future, we can add more overrides: gas schedule, modules, etc.
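As a rough sketch of what a feature override amounts to, the snippet below applies enable and disable lists to a set of on-chain flags. Everything here is an assumption for illustration: `apply_overrides` is a hypothetical function, flags are modelled as plain strings, `FEATURE_A` is a made-up flag name, and rejecting a flag that appears in both lists is a choice made for this sketch, not documented tool behaviour.

```rust
use std::collections::BTreeSet;

/// Returns the on-chain feature set with overrides applied.
/// Hypothetical helper: the real tool works on on-chain configs, not strings.
fn apply_overrides(
    on_chain: &BTreeSet<String>,
    enable: &[&str],
    disable: &[&str],
) -> Result<BTreeSet<String>, String> {
    // Sketch-level design choice: a flag cannot be both enabled and disabled.
    if let Some(flag) = enable.iter().find(|f| disable.contains(*f)) {
        return Err(format!("{} is both enabled and disabled", flag));
    }
    let mut features = on_chain.clone();
    for flag in enable {
        features.insert(flag.to_string());
    }
    for flag in disable {
        features.remove(*flag);
    }
    Ok(features)
}

fn main() {
    // FEATURE_A is a made-up flag standing in for whatever is on chain.
    let on_chain: BTreeSet<String> = vec!["FEATURE_A".to_string()].into_iter().collect();
    let overridden = apply_overrides(&on_chain, &["ENABLE_LOADER_V2"], &[]).unwrap();
    println!("{:?}", overridden);
}
```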

It also computes the diffs between the expected (on-chain) outputs and the new outputs produced with overrides, e.g.:

Transaction 1944524533 diff:
  >>>>>
[gas used] before: 525, after: 517
[event] 0000000000000000000000000000000000000000000000000000000000000001::transaction_fee::FeeStatement has changed its data
[write] StateKey::AccessPath { address: 0xc05e013f9f81d699e5a991bd134ab8a9af4ec609714f1fd8bfce4fc38419c890, path: "Resource(0x1::coin::CoinStore<0x1::aptos_coin::AptosCoin>)" } has changed its value
[write] StateKey::TableItem { handle: 1b854694ae746cdbd8d44186ca4929b2b337df21d1c74633be19b2710552fdca, key: 0619dc29a0aac8fa146714058e8dd6d2d0f3bdf5f6331907bf91f3acd81e6935 } has changed its value
[total gas used] before: 525, after: 517
 <<<<<
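A comparison of this kind can be sketched roughly as below. This is not the diffing code from the PR: `Output` and `diff` are hypothetical, an output is reduced to gas plus a map of writes, and only the "[gas used]" and "has changed its value" messages are copied from the sample above. The wording for missing and added writes is made up for the sketch.

```rust
use std::collections::BTreeMap;

/// Hypothetical, heavily simplified transaction output.
struct Output {
    gas_used: u64,
    writes: BTreeMap<String, Vec<u8>>,
}

/// Lists the differences between the expected and the overridden output.
fn diff(before: &Output, after: &Output) -> Vec<String> {
    let mut lines = Vec::new();
    if before.gas_used != after.gas_used {
        lines.push(format!(
            "[gas used] before: {}, after: {}",
            before.gas_used, after.gas_used
        ));
    }
    for (key, old) in &before.writes {
        match after.writes.get(key) {
            Some(new) if new != old => {
                lines.push(format!("[write] {} has changed its value", key))
            }
            Some(_) => {}
            // Message wording below is invented for this sketch.
            None => lines.push(format!("[write] {} is missing", key)),
        }
    }
    for key in after.writes.keys() {
        if !before.writes.contains_key(key) {
            // Message wording below is invented for this sketch.
            lines.push(format!("[write] {} was added", key));
        }
    }
    lines
}

fn main() {
    let mut w1 = BTreeMap::new();
    w1.insert("k".to_string(), vec![1u8]);
    let mut w2 = BTreeMap::new();
    w2.insert("k".to_string(), vec![2u8]);
    let before = Output { gas_used: 525, writes: w1 };
    let after = Output { gas_used: 517, writes: w2 };
    for line in diff(&before, &after) {
        println!("{}", line);
    }
}
```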

Example

Say we have a new feature, e.g., ENABLE_LOADER_V2. We can benchmark how historical transactions perform with this flag on and off.

The flag is off by default:

target/release/aptos-replay-benchmark --begin-version 1944524532 \
  --end-version 1944524714 --rest-endpoint https://mainnet.aptoslabs.com/v1 \
  --num-repeats 10 --concurrency-levels 8

Got 100/183 txns from RestApi.
Got 183/183 txns from RestApi.
Generating blocks for benchmarking ...
Checking generated blocks ...
Analyzing 24 generated blocks ...
Block 1: versions [1944524532, 1944524541] with 10 transactions
Block 2: versions [1944524542, 1944524546] with 5 transactions
Block 3: versions [1944524547, 1944524552] with 6 transactions
Block 4: versions [1944524553, 1944524560] with 8 transactions
Block 5: versions [1944524561, 1944524565] with 5 transactions
Block 6: versions [1944524566, 1944524572] with 7 transactions
Block 7: versions [1944524573, 1944524577] with 5 transactions
Block 8: versions [1944524578, 1944524587] with 10 transactions
Block 9: versions [1944524588, 1944524597] with 10 transactions
Block 10: versions [1944524598, 1944524602] with 5 transactions
Block 11: versions [1944524603, 1944524609] with 7 transactions
Block 12: versions [1944524610, 1944524616] with 7 transactions
Block 13: versions [1944524617, 1944524624] with 8 transactions
Block 14: versions [1944524625, 1944524632] with 8 transactions
Block 15: versions [1944524633, 1944524639] with 7 transactions
Block 16: versions [1944524640, 1944524647] with 8 transactions
Block 17: versions [1944524648, 1944524656] with 9 transactions
Block 18: versions [1944524657, 1944524663] with 7 transactions
Block 19: versions [1944524664, 1944524668] with 5 transactions
Block 20: versions [1944524669, 1944524680] with 12 transactions
Block 21: versions [1944524681, 1944524687] with 7 transactions
Block 22: versions [1944524688, 1944524695] with 8 transactions
Block 23: versions [1944524696, 1944524705] with 10 transactions
Block 24: versions [1944524706, 1944524714] with 9 transactions
Benchmarking ...

Concurrency level: 8
[1/10] Execution time is 1717ms
[2/10] Execution time is 1540ms
[3/10] Execution time is 1537ms
[4/10] Execution time is 1606ms
[5/10] Execution time is 1629ms
[6/10] Execution time is 1668ms
[7/10] Execution time is 1619ms
[8/10] Execution time is 1479ms
[9/10] Execution time is 1690ms
[10/10] Execution time is 1515ms
Median execution time is 1619ms

With the flag on:

target/release/aptos-replay-benchmark --begin-version 1944524532 \
  --end-version 1944524714 --rest-endpoint https://mainnet.aptoslabs.com/v1 \
  --num-repeats 10 --concurrency-levels 8 --enable-features ENABLE_LOADER_V2

Got 100/183 txns from RestApi.
Got 183/183 txns from RestApi.
Generating blocks for benchmarking ...
Checking generated blocks ...
Analyzing 24 generated blocks ...
Block 1: versions [1944524532, 1944524541] with 10 transactions
Block 2: versions [1944524542, 1944524546] with 5 transactions
Block 3: versions [1944524547, 1944524552] with 6 transactions
Block 4: versions [1944524553, 1944524560] with 8 transactions
Block 5: versions [1944524561, 1944524565] with 5 transactions
Block 6: versions [1944524566, 1944524572] with 7 transactions
Block 7: versions [1944524573, 1944524577] with 5 transactions
Block 8: versions [1944524578, 1944524587] with 10 transactions
Block 9: versions [1944524588, 1944524597] with 10 transactions
Block 10: versions [1944524598, 1944524602] with 5 transactions
Block 11: versions [1944524603, 1944524609] with 7 transactions
Block 12: versions [1944524610, 1944524616] with 7 transactions
Block 13: versions [1944524617, 1944524624] with 8 transactions
Block 14: versions [1944524625, 1944524632] with 8 transactions
Block 15: versions [1944524633, 1944524639] with 7 transactions
Block 16: versions [1944524640, 1944524647] with 8 transactions
Block 17: versions [1944524648, 1944524656] with 9 transactions
Block 18: versions [1944524657, 1944524663] with 7 transactions
Block 19: versions [1944524664, 1944524668] with 5 transactions
Block 20: versions [1944524669, 1944524680] with 12 transactions
Block 21: versions [1944524681, 1944524687] with 7 transactions
Block 22: versions [1944524688, 1944524695] with 8 transactions
Block 23: versions [1944524696, 1944524705] with 10 transactions
Block 24: versions [1944524706, 1944524714] with 9 transactions
Benchmarking ...

Concurrency level: 8
[1/10] Execution time is 899ms
[2/10] Execution time is 905ms
[3/10] Execution time is 901ms
[4/10] Execution time is 903ms
[5/10] Execution time is 905ms
[6/10] Execution time is 905ms
[7/10] Execution time is 899ms
[8/10] Execution time is 896ms
[9/10] Execution time is 898ms
[10/10] Execution time is 904ms
Median execution time is 903ms
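Both reported medians are consistent with sorting the per-run times and taking the element at index n/2 (the upper of the two middle values when n is even). The sketch below reproduces them from the logged times; `median_ms` is a hypothetical helper, and the indexing rule is inferred from the two logs rather than taken from the tool's source.

```rust
/// Upper median: the element at index len/2 of the sorted times.
/// Returns None for an empty slice.
fn median_ms(times: &[u64]) -> Option<u64> {
    if times.is_empty() {
        return None;
    }
    let mut sorted = times.to_vec();
    sorted.sort_unstable();
    Some(sorted[sorted.len() / 2])
}

fn main() {
    // Per-run times copied from the two example runs.
    let flag_off = [1717, 1540, 1537, 1606, 1629, 1668, 1619, 1479, 1690, 1515];
    let flag_on = [899, 905, 901, 903, 905, 905, 899, 896, 898, 904];
    println!("flag off: {:?}", median_ms(&flag_off));
    println!("flag on:  {:?}", median_ms(&flag_on));
}
```

Averaging the two middle values instead would give 1612.5ms and 902ms, so the logged 1619ms and 903ms point to the index-based rule.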

Great, we can now quantify the effect of the feature on runtime.
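To put a number on it, the two medians above (1619ms with the flag off, 903ms with it on) can be turned into a speedup and a relative reduction. The helpers below are hypothetical and only do the arithmetic; they say nothing about variance or about other workloads.

```rust
/// Ratio of baseline time to candidate time (greater than 1 means faster).
fn speedup(baseline_ms: u64, candidate_ms: u64) -> f64 {
    baseline_ms as f64 / candidate_ms as f64
}

/// Relative reduction in execution time, in percent.
fn reduction_percent(baseline_ms: u64, candidate_ms: u64) -> f64 {
    let baseline = baseline_ms as f64;
    100.0 * (baseline - candidate_ms as f64) / baseline
}

fn main() {
    // Medians from the example: flag off vs. flag on.
    println!("speedup: {:.2}x", speedup(1619, 903));
    println!("reduction: {:.1}%", reduction_percent(1619, 903));
}
```

For this interval of 183 transactions, that works out to roughly a 1.79x speedup, or about 44% less execution time.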

Other related changes

Refactored the logging Level so that it can be used from the CLI. The behaviour should be the same.
Replaced BlockAptosVM::execute_block with AptosVMBlockExecutor::new().execute_block where possible (benchmark, debugger) so that we use the high-level wrapper, and not the inner type.

How Has This Been Tested?

Manually running benchmarks.

Key Areas to Review

N/A; probably worth checking that the logger's Level is still correct.

Type of Change

New feature

Which Components or Systems Does This Change Impact?

Move/Aptos Virtual Machine
Developer Infrastructure

Checklist

I have read and followed the CONTRIBUTING doc
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I identified and added all stakeholders and component owners affected by this change as reviewers
I tested both happy and unhappy path of the functionality
I have made corresponding changes to the documentation

trunk-io · 2024-11-20T11:16:56Z

⏱️ 2h 53m total CI duration on this PR

Slowest 15 Jobs	Cumulative Duration	Recent Runs
rust-cargo-deny	23m	🟩 🟩 🟩 🟩 🟩 (+8 more)
check-dynamic-deps	14m	🟩 🟩 🟩 🟩 🟩 (+8 more)
rust-move-tests	13m	🟩
rust-move-tests	13m	🟩
rust-move-tests	13m	🟩
rust-move-tests	13m	🟩
rust-move-tests	13m	🟩
rust-move-tests	12m	🟩
rust-move-tests	12m	🟩
rust-move-tests	12m	🟩
rust-move-tests	8m	⬜
general-lints	7m	🟩 🟩 🟩 🟩 🟩 (+8 more)
semgrep/ci	6m	🟩 🟩 🟩 🟩 🟩 (+8 more)
rust-move-tests	3m	⬜
file_change_determinator	3m	🟩 🟩 🟩 🟩 🟩 (+8 more)

georgemitenkov · 2024-11-20T11:17:09Z

[move] Benchmarking historical transactions #15329 👈
[loader-v2] Addressing simple loader V2 TODOs #15316 : 1 other dependent PR (#15315 )
[loader-v2] Small cleanups & tests #15279 : 1 other dependent PR (#15280 )
[loader-v2] Fixing global cache reads & read-before-write on publish #15285
main

aptos-move/aptos-debugger/src/benchmark_past_transactions.rs

aptos-move/aptos-replay-benchmark/src/main.rs

georgemitenkov mentioned this pull request Nov 20, 2024

[loader-v2] migrate Move transactional and integration tests to V2 only #15315

Open

9 tasks

georgemitenkov mentioned this pull request Nov 20, 2024

[loader-v2] Addressing simple loader V2 TODOs #15316

Merged

9 tasks

georgemitenkov changed the title ~~[refactoring] Use AptosVMBlockExecutor where possible~~ [aptos-debugger] Correct benchmark via debugger Nov 20, 2024

georgemitenkov force-pushed the george/loader-v2-benchmark branch from 5f0697c to 9d3bc98 Compare November 20, 2024 18:04

graphite-app bot reviewed Nov 20, 2024

aptos-move/aptos-debugger/src/benchmark_past_transactions.rs (outdated)

georgemitenkov force-pushed the george/loader-v2-todos-script-location branch from 3b035a7 to 429acfd Compare November 20, 2024 18:06

georgemitenkov force-pushed the george/loader-v2-benchmark branch 2 times, most recently from 4beb0d6 to 67ec63d Compare November 20, 2024 18:08

graphite-app bot reviewed Nov 20, 2024

aptos-move/aptos-debugger/src/benchmark_past_transactions.rs (outdated)

gelash reviewed Nov 20, 2024

aptos-move/aptos-debugger/src/benchmark_past_transactions.rs (outdated)

georgemitenkov force-pushed the george/loader-v2-todos-script-location branch from 429acfd to a351639 Compare November 20, 2024 21:16

georgemitenkov force-pushed the george/loader-v2-benchmark branch from 67ec63d to 2d8c42b Compare November 20, 2024 21:17

Base automatically changed from george/loader-v2-todos-script-location to main November 20, 2024 21:51

georgemitenkov force-pushed the george/loader-v2-benchmark branch 2 times, most recently from 9ff5305 to 8d71d53 Compare November 21, 2024 16:01

graphite-app bot reviewed Nov 21, 2024

aptos-move/aptos-replay-benchmark/src/main.rs (outdated)

graphite-app bot reviewed Nov 21, 2024

aptos-move/aptos-replay-benchmark/src/main.rs (outdated)

georgemitenkov force-pushed the george/loader-v2-benchmark branch 3 times, most recently from bc4d63b to e5d9058 Compare November 21, 2024 16:26

georgemitenkov marked this pull request as ready for review November 21, 2024 16:42

georgemitenkov requested review from gregnazario, JoshLind, davidiw, wrwg, zekun000 and vgao1996 as code owners November 21, 2024 16:42

georgemitenkov changed the title ~~[aptos-debugger] Correct benchmark via debugger~~ [aptos-replay-benhcmark] Bencmarking historical transactions Nov 21, 2024

georgemitenkov changed the title ~~[aptos-replay-benhcmark] Bencmarking historical transactions~~ [move] Benchmarking historical transactions Nov 21, 2024

georgemitenkov requested review from msmouse, runtian-zhou, igor-aptos, gelash and ziaptos November 21, 2024 16:44

georgemitenkov force-pushed the george/loader-v2-benchmark branch 2 times, most recently from 570e144 to fc76656 Compare November 23, 2024 15:22

georgemitenkov added 4 commits November 25, 2024 10:24

[move] Replay benchmark tool

d4c4fd8

[refactor] Split between files

289aa44

add fine-grained comparisons

acd2d9c

make generation faster

85f4f11

georgemitenkov force-pushed the george/loader-v2-benchmark branch from 26a4e0a to 85f4f11 Compare November 25, 2024 10:25

georgemitenkov requested a review from rahxephon89 November 25, 2024 10:26
