[move] Benchmarking historical transactions #15329

Open: georgemitenkov wants to merge 4 commits into main

georgemitenkov (Contributor) commented Nov 20, 2024

Description

This PR introduces a tool that benchmarks past transactions correctly, unlike the existing aptos-debugger move execute-past-transactions ... command.

Summary

The Aptos debugger tool uses RPC under the hood when running transactions, so state view access latencies can be huge. It also executes transactions block by block, without sharing caches across blocks the way the real executor does.

This PR introduces a better tool to benchmark the execution of past transactions. The user provides the first and last versions of the interval to benchmark. The tool partitions the transactions in this closed interval into blocks and runs all blocks end-to-end, measuring the overall time. During this run, the executor is shared across blocks (and so are the environment and module caches).
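As a rough illustration of the partitioning step, the sketch below splits a closed version interval into blocks. It assumes a new block starts at every version flagged as a block boundary; the description does not say how the tool chooses boundaries, so `partition_into_blocks` and its `block_starts` input are hypothetical. The versions come from the first three blocks of the example output later in this description.

```python
from typing import List, Set, Tuple


def partition_into_blocks(
    begin: int, end: int, block_starts: Set[int]
) -> List[Tuple[int, int]]:
    """Split the closed interval [begin, end] into closed sub-intervals.

    A new block starts at `begin` and at every version in `block_starts`
    (hypothetical input: versions at which an on-chain block begins).
    """
    if begin > end:
        raise ValueError("begin must not exceed end")

    blocks = []
    first = begin
    for version in range(begin + 1, end + 1):
        if version in block_starts:
            # Close the current block just before the boundary version.
            blocks.append((first, version - 1))
            first = version
    blocks.append((first, end))
    return blocks


# Boundaries taken from the first three blocks of the example output.
blocks = partition_into_blocks(1944524532, 1944524552, {1944524542, 1944524547})
for i, (lo, hi) in enumerate(blocks, start=1):
    print(f"Block {i}: versions [{lo}, {hi}] with {hi - lo + 1} transactions")
```

Both ends of every block are inclusive, matching the closed intervals printed by the tool.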

Nothing is committed here; only execution time is measured. For each block, we maintain the state on top of which it should run (its read-set, estimated from one run before the benchmarks). The benchmark then simply runs the blocks in sequence on top of their initial states; outputs are used only for comparison against the on-chain data.
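The idea of capturing each block's read-set once and then benchmarking on top of it can be sketched as follows. This is a toy model under stated assumptions, not the tool's implementation: `CapturingView`, `InMemoryView`, `execute_block`, and the counter-style transactions are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Optional, Tuple

Block = List[Tuple[str, int]]  # toy transactions: (key, delta)


class CapturingView:
    """Wraps a slow state lookup and records every key that was read."""

    def __init__(self, fetch: Callable[[str], Optional[int]]):
        self._fetch = fetch
        self.read_set: Dict[str, Optional[int]] = {}

    def get(self, key: str) -> Optional[int]:
        if key not in self.read_set:
            self.read_set[key] = self._fetch(key)  # e.g. a remote call
        return self.read_set[key]


class InMemoryView:
    """Serves reads from a captured read-set; nothing is ever committed."""

    def __init__(self, state: Dict[str, Optional[int]]):
        self._state = dict(state)

    def get(self, key: str) -> Optional[int]:
        return self._state.get(key)


def execute_block(view, block: Block) -> List[Tuple[str, int]]:
    """Toy executor: each transaction adds `delta` to the counter at `key`."""
    pending: Dict[str, int] = {}
    outputs = []
    for key, delta in block:
        current = pending.get(key)
        if current is None:
            current = view.get(key) or 0
        pending[key] = current + delta
        outputs.append((key, pending[key]))
    return outputs


# Hypothetical on-chain state as of the start of each block.
remote_states = [{"a": 1, "b": 10}, {"a": 4, "b": 10}]
blocks = [[("a", 1), ("a", 2)], [("b", 5)]]

# One warm-up run per block captures the state that block needs.
initial_states = []
for state, block in zip(remote_states, blocks):
    view = CapturingView(state.get)
    execute_block(view, block)
    initial_states.append(view.read_set)

# Benchmark run: blocks execute in sequence on their captured states.
outputs = [
    execute_block(InMemoryView(state), block)
    for state, block in zip(initial_states, blocks)
]
print(initial_states)
print(outputs)
```

Because each block starts from its own captured state, block outputs never need to be applied to storage between blocks.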

The tool allows overriding configs to experiment with new features, or to see what execution would look like without some features. For now, we support:

  • enabling features
  • disabling features

In the future, we can add more overrides: gas schedule, modules, etc.
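A minimal sketch of how enable/disable overrides could be applied to a feature set is shown below. The function `override_features` and the placeholder names `FEATURE_A` and `FEATURE_B` are hypothetical; only `ENABLE_LOADER_V2` appears in this PR. Rejecting a feature that is both enabled and disabled is a design choice of the sketch, not a documented behaviour of the tool.

```python
from typing import FrozenSet, Iterable


def override_features(
    on_chain: Iterable[str], enable: Iterable[str], disable: Iterable[str]
) -> FrozenSet[str]:
    """Return the on-chain feature set with the requested overrides applied."""
    enable_set, disable_set = set(enable), set(disable)
    conflict = enable_set & disable_set
    if conflict:
        # Assumption of this sketch: contradictory overrides are an error.
        raise ValueError(f"features both enabled and disabled: {sorted(conflict)}")
    return frozenset((set(on_chain) | enable_set) - disable_set)


features = override_features(
    {"FEATURE_A", "FEATURE_B"},  # hypothetical on-chain features
    enable=["ENABLE_LOADER_V2"],
    disable=["FEATURE_B"],
)
print(sorted(features))
```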

It also computes the diffs between the expected (on-chain) outputs and the new outputs produced with overrides, e.g.:

Transaction 1944524533 diff:
  >>>>>
[gas used] before: 525, after: 517
[event] 0000000000000000000000000000000000000000000000000000000000000001::transaction_fee::FeeStatement has changed its data
[write] StateKey::AccessPath { address: 0xc05e013f9f81d699e5a991bd134ab8a9af4ec609714f1fd8bfce4fc38419c890, path: "Resource(0x1::coin::CoinStore<0x1::aptos_coin::AptosCoin>)" } has changed its value
[write] StateKey::TableItem { handle: 1b854694ae746cdbd8d44186ca4929b2b337df21d1c74633be19b2710552fdca, key: 0619dc29a0aac8fa146714058e8dd6d2d0f3bdf5f6331907bf91f3acd81e6935 } has changed its value
[total gas used] before: 525, after: 517
 <<<<<
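For illustration, a comparison that produces lines in the style of the diff above could look like the sketch below. The `Output` type, the keying of events by type tag, and the placeholder byte values are assumptions made for this example; the sketch only reports entries present in both outputs and ignores added or removed ones.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Output:
    gas_used: int
    events: Dict[str, bytes] = field(default_factory=dict)  # type tag -> data
    writes: Dict[str, bytes] = field(default_factory=dict)  # state key -> value


def diff_outputs(before: Output, after: Output) -> List[str]:
    """List the differences between the expected and the new output."""
    lines = []
    if before.gas_used != after.gas_used:
        lines.append(f"[gas used] before: {before.gas_used}, after: {after.gas_used}")
    for tag in sorted(before.events.keys() & after.events.keys()):
        if before.events[tag] != after.events[tag]:
            lines.append(f"[event] {tag} has changed its data")
    for key in sorted(before.writes.keys() & after.writes.keys()):
        if before.writes[key] != after.writes[key]:
            lines.append(f"[write] {key} has changed its value")
    return lines


# Placeholder payloads; only the gas values are taken from the example above.
expected = Output(525, {"0x1::transaction_fee::FeeStatement": b"\x01"}, {"coin_store": b"\x0a"})
actual = Output(517, {"0x1::transaction_fee::FeeStatement": b"\x02"}, {"coin_store": b"\x0b"})
diff = diff_outputs(expected, actual)
for line in diff:
    print(line)
```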

Example

Say we have a new feature, e.g., ENABLE_LOADER_V2. We can benchmark how historical transactions perform with this flag on and off.

The flag is off by default:

target/release/aptos-replay-benchmark --begin-version 1944524532 \
  --end-version 1944524714 --rest-endpoint https://mainnet.aptoslabs.com/v1 \
  --num-repeats 10 --concurrency-levels 8

Got 100/183 txns from RestApi.
Got 183/183 txns from RestApi.
Generating blocks for benchmarking ...
Checking generated blocks ...
Analyzing 24 generated blocks ...
Block 1: versions [1944524532, 1944524541] with 10 transactions
Block 2: versions [1944524542, 1944524546] with 5 transactions
Block 3: versions [1944524547, 1944524552] with 6 transactions
Block 4: versions [1944524553, 1944524560] with 8 transactions
Block 5: versions [1944524561, 1944524565] with 5 transactions
Block 6: versions [1944524566, 1944524572] with 7 transactions
Block 7: versions [1944524573, 1944524577] with 5 transactions
Block 8: versions [1944524578, 1944524587] with 10 transactions
Block 9: versions [1944524588, 1944524597] with 10 transactions
Block 10: versions [1944524598, 1944524602] with 5 transactions
Block 11: versions [1944524603, 1944524609] with 7 transactions
Block 12: versions [1944524610, 1944524616] with 7 transactions
Block 13: versions [1944524617, 1944524624] with 8 transactions
Block 14: versions [1944524625, 1944524632] with 8 transactions
Block 15: versions [1944524633, 1944524639] with 7 transactions
Block 16: versions [1944524640, 1944524647] with 8 transactions
Block 17: versions [1944524648, 1944524656] with 9 transactions
Block 18: versions [1944524657, 1944524663] with 7 transactions
Block 19: versions [1944524664, 1944524668] with 5 transactions
Block 20: versions [1944524669, 1944524680] with 12 transactions
Block 21: versions [1944524681, 1944524687] with 7 transactions
Block 22: versions [1944524688, 1944524695] with 8 transactions
Block 23: versions [1944524696, 1944524705] with 10 transactions
Block 24: versions [1944524706, 1944524714] with 9 transactions
Benchmarking ...

Concurrency level: 8
[1/10] Execution time is 1717ms
[2/10] Execution time is 1540ms
[3/10] Execution time is 1537ms
[4/10] Execution time is 1606ms
[5/10] Execution time is 1629ms
[6/10] Execution time is 1668ms
[7/10] Execution time is 1619ms
[8/10] Execution time is 1479ms
[9/10] Execution time is 1690ms
[10/10] Execution time is 1515ms
Median execution time is 1619ms

With the flag on:

target/release/aptos-replay-benchmark --begin-version 1944524532 \
  --end-version 1944524714 --rest-endpoint https://mainnet.aptoslabs.com/v1 \
  --num-repeats 10 --concurrency-levels 8 --enable-features ENABLE_LOADER_V2

Got 100/183 txns from RestApi.
Got 183/183 txns from RestApi.
Generating blocks for benchmarking ...
Checking generated blocks ...
Analyzing 24 generated blocks ...
Block 1: versions [1944524532, 1944524541] with 10 transactions
Block 2: versions [1944524542, 1944524546] with 5 transactions
Block 3: versions [1944524547, 1944524552] with 6 transactions
Block 4: versions [1944524553, 1944524560] with 8 transactions
Block 5: versions [1944524561, 1944524565] with 5 transactions
Block 6: versions [1944524566, 1944524572] with 7 transactions
Block 7: versions [1944524573, 1944524577] with 5 transactions
Block 8: versions [1944524578, 1944524587] with 10 transactions
Block 9: versions [1944524588, 1944524597] with 10 transactions
Block 10: versions [1944524598, 1944524602] with 5 transactions
Block 11: versions [1944524603, 1944524609] with 7 transactions
Block 12: versions [1944524610, 1944524616] with 7 transactions
Block 13: versions [1944524617, 1944524624] with 8 transactions
Block 14: versions [1944524625, 1944524632] with 8 transactions
Block 15: versions [1944524633, 1944524639] with 7 transactions
Block 16: versions [1944524640, 1944524647] with 8 transactions
Block 17: versions [1944524648, 1944524656] with 9 transactions
Block 18: versions [1944524657, 1944524663] with 7 transactions
Block 19: versions [1944524664, 1944524668] with 5 transactions
Block 20: versions [1944524669, 1944524680] with 12 transactions
Block 21: versions [1944524681, 1944524687] with 7 transactions
Block 22: versions [1944524688, 1944524695] with 8 transactions
Block 23: versions [1944524696, 1944524705] with 10 transactions
Block 24: versions [1944524706, 1944524714] with 9 transactions
Benchmarking ...

Concurrency level: 8
[1/10] Execution time is 899ms
[2/10] Execution time is 905ms
[3/10] Execution time is 901ms
[4/10] Execution time is 903ms
[5/10] Execution time is 905ms
[6/10] Execution time is 905ms
[7/10] Execution time is 899ms
[8/10] Execution time is 896ms
[9/10] Execution time is 898ms
[10/10] Execution time is 904ms
Median execution time is 903ms

Great: we can now quantify the effect of the feature on runtime. In this run, the median execution time drops from 1619ms to 903ms at concurrency level 8.
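The two runs can be compared with a few lines of arithmetic. The reported medians are consistent with taking the element at index n/2 of the sorted run times (rather than averaging the two middle values); that is an inference from the output above, not something the PR states, and `median_ms` is a hypothetical helper.

```python
def median_ms(times):
    """Return the element at index len/2 of the sorted times (upper median)."""
    ordered = sorted(times)
    return ordered[len(ordered) // 2]


# Execution times (ms) copied from the two runs above.
flag_off = [1717, 1540, 1537, 1606, 1629, 1668, 1619, 1479, 1690, 1515]
flag_on = [899, 905, 901, 903, 905, 905, 899, 896, 898, 904]

off, on = median_ms(flag_off), median_ms(flag_on)
speedup = off / on
print(f"median off: {off}ms, median on: {on}ms, speedup: {speedup:.2f}x")
# prints "median off: 1619ms, median on: 903ms, speedup: 1.79x"
```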

Other related changes

  1. Refactored the logger's Level so that it can be used from the CLI. The behaviour should be unchanged.
  2. Replaced BlockAptosVM::execute_block with AptosVMBlockExecutor::new().execute_block where possible (benchmark, debugger) so that we use the high-level wrapper, and not the inner type.

How Has This Been Tested?

Manually running benchmarks.

Key Areas to Review

N/A; possibly check that the logger's Level still behaves correctly.

Type of Change

  • New feature

Which Components or Systems Does This Change Impact?

  • Move/Aptos Virtual Machine
  • Developer Infrastructure

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

trunk-io bot commented Nov 20, 2024

⏱️ 2h 53m total CI duration on this PR

Slowest 15 jobs:

Job                       Cumulative Duration   Recent Runs
rust-cargo-deny           23m                   🟩🟩🟩🟩🟩 (+8 more)
check-dynamic-deps        14m                   🟩🟩🟩🟩🟩 (+8 more)
rust-move-tests           13m                   🟩
rust-move-tests           13m                   🟩
rust-move-tests           13m                   🟩
rust-move-tests           13m                   🟩
rust-move-tests           13m                   🟩
rust-move-tests           12m                   🟩
rust-move-tests           12m                   🟩
rust-move-tests           12m                   🟩
rust-move-tests           8m
general-lints             7m                    🟩🟩🟩🟩🟩 (+8 more)
semgrep/ci                6m                    🟩🟩🟩🟩🟩 (+8 more)
rust-move-tests           3m
file_change_determinator  3m                    🟩🟩🟩🟩🟩 (+8 more)


georgemitenkov commented Nov 20, 2024

georgemitenkov changed the title from "[refactoring] Use AptosVMBlockExecutor where possible" to "[aptos-debugger] Correct benchmark via debugger" on Nov 20, 2024
georgemitenkov force-pushed the george/loader-v2-todos-script-location branch from 3b035a7 to 429acfd on November 20, 2024 18:06
georgemitenkov force-pushed the george/loader-v2-benchmark branch 2 times, most recently from 4beb0d6 to 67ec63d, on November 20, 2024 18:08
georgemitenkov force-pushed the george/loader-v2-todos-script-location branch from 429acfd to a351639 on November 20, 2024 21:16
Base automatically changed from george/loader-v2-todos-script-location to main on November 20, 2024 21:51
georgemitenkov force-pushed the george/loader-v2-benchmark branch 2 times, most recently from 9ff5305 to 8d71d53, on November 21, 2024 16:01
georgemitenkov force-pushed the george/loader-v2-benchmark branch 3 times, most recently from bc4d63b to e5d9058, on November 21, 2024 16:26
georgemitenkov marked this pull request as ready for review on November 21, 2024 16:42
georgemitenkov changed the title from "[aptos-debugger] Correct benchmark via debugger" to "[aptos-replay-benhcmark] Bencmarking historical transactions" on Nov 21, 2024
georgemitenkov changed the title from "[aptos-replay-benhcmark] Bencmarking historical transactions" to "[move] Benchmarking historical transactions" on Nov 21, 2024
georgemitenkov force-pushed the george/loader-v2-benchmark branch 2 times, most recently from 570e144 to fc76656, on November 23, 2024 15:22