Support `hashtree` crate with feature, add Criterion benchmarks #161

mempirate · 2024-09-06T09:48:51Z

Implements a new hash_chunks method that is used everywhere and whose implementation varies based on the enabled features.

Related to #156 but further optimizations are possible.

Notes

All hashing is now abstracted behind a single hash_chunks function, whose implementation differs based on the enabled features. I tried adding a generic hasher trait and using dynamic dispatch, but that became pretty messy. I think this is cleaner.
Had to upgrade rustc to 1.74.0 to get Criterion working
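
For readers skimming the thread, here is a minimal sketch of what feature-gated dispatch behind a single hash_chunks entry point can look like. It is not the code from this PR: the signature, the `hash_chunks_impl` helper, and the `toy_digest` stand-in (used so the snippet needs no external crates) are all assumptions made for illustration. A real build would call `sha2` or `hashtree` where the comments indicate.

```rust
/// Size of one Merkle chunk in bytes.
pub const BYTES_PER_CHUNK: usize = 32;

/// Toy mixing function standing in for a real SHA-256 backend.
/// NOT cryptographic; it only keeps this sketch dependency-free.
fn toy_digest(input: &[u8]) -> [u8; BYTES_PER_CHUNK] {
    let mut out = [0u8; BYTES_PER_CHUNK];
    for (i, byte) in input.iter().enumerate() {
        let slot = i % BYTES_PER_CHUNK;
        out[slot] = out[slot].rotate_left(3) ^ byte.wrapping_add(i as u8);
    }
    out
}

/// Default backend, compiled when the `hashtree` feature is off.
/// (Hypothetical helper; a real build would call the `sha2` crate here.)
#[cfg(not(feature = "hashtree"))]
fn hash_chunks_impl(block: &[u8; 2 * BYTES_PER_CHUNK]) -> [u8; BYTES_PER_CHUNK] {
    toy_digest(block)
}

/// Alternative backend, compiled only when the `hashtree` feature is on.
/// (Hypothetical helper; a real build would call the `hashtree` crate here.)
#[cfg(feature = "hashtree")]
fn hash_chunks_impl(block: &[u8; 2 * BYTES_PER_CHUNK]) -> [u8; BYTES_PER_CHUNK] {
    toy_digest(block)
}

/// Single entry point used by every caller; the backend is chosen at compile time.
pub fn hash_chunks(left: impl AsRef<[u8]>, right: impl AsRef<[u8]>) -> [u8; BYTES_PER_CHUNK] {
    let (left, right) = (left.as_ref(), right.as_ref());
    assert_eq!(left.len(), BYTES_PER_CHUNK, "left chunk must be 32 bytes");
    assert_eq!(right.len(), BYTES_PER_CHUNK, "right chunk must be 32 bytes");

    // Stack-allocated 64-byte block, so no heap allocation per call.
    let mut block = [0u8; 2 * BYTES_PER_CHUNK];
    block[..BYTES_PER_CHUNK].copy_from_slice(left);
    block[BYTES_PER_CHUNK..].copy_from_slice(right);
    hash_chunks_impl(&block)
}
```

Because the backend is selected with `cfg`, callers never see a trait object or a generic parameter. Assuming the feature is named `hashtree`, switching backends is then only a matter of building with `--features hashtree`.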

Benchmarks

sha2 (default)

hash_tree_root_sha256   time:   [49.648 µs 49.814 µs 50.056 µs]
                        change: [-0.3203% +0.5809% +1.7785%] (p = 0.34 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

Benchmarking generate_proof_sha256: Warming up for 3.0000 s
generate_proof_sha256   time:   [811.00 ms 815.02 ms 819.58 ms]
Found 11 outliers among 100 measurements (11.00%)
  6 (6.00%) high mild
  5 (5.00%) high severe

sha2-asm

hash_tree_root_sha256-asm
                        time:   [10.242 µs 10.252 µs 10.263 µs]
                        change: [+0.3393% +0.5229% +0.6984%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) low mild
  4 (4.00%) high mild
  3 (3.00%) high severe

Benchmarking generate_proof_sha256-asm: Warming up for 3.0000 s
generate_proof_sha256-asm
                        time:   [132.81 ms 133.67 ms 135.02 ms]
Found 11 outliers among 100 measurements (11.00%)
  9 (9.00%) high mild
  2 (2.00%) high severe

hashtree

hash_tree_root_hashtree time:   [8.2075 µs 8.2200 µs 8.2385 µs]
Found 9 outliers among 100 measurements (9.00%)
  6 (6.00%) high mild
  3 (3.00%) high severe

Benchmarking generate_proof_hashtree: Warming up for 3.0000 s
generate_proof_hashtree time:   [111.12 ms 111.86 ms 112.93 ms]
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe

In these benchmarks, hashtree is ~16% faster at generating proofs and ~20% faster at calculating hash tree roots compared to sha2-asm.
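
For reference, these percentages follow from the median times reported above (hashtree relative to sha2-asm):

$$
\frac{133.67\ \text{ms} - 111.86\ \text{ms}}{133.67\ \text{ms}} \approx 16.3\%,
\qquad
\frac{10.252\ \mu\text{s} - 8.2200\ \mu\text{s}}{10.252\ \mu\text{s}} \approx 19.8\%
$$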

mempirate · 2024-09-06T18:18:51Z

With the latest commit getting rid of heap allocation for chunks and replacing it with an array, benchmarks were even better for hashtree:

hash_tree_root_hashtree time:   [7.8244 µs 7.8345 µs 7.8479 µs]
                        change: [-4.8425% -4.6332% -4.4083%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

Benchmarking generate_proof_hashtree: Warming up for 3.0000 s
generate_proof_hashtree time:   [97.430 ms 97.634 ms 97.872 ms]
                        change: [-13.578% -12.719% -12.096%] (p = 0.00 < 0.05)
                        Performance has improved.

potuz · 2024-09-09T10:48:53Z

I think this shouldn't close #156. This PR creates a wrapper that hashes two chunks, not necessarily contiguous, with hashtree. As such, it only benefits from hashtree's padding-block prescheduling and from none of the vectorized/parallel pipelining. This is why you are getting only a 16%-20% benefit from it. When hashing two chunks at a time you also incur copies and allocations for every chunk you hash, which is unnecessary in most situations.

The way to actually gain from this library is to hash one entire layer at a time, by passing the whole layer, contiguously allocated, as the chunks slice to the function shown in https://github.com/prysmaticlabs/hashtree/blob/main/examples/basic_usage.rs#L36

You can hash a Merkle tree whose base layer has 2^{N+1} chunks by allocating only 2^N chunks for the first hashed layer, and then overwriting that allocation on successive runs by passing the same slice as both chunks and out.
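
The buffer-reuse idea described above can be sketched as follows. This is an illustration under stated assumptions, not hashtree's API: `toy_hash_pair`, `hash_layer`, `hash_layer_in_place`, and `merkle_root` are hypothetical names, and the toy per-pair loop stands in for a single vectorized call over the whole layer. Only the buffer layout is the point: one allocation of 2^N chunks, overwritten on each pass.

```rust
/// Size of one Merkle chunk in bytes.
const CHUNK: usize = 32;

/// Toy 64-byte -> 32-byte compression standing in for SHA-256.
/// NOT cryptographic; it only keeps this sketch dependency-free.
fn toy_hash_pair(pair: &[u8]) -> [u8; CHUNK] {
    debug_assert_eq!(pair.len(), 2 * CHUNK);
    let mut out = [0u8; CHUNK];
    for (i, byte) in pair.iter().enumerate() {
        let slot = i % CHUNK;
        out[slot] = out[slot].rotate_left(3) ^ byte.wrapping_add(i as u8);
    }
    out
}

/// Hash one whole layer: read `2 * count` contiguous chunks from `input`
/// and write `count` digests to the front of `out`.
fn hash_layer(out: &mut [u8], input: &[u8], count: usize) {
    for i in 0..count {
        let digest = toy_hash_pair(&input[2 * CHUNK * i..2 * CHUNK * (i + 1)]);
        out[CHUNK * i..CHUNK * (i + 1)].copy_from_slice(&digest);
    }
}

/// Same as `hash_layer`, but input and output share one buffer.
/// Digest `i` lands at byte offset `32 * i`, which never overlaps a pair
/// that is still unread (those start at byte offset `64 * (i + 1)` or later).
fn hash_layer_in_place(buf: &mut [u8], count: usize) {
    for i in 0..count {
        let digest = toy_hash_pair(&buf[2 * CHUNK * i..2 * CHUNK * (i + 1)]);
        buf[CHUNK * i..CHUNK * (i + 1)].copy_from_slice(&digest);
    }
}

/// Root of a tree over `leaves` (a power-of-two number of 32-byte chunks,
/// at least two), using a single allocation of half the leaf layer's size.
pub fn merkle_root(leaves: &[u8]) -> [u8; CHUNK] {
    let leaf_count = leaves.len() / CHUNK;
    assert!(leaves.len() % CHUNK == 0, "leaves must be whole chunks");
    assert!(leaf_count >= 2 && leaf_count.is_power_of_two());

    // The only allocation: room for the first hashed layer.
    let mut count = leaf_count / 2;
    let mut buf = vec![0u8; count * CHUNK];
    hash_layer(&mut buf, leaves, count);

    // Every later layer overwrites the front of the same buffer.
    while count > 1 {
        count /= 2;
        hash_layer_in_place(&mut buf, count);
    }

    let mut root = [0u8; CHUNK];
    root.copy_from_slice(&buf[..CHUNK]);
    root
}
```

Safe Rust cannot pass one slice as both a shared input and a mutable output, which is why the sketch splits the in-place pass into its own function. Whether and how the real library accepts aliased buffers is not something this sketch asserts.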

mempirate · 2024-09-09T16:52:19Z

@potuz agreed this doesn't resolve the whole issue. I will follow this up with another PR for the other performance improvements you mentioned.

mempirate · 2024-12-11T08:48:25Z

Closing in favor of #172 and #169.

mempirate added 4 commits September 6, 2024 11:48

feat: support hashtree hashing with feature

1fdb1ec

feat: add criterion benchmarks

aba2b56

clippy: fmt

e90cdbf

refactor: rename benches

80e6620

mempirate marked this pull request as ready for review September 6, 2024 10:41

feat: get rid of allocation in hash_chunks_hashtree

6163758

mempirate added 2 commits September 6, 2024 20:33

fix: add chunk length assertions in hash_chunks_sha256

2f5fbd5

refactor: everything in hash_chunks

f8f2e2c

mempirate changed the title ~~Support hashtree crate with feature~~ Support hashtree crate with feature, add Criterion benchmarks Sep 7, 2024

estensen mentioned this pull request Nov 20, 2024

feat: add hash_tree_root() bench #168

Draft

This was referenced Dec 3, 2024

perf: compute subtrees in parallel when computing proof #169

Open

feat: support hashtree sha256 lib #172

Open

mempirate closed this Dec 11, 2024
