chore: update benchmarks #550

Merged
merged 12 commits into from
Oct 24, 2024
115 changes: 59 additions & 56 deletions benchmark/fft/README.md
@@ -4,6 +4,7 @@

```
Run on 13th Gen Intel(R) Core(TM) i9-13900K (32 X 5500 MHz CPU s)
Compiler: clang-15
CPU Caches:
L1 Data 48 KiB (x16)
L1 Instruction 32 KiB (x16)
@@ -17,75 +18,77 @@ CPU Caches:
L2 Unified 4096 KiB (x12)
```

Note: Run with `build --@rules_rust//:extra_rustc_flags="-Ctarget-cpu=native"` in your .bazelrc.user

### FFT

```shell
bazel run --config opt --//:has_matplotlib //benchmark/fft:fft_benchmark -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --vendor arkworks --vendor bellman --vendor halo2 --check_results
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft:fft_benchmark -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --vendor arkworks --vendor bellman --vendor halo2 --check_results
```

#### On Intel i9-13900K

| Exponent | Tachyon | Arkworks | Bellman | Halo2 |
| :------: | ------------ | ------------ | -------- | -------- |
| 16 | **0.000958** | 0.004086 | 0.007342 | 0.003784 |
| 17 | 0.032529 | **0.003283** | 0.012624 | 0.005433 |
| 18 | 0.014067 | **0.005768** | 0.025811 | 0.009372 |
| 19 | **0.008459** | 0.011465 | 0.05208 | 0.019333 |
| 20 | **0.016166** | 0.024533 | 0.106217 | 0.042381 |
| 21 | **0.039447** | 0.069444 | 0.212414 | 0.087621 |
| 22 | **0.125954** | 0.177245 | 0.431237 | 0.188843 |
| 23 | **0.297259** | 0.391987 | 0.835686 | 0.427426 |
| 16 | **0.002058** | 0.005143 | 0.006314 | 0.002249 |
| 17 | **0.002246** | 0.00334 | 0.015646 | 0.006193 |
| 18 | 0.010154 | 0.018807 | 0.046443 | **0.007574** |
| 19 | 0.022984 | 0.014652 | 0.076281 | **0.014506** |
| 20 | **0.02** | 0.02497 | 0.100082 | 0.042877 |
| 21 | **0.044831** | 0.075563 | 0.20222 | 0.067161 |
| 22 | **0.130201** | 0.179075 | 0.402452 | 0.169194 |
| 23 | **0.281398** | 0.394068 | 0.792004 | 0.372566 |

![image](/benchmark/fft/fft_benchmark_ubuntu_i9.png)
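
The bold cell in each row marks the smallest time, and the gap between implementations is often easier to read as a ratio. A minimal C++ sketch, assuming a row is just a list of timings in column order; `FastestIndex` and `Speedup` are hypothetical helpers for reading the tables, not part of the benchmark code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the smallest timing in a table row,
// i.e. the column that would be printed in bold.
inline std::size_t FastestIndex(const std::vector<double>& row) {
  assert(!row.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < row.size(); ++i) {
    if (row[i] < row[best]) best = i;
  }
  return best;
}

// Hypothetical helper: how many times longer `other` took than `reference`.
// A value above 1 means `reference` was faster.
inline double Speedup(double reference, double other) {
  assert(reference > 0.0);
  return other / reference;
}
```

For the updated exponent 23 row on the i9, `FastestIndex({0.281398, 0.394068, 0.792004, 0.372566})` is 0 (Tachyon), and `Speedup(0.281398, 0.394068)` is about 1.40 relative to Arkworks.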

#### On Mac M3 Pro

| Exponent | Tachyon | Arkworks | Bellman | Halo2 |
| :------: | ------------ | ------------ | -------- | -------- |
| 16 | **0.002735** | 0.003468 | 0.007731 | 0.006372 |
| 17 | **0.005237** | 0.006043 | 0.015891 | 0.012804 |
| 18 | **0.009494** | 0.010686 | 0.027312 | 0.02485 |
| 19 | 0.020251 | **0.020156** | 0.055652 | 0.045714 |
| 20 | **0.038186** | 0.040006 | 0.110531 | 0.096778 |
| 21 | **0.085204** | 0.087181 | 0.228044 | 0.191695 |
| 22 | **0.166863** | 0.179635 | 0.472941 | 0.386844 |
| 23 | **0.347128** | 0.378249 | 0.970552 | 0.814043 |
| Exponent | Tachyon | Arkworks | Bellman | Halo2 |
| :------: | ------------ | -------- | -------- | -------- |
| 16 | **0.002526** | 0.003804 | 0.00784 | 0.005689 |
| 17 | **0.004694** | 0.005769 | 0.015577 | 0.01121 |
| 18 | **0.009246** | 0.010243 | 0.027834 | 0.022379 |
| 19 | **0.018328** | 0.020404 | 0.055661 | 0.041394 |
| 20 | **0.039683** | 0.041085 | 0.110702 | 0.086299 |
| 21 | **0.079138** | 0.087336 | 0.230857 | 0.175599 |
| 22 | **0.166646** | 0.177959 | 0.474296 | 0.352872 |
| 23 | **0.33996** | 0.363612 | 0.971581 | 0.748284 |

![image](/benchmark/fft/fft_benchmark_mac_m3.png)

### IFFT

```shell
bazel run --config opt --//:has_matplotlib //benchmark/fft:fft_benchmark -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --vendor arkworks --vendor bellman --vendor halo2 --run_ifft --check_results
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft:fft_benchmark -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --vendor arkworks --vendor bellman --vendor halo2 --run_ifft --check_results
```

#### On Intel i9-13900K

| Exponent | Tachyon | Arkworks | Bellman | Halo2 |
| :------: | ------------ | ------------ | -------- | ----------- |
| 16 | 0.003078 | 0.004531 | 0.007794 | **0.00297** |
| 17 | 0.011666 | **0.005005** | 0.012804 | 0.005309 |
| 18 | **0.005614** | 0.009204 | 0.025717 | 0.009741 |
| 19 | **0.007625** | 0.015332 | 0.050253 | 0.018729 |
| 20 | **0.016751** | 0.030142 | 0.111549 | 0.041873 |
| 21 | **0.039565** | 0.0715 | 0.222403 | 0.098125 |
| 22 | **0.140152** | 0.181124 | 0.415709 | 0.188011 |
| 23 | **0.317353** | 0.400472 | 0.845031 | 0.407396 |
| Exponent | Tachyon | Arkworks | Bellman | Halo2 |
| :------: | ------------ | -------- | -------- | ------------ |
| 16 | **0.001392** | 0.012028 | 0.009913 | 0.002413 |
| 17 | **0.002511** | 0.00427 | 0.01418 | 0.005731 |
| 18 | 0.01762 | 0.021167 | 0.034676 | **0.010811** |
| 19 | **0.009646** | 0.01447 | 0.058714 | 0.016038 |
| 20 | **0.030303** | 0.034815 | 0.104936 | 0.05337 |
| 21 | **0.047463** | 0.072579 | 0.199788 | 0.093146 |
| 22 | **0.146697** | 0.181389 | 0.391296 | 0.19874 |
| 23 | **0.285937** | 0.403596 | 0.82276 | 0.347876 |

![image](/benchmark/fft/ifft_benchmark_ubuntu_i9.png)

#### On Mac M3 Pro

| Exponent | Tachyon | Arkworks | Bellman | Halo2 |
| :------: | ------------ | -------- | -------- | -------- |
| 16 | **0.002766** | 0.004274 | 0.007948 | 0.006638 |
| 17 | **0.005883** | 0.006978 | 0.016308 | 0.013121 |
| 18 | **0.010532** | 0.012815 | 0.029066 | 0.028791 |
| 19 | **0.020781** | 0.024054 | 0.059351 | 0.048824 |
| 20 | **0.041061** | 0.048806 | 0.11825 | 0.099004 |
| 21 | **0.090855** | 0.101232 | 0.236775 | 0.210805 |
| 22 | **0.170776** | 0.203109 | 0.488306 | 0.423618 |
| 23 | **0.383255** | 0.454968 | 1.03129 | 0.881795 |
| 16 | **0.002798** | 0.003867 | 0.008102 | 0.005665 |
| 17 | **0.004882** | 0.005737 | 0.015998 | 0.011672 |
| 18 | **0.010308** | 0.010962 | 0.028118 | 0.022723 |
| 19 | **0.018724** | 0.021338 | 0.056855 | 0.042554 |
| 20 | **0.037687** | 0.043237 | 0.113848 | 0.089899 |
| 21 | **0.078429** | 0.092134 | 0.234585 | 0.174939 |
| 22 | **0.162542** | 0.189442 | 0.484644 | 0.361127 |
| 23 | **0.338646** | 0.392674 | 0.989173 | 0.765592 |

![image](/benchmark/fft/ifft_benchmark_mac_m3.png)

@@ -94,41 +97,41 @@ bazel run --config opt --//:has_matplotlib //benchmark/fft:fft_benchmark -- -k 1
### FFT

```shell
bazel run --config opt --config cuda --//:has_matplotlib //benchmark/fft:fft_benchmark_gpu -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --check_results
GOMP_SPINCOUNT=0 bazel run --config maxopt --config cuda --//:has_matplotlib //benchmark/fft:fft_benchmark_gpu -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --check_results
```

#### On RTX-4090

| Exponent | Tachyon CPU | Tachyon GPU |
| :------: | ----------- | ------------ |
| 16 | **0.00097** | 0.001231 |
| 17 | 0.002156 | **0.000667** |
| 18 | 0.003524 | **0.001297** |
| 19 | 0.007366 | **0.002654** |
| 20 | 0.015787 | **0.005877** |
| 21 | 0.03753 | **0.012573** |
| 22 | 0.122167 | **0.027632** |
| 23 | 0.268875 | **0.055971** |
| 16 | 0.002348 | **0.001** |
| 17 | 0.00204 | **0.001182** |
| 18 | 0.00393 | **0.002211** |
| 19 | 0.009317 | **0.004079** |
| 20 | 0.049204 | **0.008114** |
| 21 | 0.044158 | **0.01616** |
| 22 | 0.134064 | **0.032785** |
| 23 | 0.274101 | **0.066068** |

![image](/benchmark/fft/fft_benchmark_ubuntu_rtx_4090.png)

### IFFT

```shell
bazel run --config opt --config cuda --//:has_matplotlib //benchmark/fft:fft_benchmark_gpu -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --run_ifft --check_results
GOMP_SPINCOUNT=0 bazel run --config maxopt --config cuda --//:has_matplotlib //benchmark/fft:fft_benchmark_gpu -- -k 16 -k 17 -k 18 -k 19 -k 20 -k 21 -k 22 -k 23 --run_ifft --check_results
```

#### On RTX-4090

| Exponent | Tachyon | Tachyon GPU |
| :------: | -------- | ------------ |
| 16 | 0.000993 | **0.000833** |
| 17 | 0.001673 | **0.000643** |
| 18 | 0.003533 | **0.001305** |
| 19 | 0.007446 | **0.002701** |
| 20 | 0.016039 | **0.005882** |
| 21 | 0.03786 | **0.012817** |
| 22 | 0.126032 | **0.027767** |
| 23 | 0.32731 | **0.056064** |
| 16 | 0.002138 | **0.001341** |
| 17 | 0.00488 | **0.000933** |
| 18 | 0.003887 | **0.002502** |
| 19 | 0.00896 | **0.003806** |
| 20 | 0.017953 | **0.007745** |
| 21 | 0.043787 | **0.016268** |
| 22 | 0.132048 | **0.033012** |
| 23 | 0.291132 | **0.066022** |

![image](/benchmark/fft/ifft_benchmark_ubuntu_rtx_4090.png)
Binary file modified benchmark/fft/fft_benchmark_mac_m3.png
Binary file modified benchmark/fft/fft_benchmark_ubuntu_i9.png
Binary file modified benchmark/fft/fft_benchmark_ubuntu_rtx_4090.png
Binary file modified benchmark/fft/ifft_benchmark_mac_m3.png
Binary file modified benchmark/fft/ifft_benchmark_ubuntu_i9.png
Binary file modified benchmark/fft/ifft_benchmark_ubuntu_rtx_4090.png
84 changes: 47 additions & 37 deletions benchmark/fft_batch/README.md
@@ -4,6 +4,7 @@

```
Run on 13th Gen Intel(R) Core(TM) i9-13900K (32 X 5500 MHz CPU s)
Compiler: clang-15
CPU Caches:
L1 Data 48 KiB (x16)
L1 Instruction 32 KiB (x16)
@@ -17,70 +18,79 @@ CPU Caches:
L2 Unified 4096 KiB (x12)
```

### FFTBatch
Note: Run with `build --@rules_rust//:extra_rustc_flags="-Ctarget-cpu=native"` in your .bazelrc.user

```shell
bazel run --config opt --//:has_matplotlib //benchmark/fft_batch:fft_batch_benchmark -- -k 20 -k 21 -k 22 -k 23 -k 24 -k 25 -k 26 --vendor plonky3 -p baby_bear --check_results
```
### FFTBatch

WARNING: On Mac M3, tests beyond degree 24 are not feasible due to memory constraints.

#### On Intel i9-13900K

```shell
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft_batch:fft_batch_benchmark -- -k 20 -k 21 -k 22 -k 23 -k 24 -k 25 -k 26 --vendor plonky3 -p baby_bear --check_results
```

| Exponent | Tachyon | Plonky3 |
| :------- | ------------ | ------------ |
| 20 | 0.117925 | **0.110098** |
| 21 | 0.222959 | **0.218505** |
| 22 | 0.459209 | **0.447758** |
| 23 | 0.97874 | **0.964644** |
| 24 | 2.09675 | **2.092210** |
| 25 | **6.20441** | 6.98453 |
| 26 | **18.6084** | 20.7476 |
| 20 | **0.092595** | 0.094762 |
| 21 | **0.191168** | 0.193567 |
| 22 | 0.406239 | **0.384377** |
| 23 | 0.892501 | **0.842694** |
| 24 | 1.91177 | **1.90586** |
| 25 | **5.82862** | 7.34128 |
| 26 | **17.1807** | 20.3968 |

![image](/benchmark/fft_batch/fft_batch_benchmark_ubuntu_i9.png)

#### On Mac M3 Pro

| Exponent | Tachyon | Plonky3 |
| :------- | --------- | ------------ |
| 20 | 0.132521 | **0.072505** |
| 21 | 0.287744 | **0.140527** |
| 22 | 0.588894 | **0.280177** |
| 23 | 1.17446 | **0.621024** |
| 24 | 3.17213 | **2.399220** |
```shell
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft_batch:fft_batch_benchmark -- -k 20 -k 21 -k 22 -k 23 -k 24 --vendor plonky3 -p baby_bear --check_results
```

| Exponent | Tachyon | Plonky3 |
| :------- | -------- | ------------ |
| 20 | 0.083416 | **0.066952** |
| 21 | 0.194191 | **0.138168** |
| 22 | 0.408045 | **0.299547** |
| 23 | 0.955439 | **0.679252** |
| 24 | 11.8495 | **6.47188** |

![image](/benchmark/fft_batch/fft_batch_benchmark_mac_m3.png)

### CosetLDEBatch

```shell
bazel run --config opt --//:has_matplotlib //benchmark/fft_batch:fft_batch_benchmark -- -k 20 -k 21 -k 22 -k 23 -k 24 -k 25 --vendor plonky3 -p baby_bear --run_coset_lde --check_results
```

WARNING: On Mac M3, tests beyond degree 24 are not feasible due to memory constraints.
WARNING: Due to memory constraints, tests beyond degree 25 are not feasible on Intel i9-13900K, and tests beyond degree 24 are not feasible on Mac M3.
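
A rough way to see how quickly these runs grow is to count the bytes held by the input and output matrices. The sketch below is only an estimate under stated assumptions: each field element is taken to occupy 4 bytes (a 31-bit field stored in 32 bits), the column count is a free parameter because it does not appear in this README, and `MatrixBytes` and `CosetLdeBytes` are hypothetical helpers. It ignores twiddle factors and any scratch buffers.

```cpp
#include <cassert>
#include <cstdint>

// Bytes for a matrix with 2^log_rows rows and `cols` columns.
// bytes_per_element = 4 is an assumption (31-bit element in 32 bits).
inline std::uint64_t MatrixBytes(unsigned log_rows, std::uint64_t cols,
                                 std::uint64_t bytes_per_element = 4) {
  assert(log_rows < 64);
  return (std::uint64_t{1} << log_rows) * cols * bytes_per_element;
}

// Input plus output of a coset LDE whose output has 2^added_bits times
// as many rows as its input (kAddedBits is 1 in fft_batch_runner.h).
inline std::uint64_t CosetLdeBytes(unsigned log_rows, std::uint64_t cols,
                                   unsigned added_bits) {
  return MatrixBytes(log_rows, cols) +
         MatrixBytes(log_rows + added_bits, cols);
}
```

With a purely illustrative 100 columns, `CosetLdeBytes(25, 100, 1)` is 40265318400 bytes (37.5 GiB) for the two matrices alone, and each extra exponent doubles that.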

#### On Intel i9-13900K

| Exponent | Tachyon | Plonky3 |
| :------- | ------------ | -------- |
| 20 | **0.414096** | 0.783275 |
| 21 | **0.828539** | 1.47701 |
| 22 | **1.784080** | 3.06198 |
| 23 | **3.673930** | 6.49181 |
| 24 | **9.325390** | 16.2383 |
| 25 | **25.66560** | 41.3335 |
```shell
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft_batch:fft_batch_benchmark -- -k 20 -k 21 -k 22 -k 23 -k 24 -k 25 --vendor plonky3 -p baby_bear --run_coset_lde --check_results
```

| Exponent | Tachyon | Plonky3 |
| :------- | ----------- | -------- |
| 20 | **0.46917** | 0.639744 |
| 21 | **0.92528** | 1.2923 |
| 22 | **1.87363** | 2.68427 |
| 23 | **4.06008** | 5.67987 |
| 24 | **9.6627** | 14.6164 |
| 25 | **25.7953** | 39.5498 |

![image](/benchmark/fft_batch/coset_lde_batch_benchmark_ubuntu_i9.png)

#### On Mac M3 Pro

```shell
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fft_batch:fft_batch_benchmark -- -k 20 -k 21 -k 22 -k 23 -k 24 --vendor plonky3 -p baby_bear --run_coset_lde --check_results
```

| Exponent | Tachyon | Plonky3 |
| :------- | ------------ | ------------ |
| 18 | 0.100942 | **0.086087** |
| 19 | 0.214471 | **0.182212** |
| 20 | 0.481229 | **0.359246** |
| 21 | **0.981806** | 1.518190 |
| 22 | 3.86094 | **3.244580** |
| 23 | 7.50879 | **6.052250** |
| 20 | **0.318485** | 0.323865 |
| 21 | 0.667106 | **0.660975** |
| 22 | **1.44873** | 3.40795 |
| 23 | 8.27201 | **5.91238** |
| 24 | 39.9678 | **23.1033** |

![image](/benchmark/fft_batch/coset_lde_batch_benchmark_mac_m3.png)
Binary file modified benchmark/fft_batch/coset_lde_batch_benchmark_mac_m3.png
Binary file modified benchmark/fft_batch/coset_lde_batch_benchmark_ubuntu_i9.png
Binary file modified benchmark/fft_batch/fft_batch_benchmark_mac_m3.png
Binary file modified benchmark/fft_batch/fft_batch_benchmark_ubuntu_i9.png
4 changes: 3 additions & 1 deletion benchmark/fft_batch/fft_batch_runner.h
@@ -35,16 +35,18 @@ class FFTBatchRunner {
     math::RowMajorMatrix<F> result;
     std::unique_ptr<Domain> domain =
         Domain::Create(static_cast<size_t>(input.rows()));
-    base::TimeTicks start = base::TimeTicks::Now();
+    base::TimeTicks start;
     if (run_coset_lde) {
       const size_t kAddedBits = 1;
       result =
           math::RowMajorMatrix<F>(input.rows() << kAddedBits, input.cols());
+      start = base::TimeTicks::Now();
       domain->CosetLDEBatch(input, kAddedBits,
                             F::FromMontgomery(F::Config::kSubgroupGenerator),
                             result);
     } else {
       result = input;
+      start = base::TimeTicks::Now();
       domain->FFTBatch(result);
     }
     reporter_.AddTime(vendor, base::TimeTicks::Now() - start);
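
The change to `fft_batch_runner.h` starts the clock only after the result matrix has been allocated or copied, so the reported time covers just the transform call. The same pattern as a standalone sketch, using `std::chrono` in place of Tachyon's `base::TimeTicks`; `TimeKernelOnly` is a hypothetical helper and not something in the repository:

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `setup` outside the timed region, then returns how long `kernel` took.
inline std::chrono::steady_clock::duration TimeKernelOnly(
    const std::function<void()>& setup, const std::function<void()>& kernel) {
  assert(setup && kernel);
  setup();  // e.g. allocate or copy buffers; not measured
  const auto start = std::chrono::steady_clock::now();
  kernel();  // the only work that is measured
  return std::chrono::steady_clock::now() - start;
}
```

Keeping allocation out of the timed region matters most for large matrices, where the setup cost can otherwise be a visible share of the measurement.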
31 changes: 17 additions & 14 deletions benchmark/fri/README.md
@@ -2,8 +2,9 @@

## CPU

```bash
```
Run on 13th Gen Intel(R) Core(TM) i9-13900K (32 X 5500 MHz CPU s)
Compiler: clang-15
CPU Caches:
L1 Data 48 KiB (x16)
L1 Instruction 32 KiB (x16)
@@ -17,32 +18,34 @@ CPU Caches:
L2 Unified 4096 KiB (x12)
```

Note: Run with `build --@rules_rust//:extra_rustc_flags="-Ctarget-cpu=native"` in your .bazelrc.user

```shell
bazel run --config opt --//:has_matplotlib //benchmark/fri:fri_benchmark -- -k 18 -k 19 -k 20 -k 21 -k 22 --batch_size 100 --input_num 4 --round_num 4 --log_blowup 2 --vendor plonky3 --check_results
GOMP_SPINCOUNT=0 bazel run --config maxopt --//:has_matplotlib //benchmark/fri:fri_benchmark -- -k 18 -k 19 -k 20 -k 21 -k 22 --batch_size 100 --input_num 4 --round_num 4 --log_blowup 2 --vendor plonky3 --check_results
```
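
The size flags in this command are logarithms. The sketch below assumes, as the flag names suggest but this README does not state, that `-k` is the log2 of the input row count and `--log_blowup` is the log2 of the FRI blowup factor; `BlowupFactor` and `ExtendedRows` are hypothetical helpers:

```cpp
#include <cassert>
#include <cstdint>

// Blowup factor implied by a log2 blowup parameter.
inline std::uint64_t BlowupFactor(unsigned log_blowup) {
  assert(log_blowup < 64);
  return std::uint64_t{1} << log_blowup;
}

// Rows of the extended domain for an input of 2^k rows,
// under the assumption that the two logarithms simply add.
inline std::uint64_t ExtendedRows(unsigned k, unsigned log_blowup) {
  assert(k + log_blowup < 64);
  return std::uint64_t{1} << (k + log_blowup);
}
```

Under those assumptions, `BlowupFactor(2)` is 4 and `ExtendedRows(22, 2)` is 16777216 rows for the largest exponent in the command.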

## On Intel i9-13900K

| Exponent | Tachyon | Plonky3 |
| :------- | ----------- | ------- |
| 18 | **2.97871** | 3.73433 |
| 19 | **5.76021** | 7.22556 |
| 20 | **11.2744** | 14.3306 |
| 21 | **22.5167** | 28.8935 |
| 22 | **47.6511** | 58.5402 |
| 18 | **1.59124** | 2.36518 |
| 19 | **2.87866** | 4.65791 |
| 20 | **6.06711** | 9.5114 |
| 21 | **12.1177** | 19.0475 |
| 22 | **24.4839** | 38.4716 |

![image](/benchmark/fri/fri_benchmark_ubuntu_i9.png)

## On Mac M3 Pro

WARNING: On Mac M3, high degree tests are not feasible due to memory constraints.

| Exponent | Tachyon | Plonky3 |
| :------- | ------- | ------------ |
| 18 | 3.68509 | **1.39107** |
| 19 | 7.37079 | **2.76483** |
| 20 | 14.9081 | **5.62375** |
| 21 | 30.3153 | **11.8295** |
| 22 | 64.8022 | **25.4490** |
| Exponent | Tachyon | Plonky3 |
| :------- | ------- | ----------- |
| 18 | 3.96588 | **2.92354** |
| 19 | 7.95329 | **5.89079** |
| 20 | 15.8636 | **11.8225** |
| 21 | 46.1967 | **34.4965** |
| 22 | 182.084 | **124.7** |

![image](/benchmark/fri/fri_benchmark_mac_m3.png)
Binary file modified benchmark/fri/fri_benchmark_mac_m3.png
Binary file modified benchmark/fri/fri_benchmark_ubuntu_i9.png