forky: benchmark badger vs forky #2143
base: fcds-teenage-mutants
Conversation
Can you send instructions on how you managed to run the benchmarks referenced in the description on the fcds-teenage-mutants branch? These benchmarks do not exist there, and a lot of testing code is changed in this PR. I think it is important for a reviewer to be able to reproduce the results you presented.
Could you make this code buildable? It fails with the following errors when running:
cd to the storage/fcds/leveldb directory. For every test, there are benchmarks for 10,000, 100k and 1 million chunks. I have commented out the 10k and 100k tests for brevity. If you want, enable them too and run them to see the performance for smaller sets.
You have to run go get -u github.com/dgraph-io/badger to fetch all of badger's dependency packages into your vendor directory and get rid of these errors.
Fixed it.
@janos This PR is just for benchmarking, not for merging, so please ignore the conflicts.
I am sorry, but this PR is not reviewable, and the benchmarks are not reproducible without your direct assistance. I cannot say that I trust your measurements if I cannot reproduce them.
It is also unclear to me how the benchmarks were done. It's very difficult to compare head to head the usual way (checking out different branches and running the same benchmarks in the same directory). Also, from the benchmarks that I ran on the branch, I saw that the benchmark tool's own per-operation analysis is not used; instead, other CLI printouts report how long it took to insert/read/do some operation. I'm guessing you just divided that by the number of chunks measured within that run (this was also evident since the results you pasted were not in the Go benchmark tool's output format). Having the benchmark tool measure the time per operation has more significance in my opinion.
Oops, my mistake. I pushed the benchmarks to the fcds-teenage-mutants branch now. Please check.
Yes, you don't need any of the leveldb stuff you used to store metadata. Badger stores both metadata and chunks, in different places, similar to your forky implementation. That's why you find only badger-related stuff, and I have removed everything related to forky. The benchmarks (like in fcds-teenage-mutants) are in the fcds_test.go file. This benchmark file is exactly the same as the benchmark file in the fcds-teenage-mutants branch.
Thanks @jmozah. It would be very nice if all required code and instructions were shared, so that reviewers do not waste time figuring out how and why something is measured or not working.
I am not sure I understand this. I checked out the two branches and ran the same benchmarks on them.
The benchmarks contain other prep steps, like adding a base of 1 million items before starting the benchmark. Since I wanted to avoid skewing the benchmarks with those preparations, I calculated the time myself. BTW, the benchmark tool also outputs the time per operation, and you can check that too.
Having looked briefly at the benchmarks: the measurement currently includes the setup stage. This should be mitigated by stopping the benchmark timer and starting it again after the setup stage.
How to benchmark: compare to the latest forky. Measurements were done on a general-purpose DigitalOcean droplet with 32 GB RAM and a 100 GB SSD that demonstrated a steady throughput of 1 GB per second when executing:
This PR replaces the latest fcds branch with badger in order to do benchmarking.
Below are the benchmark results for writing, reading and deleting 1 million chunks, in seconds.