
[merkledb benchmark] implement simple write profile benchmark #3372

Open · wants to merge 44 commits into base: master

Conversation

tsachiherman (Contributor):

Why this should be merged

How this works

How this was tested

)

func getMerkleDBConfig(promRegistry prometheus.Registerer) merkledb.Config {
	const defaultHistoryLength = 300

Contributor Author:

done.

	Hasher:             merkledb.DefaultHasher,
	RootGenConcurrency: 0,
	HistoryLength:      defaultHistoryLength,
	ValueNodeCacheSize: units.MiB,
Contributor:

These seem really small. If I'm reading this correctly, there is about 2 MiB of total cache?

Contributor Author:

I attempted to tweak this value, but it had no performance impact.

}
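For context, the helper under discussion assembles to roughly the following. This is a sketch built only from the fields quoted in this thread; wiring the registry through a Reg field is an assumption, not something shown in the diff.

```go
package bench

import (
	"github.com/ava-labs/avalanchego/utils/units"
	"github.com/ava-labs/avalanchego/x/merkledb"

	"github.com/prometheus/client_golang/prometheus"
)

// Sketch of the benchmark's MerkleDB configuration, assembled from the
// snippets quoted above; fields not quoted in the thread are omitted.
func getMerkleDBConfig(promRegistry prometheus.Registerer) merkledb.Config {
	const defaultHistoryLength = 300
	return merkledb.Config{
		Hasher:                    merkledb.DefaultHasher,
		RootGenConcurrency:        0,
		HistoryLength:             defaultHistoryLength,
		ValueNodeCacheSize:        units.MiB,
		IntermediateNodeCacheSize: 1024 * units.MiB,
		Reg:                       promRegistry, // assumption: registry wiring not shown in the diff
	}
}
```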

fmt.Printf("Initializing database.")
ticksCh := make(chan interface{})
Contributor:

nit: do you really need this ticker? Might be easier to report every 100k rows or something.

Contributor Author:

Reporting every 100k rows doesn't work nicely because of the batch writing (which blocks for a long time).
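A time-based reporter sidesteps the blocking issue. Here is a minimal sketch of the ticker approach, assuming the write loop bumps an atomic counter; startProgressReporter and entriesWritten are illustrative names, not the PR's:

```go
package bench

import (
	"fmt"
	"sync/atomic"
	"time"
)

// startProgressReporter prints progress once per second, independent of how
// long any individual batch write blocks. Closing done stops the goroutine.
func startProgressReporter(entriesWritten *atomic.Uint64, done <-chan struct{}) {
	go func() {
		ticker := time.NewTicker(time.Second)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				fmt.Printf("wrote %d entries\n", entriesWritten.Load())
			case <-done:
				return
			}
		}
	}()
}
```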


const (
	defaultDatabaseEntries    = 2000000
	databaseCreationBatchSize = 1000000
Contributor:

Batch size is supposed to be 10k. This is 1M.

Contributor Author:

done.
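With that fix applied, the constants presumably read:

```go
const (
	defaultDatabaseEntries    = 2000000
	databaseCreationBatchSize = 10000 // was 1000000; the intended batch size is 10k
)
```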

return err
}
}
deleteDuration = time.Since(startDeleteTime)
Contributor:

You can avoid all this math and just report the raw number of deletes. Grafana can convert this to a rate for you.

Contributor Author:

I've added both. I believe that my calculation would be more accurate, but let's have both for the time being.

Comment on lines +60 to +79
	deleteRate = prometheus.NewGauge(prometheus.GaugeOpts{
		Namespace: "merkledb_bench",
		Name:      "entry_delete_rate",
		Help:      "The rate at which elements are deleted",
	})
	updateRate = prometheus.NewGauge(prometheus.GaugeOpts{
		Namespace: "merkledb_bench",
		Name:      "entry_update_rate",
		Help:      "The rate at which elements are updated",
	})
	insertRate = prometheus.NewGauge(prometheus.GaugeOpts{
		Namespace: "merkledb_bench",
		Name:      "entry_insert_rate",
		Help:      "The rate at which elements are inserted",
	})
	batchWriteRate = prometheus.NewGauge(prometheus.GaugeOpts{
		Namespace: "merkledb_bench",
		Name:      "batch_write_rate",
		Help:      "The rate at which the batch was written",
	})
Contributor:

We should not be calculating the rates in the benchmark. The Prometheus server should do this based on the counts.

Suggested change
	deleteRate = prometheus.NewGauge(prometheus.GaugeOpts{
		Namespace: "merkledb_bench",
		Name:      "entry_delete_rate",
		Help:      "The rate at which elements are deleted",
	})
	updateRate = prometheus.NewGauge(prometheus.GaugeOpts{
		Namespace: "merkledb_bench",
		Name:      "entry_update_rate",
		Help:      "The rate at which elements are updated",
	})
	insertRate = prometheus.NewGauge(prometheus.GaugeOpts{
		Namespace: "merkledb_bench",
		Name:      "entry_insert_rate",
		Help:      "The rate at which elements are inserted",
	})
	batchWriteRate = prometheus.NewGauge(prometheus.GaugeOpts{
		Namespace: "merkledb_bench",
		Name:      "batch_write_rate",
		Help:      "The rate at which the batch was written",
	})

Contributor Author:

I believe that it won't generate accurate results, since we're mixing batch writes and puts in the same sequence.
I've included both the counter and the rate metrics so that we can get both numbers in Grafana.
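For reference, the counter-based shape the suggestion points at would look something like this (illustrative names, one counter per operation; a Grafana panel can then derive the rate with e.g. rate(merkledb_bench_entries_deleted[1m])):

```go
package bench

import "github.com/prometheus/client_golang/prometheus"

// Illustrative counter equivalents of the gauges above: the benchmark
// exports raw totals and the Prometheus server computes the rates.
var (
	deleteCount = prometheus.NewCounter(prometheus.CounterOpts{
		Namespace: "merkledb_bench",
		Name:      "entries_deleted",
		Help:      "Total number of deleted entries",
	})
	insertCount = prometheus.NewCounter(prometheus.CounterOpts{
		Namespace: "merkledb_bench",
		Name:      "entries_inserted",
		Help:      "Total number of inserted entries",
	})
)
```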

Comment on lines 214 to 223
	err = mdb.Close()
	if err != nil {
		fmt.Fprintf(os.Stderr, "unable to close levelDB database : %v\n", err)
		return err
	}
	err = levelDB.Close()
	if err != nil {
		fmt.Fprintf(os.Stderr, "unable to close merkleDB database : %v\n", err)
		return err
	}
Contributor:

The logs seem inverted here

Contributor Author:

fixed.
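The fix presumably swaps the two messages so each matches the database actually being closed:

```go
if err := mdb.Close(); err != nil {
	fmt.Fprintf(os.Stderr, "unable to close merkleDB database : %v\n", err)
	return err
}
if err := levelDB.Close(); err != nil {
	fmt.Fprintf(os.Stderr, "unable to close levelDB database : %v\n", err)
	return err
}
```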

Comment on lines +90 to +91
	ValueNodeCacheSize:        units.MiB,
	IntermediateNodeCacheSize: 1024 * units.MiB,
Contributor:

How much memory are we using? Feels like we could probably increase these

Contributor Author:

I've attempted to tweak these, but haven't seen any concrete gains. Adjusting the levelDB config was helpful, though.
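The thread doesn't show which levelDB settings were changed. As a hedged illustration, tuning via syndtr/goleveldb (the backend used by avalanchego's levelDB wrapper) typically means options along these lines, with placeholder values:

```go
package bench

import "github.com/syndtr/goleveldb/leveldb/opt"

// Placeholder tuning knobs; the concrete values the PR settled on are not
// shown in this thread.
var levelDBOptions = &opt.Options{
	BlockCacheCapacity:     512 * opt.MiB, // cache for data blocks
	WriteBuffer:            256 * opt.MiB, // memtable size before a flush
	OpenFilesCacheCapacity: 1024,          // file-descriptor cache size
}
```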

	startUpdateTime := time.Now()
	for keyToUpdateIdx := low + ((*databaseEntries) / 2); keyToUpdateIdx < low+((*databaseEntries)/2)+databaseRunningUpdateSize; keyToUpdateIdx++ {
		updateEntryKey := calculateIndexEncoding(keyToUpdateIdx)
		updateEntryValue := calculateIndexEncoding(keyToUpdateIdx - ((*databaseEntries) / 2))
Contributor:

This is incorrect, should be:

Suggested change
updateEntryValue := calculateIndexEncoding(keyToUpdateIdx - ((*databaseEntries) / 2))
updateEntryValue := calculateIndexEncoding(low)

Contributor Author:

Hmm, I think it's ok to use low as you suggested, although the expression above would yield distinct, unique values per key (i.e. [low..low+5k]).
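A worked example of the difference, with hypothetical numbers:

```go
package main

import "fmt"

func main() {
	const entries, low, updateSize = 10000, 0, 3
	for i := low + entries/2; i < low+entries/2+updateSize; i++ {
		// the PR's expression: each updated key gets a distinct value index
		// (0, 1, 2, ...); the suggested calculateIndexEncoding(low) would
		// instead write value index 0 to every updated key.
		fmt.Println("key index:", i, "value index:", i-entries/2)
	}
}
```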

	levelDB.Close()
}()

low := uint64(0)
Contributor:

low never changes; it should be increased by 2.5k each pass.

Contributor Author:

good catch; fixed.
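A minimal sketch of the fixed loop shape (the 2.5k step comes from the comment above; the names and pass count are illustrative):

```go
package main

import "fmt"

func main() {
	const passSize = 2500 // advance the window by 2.5k each pass
	low := uint64(0)
	for pass := 0; pass < 4; pass++ {
		// each pass deletes keys in [low, low+passSize) and inserts
		// replacements above the window; here we just print the range
		fmt.Printf("pass %d: window [%d, %d)\n", pass, low, low+passSize)
		low += passSize
	}
}
```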


This PR has become stale because it has been open for 30 days with no activity. Adding the lifecycle/frozen label will cause this PR to ignore lifecycle events.
