accounts-db: send batches of accounts to the background hasher #810
Conversation
So in bench-tps the accounts are very small, which might be biasing the results here. Any idea what this might look like with larger accounts under normal validator operations? Also, do you know how often we are sending these batches? Does it get any queueing behavior where, while we are working on a batch, there is another new batch already in the channel? If so, I wonder about the effect of a slightly more complicated loop structure on the recv side, similar to something we do for the new banking stage scheduler:

    loop {
        let batch = receiver.recv()?; // blocking receive call
        // chain the initially received batch with non-blocking calls to try_recv
        // (can do this manually, or with the one-liner below)
        for batch in std::iter::once(batch).chain(receiver.try_iter()) {
            // ... hashing
        }
    }

If we are receiving multiple batches at similar times this may eliminate some additional sleeps?
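For reference, a minimal runnable sketch of that drain pattern, assuming a plain std::sync::mpsc channel and Vec<u64> batches as stand-ins for the real account batches (names here are illustrative, not the actual accounts-db code):

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (sender, receiver) = mpsc::channel::<Vec<u64>>();

        let hasher = thread::spawn(move || {
            // Block once per wake-up, then drain whatever is already queued
            // without going back to sleep between batches.
            while let Ok(first) = receiver.recv() {
                for batch in std::iter::once(first).chain(receiver.try_iter()) {
                    // ... hash each account in `batch` here
                    println!("hashing a batch of {} accounts", batch.len());
                }
            }
        });

        for i in 0..4u64 {
            sender.send(vec![i; 8]).unwrap();
        }
        drop(sender); // closing the channel makes recv() return Err and the thread exit

        hasher.join().unwrap();
    }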
I haven't been able to profile banking on a mnb validator yet; the staked devboxes get a slot every 8+ hours, so they can't really be used for profiling. I'm hoping to get access to a juicier validator sometime this week.
Yeah that's the hope: we send one account batch for each executed tx batch, so many of them should be pretty close in time.
As discussed on discord, yes, conceptually this is a good idea, and in fact crossbeam already does it internally to try and save the syscall cost.
Yeah. Saw the similar pattern and probably jumped the gun with my suggestion. In the code I was referring to, there was some additional context that was kept only because we had back-to-back try_recvs instead of re-fetching it on every recv call. That doesn't exist here, so the suggestion is indeed pointless.
Instead of sending accounts individually, send batches of accounts for background hashing.

Before this change we used to send accounts individually, and the background thread used to do:

    loop {
        let account = receiver.recv();
        account.hash();
        // go back to sleep in recv()
    }

Because most accounts are small and hashing them is very fast, the background thread used to sleep a lot, and required many syscalls from the sender in order to be woken up. Batching reduces the number of syscalls.
Codecov Report: all modified and coverable lines are covered by tests ✅

Additional details and impacted files:

    @@           Coverage Diff            @@
    ##          master      #810     +/-  ##
    ==========================================
    + Coverage       0     81.9%    +81.9%
    ==========================================
      Files          0       853      +853
      Lines          0    231781   +231781
    ==========================================
    + Hits           0    189855   +189855
    - Misses         0     41926    +41926
lgtm
Yes, this looks amazing.
Post-merge review: lgtm, thanks!
Oops, sorry about that, no clue your PR existed. Personally I can't really keep track of GitHub notifications, so if your PR goes ignored for more than a few days, I recommend nagging people on discord until you get some kind of acknowledgement.
all good. no worries. got another PR coming this weekend at the request of someone at fndn
glad this got in