fix: remove mutex in prefixdb #239
Conversation
Codecov Report
```diff
@@            Coverage Diff             @@
##           master     #239      +/-   ##
==========================================
- Coverage   68.54%   68.03%   -0.51%
==========================================
  Files          27       27
  Lines        2130     2093      -37
==========================================
- Hits         1460     1424      -36
+ Misses        595      594       -1
  Partials       75       75
```
In light of the argument in #156, this seems like it should be safe.
I am a bit worried that we don't seem to have any tests that exercise the documented concurrency-safety requirement, though; and there is at least one case where the implementations differ about which operations are allowed to overlap (cf. https://github.com/tendermint/tm-db/blob/master/util_test.go#L29).
Running the race detector can't really help when there is no concurrency for it to check. This PR doesn't create that problem, but it does make it more relevant.
I may need to nuke-and-reinstall my test machine, but I did a bit of a survey of databases recently. That led me to the terra-money/tmdb:performance branch. The speed is super impressive, but I've had an explosion (a concurrent map read and write error). Can you review my method for testing this stuff? I am thinking along these lines:
My chief concern is sync times. They are too long, and this slows down all other work in and on Cosmos. But if we don't pay careful attention to this stuff, we can miss things. For example, there's a state-breaking change somewhere in the v0.42 series (you'll note I didn't mention where, as I don't know :P). But how do I know it's there? I'm sure it's there because I applied Tendermint v0.34.19 and SDK v0.42.11 to Gaia and could not do a genesis sync.
PS: Codecov seems generally broken: https://github.com/cosmos/gaia/runs/6163960772?check_suite_focus=true#step:8:31
I'm not convinced we need anything that elaborate. Even a regular Go test that creates a database instance and does some amount of parallel insertions, lookups, and deletions would give the race detector something to push against. I don't think we need an elaborate multi-process test harness to verify the basic concurrency properties the library is intended to guarantee. Of course, integration tests are nice too, but as far as tm-db itself is concerned, integration is a concern for the clients of the library. Very roughly, I was thinking along these lines:

```go
// Run generates concurrent reads and writes to db so the race detector can
// verify concurrent operations are properly synchronized.
// The contents of db are garbage after Run returns.
func Run(t *testing.T, db tmdb.DB) {
	t.Helper()
	const numWorkers = 10
	const numKeys = 32

	var wg sync.WaitGroup
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		i := i
		go func() {
			defer wg.Done()

			// Insert a bunch of keys with random data.
			for k := 1; k <= numKeys; k++ {
				key := taskKey(i, k) // say, "task-<i>-key-<k>"
				value := someRandomValue()
				if err := db.Set(key, value); err != nil {
					t.Errorf("Task %d: db.Set(%q=%q) failed: %v",
						i, string(key), string(value), err)
				}
			}

			// Iterate over the database to make sure our keys are there.
			it, err := db.Iterator(nil, nil)
			if err != nil {
				t.Errorf("Iterator[%d]: %v", i, err)
				return
			}
			found := make(map[string][]byte)
			mine := []byte(fmt.Sprintf("task-%d-", i))
			for ; it.Valid(); it.Next() {
				if key := it.Key(); bytes.HasPrefix(key, mine) {
					found[string(key)] = it.Value()
				}
			}
			if err := it.Error(); err != nil {
				t.Errorf("Iterator[%d] reported error: %v", i, err)
			}
			if err := it.Close(); err != nil {
				t.Errorf("Close iterator[%d]: %v", i, err)
			}
			if len(found) != numKeys {
				t.Errorf("Task %d: found %d keys, wanted %d", i, len(found), numKeys)
			}

			// Delete all the keys we inserted.
			for key := range found {
				if err := db.Delete([]byte(key)); err != nil {
					t.Errorf("Delete %q: %v", key, err)
				}
			}
		}()
	}
	wg.Wait()
}
```

N.B. this is not tested, but should be the right general flavour.
@creachadair thank you! It might even help me with the PebbleDB implementation.
I've added .gitpod.yml to this. @creachadair, if you wouldn't mind trying the Gitpod browser extension, I think you'd instantly understand why it is a nice-to-have: https://www.gitpod.io/docs/browser-extension. It just saved me a whole bunch of time: I was able to work on the linter issues in the same environment as CI, and didn't need to think about my development environment.
…tr.isInvalid were considered to be ineffective assignments by the linter.
That was dramatically easier than making a dev env, and now it's all set :D!
I think this one is set now.
This looks pretty close, there are just a few details to sort out.
Co-authored-by: M. J. Fromberger <[email protected]>
@creachadair I hope to have the tests moved today; wish me luck...
Is this wanted?
I'd vote for yes.
Generally the scope should be fixed to a single item in PRs. Expanding scope makes it take longer to get things merged.
closes:
#156