Feat: expose BlockKeyCacheSize and enable WriteThrough datastore options #10614

Open: hsanjuan wants to merge 7 commits into base: master

@hsanjuan hsanjuan commented Dec 6, 2024

This enables write-through in the blockstore and blockservice by default. It exposes a new WriteThrough option, plus BlockKeyCacheSize to control the two-queue cache and make it possible to disable it.

@hsanjuan hsanjuan self-assigned this Dec 6, 2024
@hsanjuan hsanjuan requested a review from a team as a code owner December 6, 2024 10:38
hsanjuan commented Dec 6, 2024

We can discuss this a bit, because I am not sure when a non-write-through blockstore/service fully makes sense. Probably on small datastores with very well-primed caches that fit all keys, and when the same keys are rewritten all the time (I'm not sure that is common enough to warrant being the current default).

I don't think non-WriteThrough makes sense without the bloom filter. And the bloom filter is primed on boot, which triggers reading all the keys (horrible in very large datastores).

  • flatfs: I'm guessing the OS provides some caching for directories etc., but Has() calls trigger two levels of directory listing on every block write. That might still be better than writing the block IF the block is already there.
  • Pebble/Badger: Given they can have 30x read amplification, I am not sure it ever makes sense to touch the read-path at all. They can optimize writes internally as they wish.

@hsanjuan hsanjuan force-pushed the fix/perf-pebble-improvements branch from 7f53a7d to cd96506 Compare December 6, 2024 10:51
core/coreapi/coreapi.go (outdated, resolved)
docs/changelogs/v0.33.md (outdated, resolved)
docs/config.md (outdated, resolved)
docs/config.md (outdated, resolved)
core/commands/dag/import.go (outdated, resolved)
@lidel lidel mentioned this pull request Dec 6, 2024
40 tasks
@hsanjuan hsanjuan force-pushed the fix/perf-pebble-improvements branch from 35cbba2 to b4b028c Compare December 10, 2024 11:23
@hsanjuan hsanjuan changed the title Feat: expose BlockKeyCacheSize and enable WriteThrough when bloom filter disabled Feat: expose BlockKeyCacheSize and enable WriteThrough datastore options Dec 10, 2024
@gammazero gammazero self-requested a review December 10, 2024 18:00
Comment on lines +247 to +251
var bsopts []bserv.Option
if cfg.Datastore.WriteThrough {
    bsopts = append(bsopts, bserv.WriteThrough())
}
subAPI.blocks = bserv.New(subAPI.blockstore, subAPI.exchange, bsopts...)

@gammazero gammazero Dec 11, 2024

This is ok, but it might be nicer to have a blockservice option WithWriteThrough that takes a bool. Then the code becomes:

subAPI.blocks = bserv.New(subAPI.blockstore, subAPI.exchange,
    bserv.WithWriteThrough(cfg.Datastore.WriteThrough))

Contributor Author

Right? We need to change that in boxo. I can take care of it, but we can merge this and fix that later for the next boxo release.
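
For illustration, a bool-taking functional option of the kind suggested in this thread could look roughly like the sketch below. Only the name `WithWriteThrough` comes from the review comment; the `service` type, `Option` type and `New` constructor here are hypothetical stand-ins and are not boxo's actual API.

```go
package main

import "fmt"

// service is a hypothetical stand-in for a block service. It is not
// boxo's blockservice type.
type service struct {
	writeThrough bool
}

// Option configures a service at construction time.
type Option func(*service)

// WithWriteThrough takes the setting as a bool, so callers can pass a
// config value directly instead of appending an option conditionally.
// The name follows the review comment; the body is an assumption.
func WithWriteThrough(enabled bool) Option {
	return func(s *service) { s.writeThrough = enabled }
}

// New builds a service and applies the options in order.
func New(opts ...Option) *service {
	s := &service{}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	cfgWriteThrough := true // stands in for cfg.Datastore.WriteThrough
	s := New(WithWriteThrough(cfgWriteThrough))
	fmt.Println(s.writeThrough)
}
```

The design point is that the caller no longer needs an `if` around the option list: the config value flows straight into the constructor call.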

HashOnRead        bool
BloomFilterSize   int
BlockKeyCacheSize OptionalInteger `json:",omitempty"`
WriteThrough      bool
Member

@hsanjuan are you planning to write a migration for existing users (who don't have this in config), or was it intentional to keep existing users stuck with false?

Both feel like a potential headache :)

Perhaps switching this from bool to Flag would allow us to control the implicit default better, without having to write a migration for existing users?

Suggested change
- WriteThrough bool
+ WriteThrough Flag `json:",omitempty"`

Contributor Author

If the option is not present, it will take the default, which is true? Or do you mean the config-parsing code does not know whether it's false or unset?

// without performing any reads to check if the incoming blocks are
// already present in the datastore. Enable for datastores with fast
// writes and slower reads.
DefaultWriteThrough = true
Member

(if we switch to Flag)

Suggested change
- DefaultWriteThrough = true
+ DefaultWriteThrough = True
