Introduce Peapod #2462

Merged: 6 commits, Aug 15, 2023


@cthulhu-rider cthulhu-rider commented Jul 29, 2023

P.S.: we haven't finally decided on the name yet, but peapod is currently winning: #2453 (comment)

TODO:

  • Delete
  • Exists
  • Iterate

I didn't specifically try to optimize the introduced implementation, since I only tested it against existing alternatives. Current benchmark results against a single BoltDB with the Update and Batch methods:

BenchmarkPut/update-size=1-thread=1-8                     100          13390404 ns/op           18830 B/op        107 allocs/op
BenchmarkPut/batch-size=1-thread=1-8                       54          21120972 ns/op           25333 B/op        148 allocs/op
BenchmarkPut/smart-size=1-thread=1-8                      100          11687289 ns/op           13838 B/op         83 allocs/op
BenchmarkPut/update-size=1-thread=20-8                      4         268907087 ns/op          404892 B/op       2166 allocs/op
BenchmarkPut/batch-size=1-thread=20-8                      42          26277377 ns/op          147620 B/op        768 allocs/op
BenchmarkPut/smart-size=1-thread=20-8                      70          17733068 ns/op          114877 B/op        649 allocs/op
BenchmarkPut/update-size=1-thread=100-8                     1        1357502126 ns/op         2270376 B/op      11766 allocs/op
BenchmarkPut/batch-size=1-thread=100-8                     39          31524376 ns/op          681342 B/op       4010 allocs/op
BenchmarkPut/smart-size=1-thread=100-8                     55          21419636 ns/op          659725 B/op       3746 allocs/op
BenchmarkPut/update-size=1024-thread=1-8                  100          13624444 ns/op           24353 B/op        152 allocs/op
BenchmarkPut/batch-size=1024-thread=1-8                    38          29367535 ns/op           36609 B/op        233 allocs/op
BenchmarkPut/smart-size=1024-thread=1-8                   100          13456244 ns/op           21111 B/op        135 allocs/op
BenchmarkPut/update-size=1024-thread=20-8                   3         442465860 ns/op          529232 B/op       3637 allocs/op
BenchmarkPut/batch-size=1024-thread=20-8                   34          35019898 ns/op          199009 B/op       1226 allocs/op
BenchmarkPut/smart-size=1024-thread=20-8                   67          22541568 ns/op          189971 B/op       1108 allocs/op
BenchmarkPut/update-size=1024-thread=100-8                  1        1692022295 ns/op         2679576 B/op      18032 allocs/op
BenchmarkPut/batch-size=1024-thread=100-8                  36          33703418 ns/op          840072 B/op       4796 allocs/op
BenchmarkPut/smart-size=1024-thread=100-8                  51          21807732 ns/op          838428 B/op       4623 allocs/op
BenchmarkPut/update-size=102400-thread=1-8                 62          18693342 ns/op          141131 B/op        192 allocs/op
BenchmarkPut/batch-size=102400-thread=1-8                  48          27474011 ns/op          146738 B/op        234 allocs/op
BenchmarkPut/smart-size=102400-thread=1-8                  64          19075746 ns/op          133554 B/op        167 allocs/op
BenchmarkPut/update-size=102400-thread=20-8                 3         351594258 ns/op         2770773 B/op       3706 allocs/op
BenchmarkPut/batch-size=102400-thread=20-8                 33          37157903 ns/op         2482699 B/op       1340 allocs/op
BenchmarkPut/smart-size=102400-thread=20-8                 50          33805834 ns/op         2481669 B/op       1263 allocs/op
BenchmarkPut/update-size=102400-thread=100-8                1        1999296188 ns/op        13860160 B/op      18681 allocs/op
BenchmarkPut/batch-size=102400-thread=100-8                12         116099877 ns/op        11770452 B/op       5031 allocs/op
BenchmarkPut/smart-size=102400-thread=100-8                12          85920773 ns/op        12766625 B/op       5355 allocs/op
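
The ns/op column is easier to compare as ratios. A minimal sketch follows; the helper name `speedup` is illustrative, the figures are copied from the size=1, thread=100 rows above, and `smart` is presumably the approach introduced in this PR:

```go
package main

import "fmt"

// speedup reports how many times faster the candidate is than the baseline,
// given both timings in ns/op.
func speedup(baselineNs, candidateNs float64) float64 {
	return baselineNs / candidateNs
}

func main() {
	// ns/op values from the size=1, thread=100 rows of the results above.
	const (
		updateNs = 1357502126
		batchNs  = 31524376
		smartNs  = 21419636
	)
	fmt.Printf("smart vs update: %.1fx\n", speedup(updateNs, smartNs))
	fmt.Printf("smart vs batch:  %.2fx\n", speedup(batchNs, smartNs))
}
```

With these numbers the gap to plain Update is large under contention, while the gap to Batch is much smaller.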


codecov bot commented Jul 29, 2023

Codecov Report

Merging #2462 (fb1bcac) into master (86c5d3b) will increase coverage by 0.35%.
The diff coverage is 68.61%.

❗ Current head fb1bcac differs from pull request most recent head b01baf6. Consider uploading reports for the commit b01baf6 to get more accurate results

@@            Coverage Diff             @@
##           master    #2462      +/-   ##
==========================================
+ Coverage   29.30%   29.65%   +0.35%     
==========================================
  Files         399      401       +2     
  Lines       30347    30620     +273     
==========================================
+ Hits         8892     9081     +189     
- Misses      20719    20773      +54     
- Partials      736      766      +30     
Files Changed                                           Coverage Δ
cmd/neofs-node/config.go                                0.00% <0.00%> (ø)
...kg/local_object_storage/blobstor/common/storage.go   52.94% <52.94%> (ø)
pkg/local_object_storage/blobstor/peapod/peapod.go      73.47% <73.47%> (ø)
cmd/neofs-node/validate.go                              45.83% <100.00%> (ø)

... and 1 file with indirect coverage changes


err := x.bolt.View(func(tx *bbolt.Tx) error {
	bktRoot := tx.Bucket(rootBucket)
	if bktRoot == nil {
		return fmt.Errorf("%w: missing root bucket", apistatus.ErrObjectNotFound)
Contributor

We usually use the fmt.Errorf("some text: %w", err) pattern. This function uses two different patterns for errors: with %w at the end and with %w at the beginning.

Is this a special case that justifies changing the format here?

Contributor Author

Usually the reason goes after the colon, and here "missing root bucket" is a particular reason for "object not found" (there may be other reasons). We could just return apistatus.ErrObjectNotFound here, but additional context is good.

You may encounter similar cases in the code base.

func (x *Peapod) flushLoop() {
	defer close(x.chFlushDone)

	ticker := time.NewTicker(10 * time.Millisecond)
Contributor

Do we plan to make this value adjustable via external config? Even if not, this is a magic constant here; it would be better to move it to the const section above.

Contributor Author

Right now it doesn't make sense to expose this interval in the config because we don't have a strict model of how it affects performance (I reached the current value experimentally). Having a dedicated const is not bad, but it'll be used in a single place. I don't mind.

@cthulhu-rider cthulhu-rider force-pushed the feature/2453-new-bbcz branch 2 times, most recently from 29f76c4 to 6c50e19 Compare July 31, 2023 16:00
cthulhu-rider commented Jul 31, 2023

added all ops and made Peapod Blobstor-compatible

  • support in storage node app
  • implement util to migrate blobovnicza tree into peapod (incl. storage ID refactor in metabase)

@@ -0,0 +1,70 @@
package peapod
Member

why are the files named read and write? usually we use put and get, like the method names

Contributor Author

@cthulhu-rider cthulhu-rider Aug 1, 2023

I grouped methods that share similar code into one source file; IMO this is easier to maintain than per-method files. What do you think?

Member

as always, i am for unification, so i would like it to be the same for every common.Storage implementation

talking about naming: not that critical, but i still think that one method per file is preferable. the number of files is not that big, but more details are always welcome (once again, if there are not a lot of details, the number of methods is not going to exceed 20, i think)

Contributor Author

@cthulhu-rider cthulhu-rider Aug 8, 2023

then i'll need to add more files to hold shared reusable methods like batch, and we'll end up with a 1:1 file-to-func mapping. i'd prefer to group related code in a single source file, but i don't mind splitting (more minor changes taking time)

Member

Files should group sets of code logically. One file per function doesn't help much: you get too many files quickly, then duplicate imports in all of them, forget which of them has some constant defined, and keep switching between them in the editor. This is likely not the biggest problem we have, though.

Contributor Author

then let's leave the introduced file structure as it is (it's shorter and at least unites similar code); we'll always be able to change things later in the off season (i know we don't have one 😺). This wasn't a proposal to "do it like this everywhere from now on", just a new independent autonomous package

Member

@carpawell carpawell Aug 10, 2023

one file per function doesn't help much

i wasn't talking about func per file. i was more about a single approach: we had 7 "storage for object data" packages that did Put in a file named put.go, and now we are going to have a single pkg that does Put in a file named write.go. also, i am ok with both approaches, but we have a small number of "API" calls (not so many operations can be done with an object) and are not going to "get too many files quickly"

@cthulhu-rider cthulhu-rider force-pushed the feature/2453-new-bbcz branch 3 times, most recently from c0ea07b to 115c649 Compare August 2, 2023 12:16
@cthulhu-rider cthulhu-rider marked this pull request as ready for review August 2, 2023 12:16
return apistatus.ErrObjectNotFound
}

data = slice.Copy(val)
Member

i have already asked somewhere but did not get the answer: why do we keep importing that non-neo-go util from neo-go?

@roman-khimov

Member

Maybe because it's so nice and solves the problem?

Contributor Author

in upcoming Go upgrades we'll use bytes.Clone

Member

Maybe because it's so nice and solves the problem?

Sure it does, but I am really curious why we need to import it in this repo and why everybody is OK with it. It is extremely strange to me: a simple util that anyone can write in under a minute, yet we keep importing it. It is not related to neo-go at all.

Contributor Author

@cthulhu-rider cthulhu-rider Aug 8, 2023

so IIUC you suggest either duplicating this in neofs-node or writing 2 lines in place, or ..?

Contributor Author

@carpawell so you don't mind using this feature?

Member

it is not a problem for me, i just do not get it. it won't stop me from pressing "Approve"

Contributor Author

@cthulhu-rider cthulhu-rider Aug 11, 2023

it won't stop me from pressing "Approve"

i mean, when bytes.Clone (or even slices.Clone) becomes available with a newer Go version, will you still be against using those utilities?

https://cs.opensource.google/go/go/+/refs/tags/go1.21.0:src/bytes/bytes.go;drc=9768f736ea11165f10062401dec5509fdf1882ba;l=1365

Member

@carpawell carpawell Aug 11, 2023

will you still be against using those utilities?

nono, of course, no. "i need neo-go to copy bytes" sounds strange to me and that's it. using std for that purpose is just as natural as possible (when it is released)

Member

It's not that you need neo-go, it's that you already have it. Introducing it as a dependency just for this function would be inappropriate, but we're already dependent on it for different reasons, so reusing some functions is just a natural thing to do. And yes, it'll all be changed with newer Go.
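
For context, the whole thread is about taking an independent copy of a byte slice (bbolt documents that values returned inside a transaction are only valid for the life of that transaction). A small sketch of the options mentioned: `copyBytes` is a hypothetical hand-written helper, not the neo-go function, and `bytes.Clone` requires Go 1.20 or newer.

```go
package main

import (
	"bytes"
	"fmt"
)

// copyBytes returns an independent copy of b, written by hand.
func copyBytes(b []byte) []byte {
	if b == nil {
		return nil
	}
	res := make([]byte, len(b))
	copy(res, b)
	return res
}

func main() {
	// buf stands in for memory owned by someone else (e.g. a DB page).
	buf := []byte("object data")

	alias := buf[:6]               // shares memory with buf
	manual := copyBytes(buf[:6])   // independent copy
	cloned := bytes.Clone(buf[:6]) // same effect, standard library

	copy(buf, "XXXXXX") // the owner reuses the memory

	fmt.Printf("%s %s %s\n", alias, manual, cloned) // XXXXXX object object
}
```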

@cthulhu-rider cthulhu-rider force-pushed the feature/2453-new-bbcz branch 2 times, most recently from ad22084 to d9b45a5 Compare August 9, 2023 08:35
Member

@carpawell carpawell left a comment

Conflicts.

@cthulhu-rider cthulhu-rider force-pushed the feature/2453-new-bbcz branch 3 times, most recently from 0438519 to 4060955 Compare August 11, 2023 10:28
@cthulhu-rider cthulhu-rider force-pushed the feature/2453-new-bbcz branch 2 times, most recently from ec75f9d to 7d09562 Compare August 11, 2023 13:12
@cthulhu-rider cthulhu-rider force-pushed the feature/2453-new-bbcz branch 6 times, most recently from 1ac5ee0 to f4e2512 Compare August 11, 2023 15:36
Currently, the storage node saves relatively small NeoFS objects in the
Blobovnicza tree component: a group of BoltDB wrappers managed as a
tree. This component has a rather complex data structure and code
implementation, and dubious performance results.

Peapod is a new storage component introduced to replace the Blobovnicza
tree as a simpler and more efficient alternative. It is also based on
BoltDB, but uses a single instance and organizes batch writes in a
specific way. In the future, Peapod is going to be used by the BlobStor
as a storage for small objects.

Refs nspcc-dev#2453.

Signed-off-by: Leonard Lyubich <[email protected]>
@cthulhu-rider cthulhu-rider force-pushed the feature/2453-new-bbcz branch 2 times, most recently from 0b9556a to 9bba966 Compare August 15, 2023 10:17
Add `cmd/blobovnicza-to-peapod` application which accepts the YAML
configuration file of the storage node and, for each configured shard,
transfers data from the Blobovnicza tree to a Peapod created in the
parent directory.

The tool is going to be used for the phased and safe retirement of
Blobovnicza trees and the transition to Peapods.

Refs nspcc-dev#2453.

Signed-off-by: Leonard Lyubich <[email protected]>
Error wrapping is normal, so we should always be ready for it.

Signed-off-by: Leonard Lyubich <[email protected]>
Support `peapod` sub-storage type in BlobStor configuration.

Refs nspcc-dev#2453.

Signed-off-by: Leonard Lyubich <[email protected]>
There may be a need to tune the time interval between batch writes to
disk in the Peapod component.

Add a `flush_interval` key of type duration to the storage node's
config, defaulting to 10ms.

Signed-off-by: Leonard Lyubich <[email protected]>
@roman-khimov roman-khimov merged commit 3969928 into nspcc-dev:master Aug 15, 2023
7 of 8 checks passed
roman-khimov added a commit that referenced this pull request Sep 13, 2023
Unbreak peapods:
  2023/09/13 18:50:36 open shard S4wpuCnzpWW7SbwhER3U1v: could not open *blobstor.BlobStor: open substorage peapod: open BoltDB instance: open /storage/peapod0.db: is a directory

Call `stat()` gently instead of walking up. FS mount point has to exist there in
any event and we should have some access to it.

The real problem is that #2462 (introducing Peapod) was correct on its own.
And #2495 (introducing capacity) was also correct on its own. But they don't
work together.

Refs 7c54307.
Refs c060b16.

util.MkdirAllX will be removed from code in most cases.

Signed-off-by: Roman Khimov <[email protected]>
roman-khimov added a commit that referenced this pull request Sep 14, 2023