-
Notifications
You must be signed in to change notification settings - Fork 456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: improve the serde impl for several types(Lsn
, TenantId
, TimelineId
...)
#5335
feat: improve the serde impl for several types(Lsn
, TenantId
, TimelineId
...)
#5335
Conversation
Lsn
, TenantId
, TimelineId
...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this!
I would like to remind about the bincode examples. These are actually quite tricky to find it seems, because we have a module bin_ser.rs
with trait BeSer
to provide the actual helpers, or one the helper ser
which is used.
Perhaps the timelinemetadata could have a new test case which just constructs a new TimelineMetadataV2 using non-zero Lsns (could be like 0x12345678
) and serializes it to bytes, and then deserializes equal value out of it? In pageserver/src/tenant/metadata.rs
.
Um, I am a bit confused. |
I think I figured out why we need In addition, I added the custom serde implementation for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to post these suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now this PR changes the binary serialization of Id
, TimelineId
and TenantId
. This is because the custom visitor implementation works differently than how serde derived worked (with named newtype structs). This comes up as test failures, postgres hangs.
The tests written right now are written to support the current implementation and not the implementation which existed before the visitor implementation. The version from main
needs to be used to form those expected_bytes
in order to make sure the bytes do not change in the process.
A one possible path forwards to getting this PR to completion is to split the commits into:
- tests
- non-humanreadable (bincode) tests are good as is, expected bytes are just different than what the main produces
- humanreadable tests will need to go through a wrapper which does the same as
serde_with::DisplayFromStr
(which is very close to what the current implementation does)
- added code
The tests must first pass without serialization additions, and then we can switch to this implementation.
Hmm, I think there is small problem here. neon/safekeeper/src/json_ctrl.rs Lines 34 to 51 in b91ac67
What do you think? @koivunej |
This was the original goal, and I think it is still worth while in hopes of getting less duplication in the codebase and less proc macro invocations.
Correct. I was earlier this week trying to understand the problem in test_sync_safekeepers, runnable as:
I've pushed my WIP branch at https://github.com/neondatabase/neon/tree/joonas/improve-serde-lsn-id. It has a workaround for the Interesting, I must've fumbled something else then in my branch. Apologies. I'll push the above mentioned commits and c4bb139 to your branch and run all tests. |
2358 tests run: 2243 passed, 0 failed, 115 skipped (full report)Code coverage (full report)
The comment gets automatically updated with the latest test results
cdbf341 at 2023-11-04T22:40:35.581Z :recycle: |
I suspect a safekeeper control file is being serialized differently. It might be a good idea to add a new serialization test case (from safekeeper logs):
I figured out the control file issue from compute logs:
And I originally figured out something was wrong with compute because the test failure:
|
@koivunej I done the testing of Hence, I would like you to re-review the PR, specifically the above commit. |
@koivunej |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for this PR, it was an unnecessary long journey because of me being busy with other work.
…5960) Precursor for #5957 ## Problem When DeletionList was written, TenantId/TimelineId didn't have human-friendly modes in their serde. #5335 added those, such that the helpers used in serialization of HashMaps are no longer necessary. ## Summary of changes - Add a unit test to ensure that this change isn't changing anything about the serialized form - Remove the serialization helpers for maps of Id
Once upon a time, we used to have duplicated types for runtime IndexPart and whatever we stored. Because of the serde fixes in #5335 we have no need for duplicated IndexPart type anymore, but the `IndexLayerMetadata` stayed. - remove the type - remove LayerFileMetadata::file_size() in favor of direct field access Split off from #7833. Cc: #3072.
…lease (#7693) ## Problem Currently we serialize the `TimelineMetadata` into bytes to put it into `index_part.json`. This `Vec<u8>` (hopefully `[u8; 512]`) representation was chosen because of problems serializing TimelineId and Lsn between different serializers (bincode, json). After #5335, the serialization of those types became serialization format aware or format agnostic. We've removed the pageserver local `metadata` file writing in #6769. ## Summary of changes Allow switching from the current serialization format to plain JSON for the legacy TimelineMetadata format in the future by adding a competitive serialization method to the current one (`crate::tenant::metadata::modern_serde`), which accepts both old bytes and new plain JSON. The benefits of this are that dumping the index_part.json with pretty printing no longer produces more than 500 lines of output, but after enabling it produces lines only proportional to the layer count, like: ```json { "version": ???, "layer_metadata": { ... }, "disk_consistent_lsn": "0/15FD5D8", "legacy_metadata": { "disk_consistent_lsn": "0/15FD5D8", "prev_record_lsn": "0/15FD5A0", "ancestor_timeline": null, "ancestor_lsn": "0/0", "latest_gc_cutoff_lsn": "0/149FD18", "initdb_lsn": "0/149FD18", "pg_version": 15 } } ``` In the future, I propose we completely stop using this legacy metadata type and wasting time trying to come up with another version numbering scheme in addition to the informative-only one already found in `index_part.json`, and go ahead with storing metadata or feature flags on the `index_part.json` itself. #7699 is the "one release after" changes which starts to produce metadata in the index_part.json as json.
Improve the serde impl for several types (
Lsn
,TenantId
,TimelineId
) by making them sensitive toSerializer::is_human_readadable
(true for json, false for bincode).Fixes #3511 by:
Lsn
Id
serde_as_u64
inlibs/utils/src/lsn.rs
#[serde_as(as = "DisplayFromStr")]
in all possible structsAdditionally some safekeeper types gained serde tests.