Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
irmin-pack: introduce chunked suffix abstraction #2115
irmin-pack: introduce chunked suffix abstraction #2115
Changes from all commits
afdb743
6ace0bd
47f95af
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is wrong - the
end_poff
is used upstreamed as the end poff of the whole suffix abstraction, but here it returns only the end poff of the last chunk. I should be the sum over all chunks.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for bringing this up. I think this is worth some discussion.
The way I understand how the suffix's
poff
is currently used in the control file is as a control mechanism to know/verify what data we have written to disk (as is seen in the append only file's consistency check). In this sense, the current code is correct. It tracks the physical offset of the appendable chunk for consistency checks. No other chunk can be changed so for other chunks theirend_poff
is necessarily equivalent to theirIo.size_of_path
(the awkward code inAo_chunk.open_ro
that does exactly this).Summing the values would give the chunked suffix's length and final offset but doesn't really correlate with a physical offset
While working on this I intentionally wanted to keep the existing code working with minimal extra changes but I do think we need to change this. Here is my proposal for a future PR:
suffix_consistency_poff
(or something else -- open to ideas!)end_poff
in chunked file toconsistency_poff
end_poff
from append only'sapiopen api (it would still be an available property)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dispatcher needs to know the end offset for the suffix file in order to verify that a read is within bounds. This does not correspond to a physical offset, indeed. It's "a virtual offset corresponding to the physical end offset, if the suffix was a single file". I'll call it
suffix_end_off
.We can either (I am reiterating what you say):
end_poff
as is, keep the control file as is (tracking theend_poff
of the last chunk), but then we need a function to compute thesuffix_end_off
- this solution works for me, I do find the concept ofsuffix_end_off
a bit weird.end_poff
in the chunked file and the control file with thesuffix_end_off
- I'm not sure if this is what you are proposing for a future PR?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I just looked in the dispatcher and see its usage of
Suffix.end_poff
-- this does need to be updated to be the suffix end offset. Before chunks, this makes sense as anend_poff
, but there is no "end poff" for a chunked suffix. I don't necessarily love adding another virutalized set of offsets, but the suffix's offset range will necessarily be virtualized for chunks since it doesn't map directly to a single range of physical offsets any longer. And it is neatly contained withinChunked_suffix
which helps reduce the scope of offsets to consider.Thanks for pointing out dispatcher's use of the end offset for the suffix. I think the following proposal addresses everything:
end_off
toChunked_suffix
to represent the last offset of the suffix. This can be a sum of the chunks poffs but I don't think tracking it in the control file is needed.Chunked_suffix.end_poff
toconsistency_poff
(or something else -- but I think "end poff" doesn't make sense in the context of a chunked suffix) and also the control file field to reflect the name change. This is the offset we need to track in the control file for consistency checks since we just need to check the appendable chunk's poff upon open. As a part of this, also move the consistency check from append only to chunked suffix (like mentioned previously).I will make the first change in this PR since you rightly pointed out a correctness issue.
The second one can be a future PR since it is mostly a cosmetic name change, but it will also get rid of some of the awkward
end_poff
code (like inAo_chunk.open_ro
) which will be nice. Edit: on second glance, I see that we setlet persisted_end_poff = end_poff
directly in append only so perhaps this needs more consideration (immediate thought is to useIo
for the non-appendable chunks).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it does not need to be in this PR necessarily, as there is nothing that breaks here. I included a proposal for this in #2118, you can either review it there or add your own here, as you wish.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, I missed that. I'll just review in your PR since you have already made the change. 👍