-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
irmin-pack: modify auto flush callback behavior #2088
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2088 +/- ##
==========================================
+ Coverage 64.73% 64.76% +0.03%
==========================================
Files 131 132 +1
Lines 15658 15658
==========================================
+ Hits 10136 10141 +5
+ Misses 5522 5517 -5
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Your PR changes the expected behavior which is that the dict should be flushed before the suffix. https://github.com/mirage/irmin/blob/main/src/irmin-pack/unix/file_manager.ml#L179 All these file synchronisation procedures are very sensitive. This is the reason why I've gathered them in the file_manager so that the reader of the code doesn't have to guess which callback mutates what. A way forward could be to rename |
419ec01
to
fef11e7
Compare
@Ngoguey42 ah, thanks for that insight. It gave me an idea for how to refine the flushing behavior to (I think) clarify intentions of different ways append only files are used. |
type flush_callback = | ||
[ `Auto of (unit, Io.write_error) result -> unit | `Manual of t -> unit ] | ||
(** [flush_callback] is the callback when a flush threshold has been reached. | ||
|
||
- Use [`Auto] callbacks if you only care about the result of a flush. The | ||
buffer is flushed automatically before calling the callback with the | ||
result. | ||
- Use [`Manual] callbacks if you need more control over when the flush | ||
occurs, for instance if you want to coordinate with flushing other | ||
files. The buffer is not automatically flushed before calling the | ||
callback but flush is expected to be called by the callback. *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need a callback for Auto
? Couldn't the error just be raised by append_only.ml
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original motivation for this PR was my previous comment to not require flush to be called in "auto" flush callbacks. The naming doesn't make sense to me but perhaps it is just a historical remnant (looking at Io_legacy
and config) that we deal with. I guess we can keep everything manually automatic (automatically manual?).
This is an attempt to clarify the two ways the flush threshold callback (maybe needs a better name) are used in the code, but happy to close this PR and keep code as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that solving this conundrum requires reserving the word auto
to the auto/explicit distinction.
What about this?
type auto_flush_procedure = [ `Internal | `External of t -> unit ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or
type auto_flush_procedure = [ `Default | `Custom of t -> unit ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I lean towards Internal/External
. Internal
still needs a function associated if we want to raise exceptions on error.
The auto prefix here is still weird/confusing to me, but I think I understand more where you are coming from now. It is more of a threshold_flush
than an auto_flush
in my mind but threshold_flush_procedure
is quite the name 🙃.
(as an aside, and of no real importance, the typical english distinction is auto/manual or implicit/explicit)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a misunderstanding somewhere 🤔
On main, auto flush
refers to flushes automatically triggered by the file stack code because a write buffer is full while flush
(no prefix) is a call to File_manager.flush
at the end of batch.
Also, wouldn't it be possible for append_only
to call Errs.raise_if_error
in the Auto aka Internal aka Default case? This was there is no need to pass a callback to raise the error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make it completely auto/internal/default/magic if we add Errs
to the functor.
Re: auto, this seems more of a concern of file manager than append only but I'm fine to leave it. ☑️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took another pass at this based on our discussion.
fef11e7
to
5c43689
Compare
Add support for two types of flush behavior when the flush threshold is reached: automatic internal flushes and manually controlled external flushes.
5c43689
to
3b63db4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
Instead of expecting the caller to flush, it now sends the results of a flush to the callback.Part of #2039This PR updates the behavior of flushing for append only files by letting an initializer to choose between automatic flushing and manually controlled flushing.