Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EphemeralFile::write_blob: behavior in case of errors is unspecified / broken / unretyable #5007

Open
problame opened this issue Aug 16, 2023 · 3 comments
Assignees
Labels
a/tech_debt Area: related to tech debt c/storage/pageserver Component: storage: pageserver

Comments

@problame
Copy link
Contributor

Here's my list of known problems:

  • what happens if we retry after an error?
    • think about the scenario where write_blob needs to write 3 pages.
    • there are 3 opportunities for failure
    • if we fail and bail on the second opportunity, then we've written to the file on disk, and the page cache contains the changes, but, at least size isn't updated
    • a subsequent read_blob will return half-written data (the unwritten bytes will be zero-filled)
    • a subsequent write_blob will overwrite the written data

I was toying with the idea of bumping size in the if self.off == PAGE_SZ.
But that makes the retry situation even worse, because now the subsequent write_blob doesn't overwrite, but instead it appends.

I propose to create an issue referencing this comment here, and fix it in a future PR.

Originally posted by @problame in #5004 (comment)

@problame problame added c/storage/pageserver Component: storage: pageserver a/tech_debt Area: related to tech debt labels Aug 16, 2023
@problame problame self-assigned this Aug 16, 2023
@problame
Copy link
Contributor Author

When fixing this, one approach might be to increment self.size eagerly.
When going this route, reconsider whether we need blknum and pos as function-global mutable state, vs just computing them every time using self.size.

If we can just re-compute it every time, consider eliminating the Writer type introduced by #5004

@arpad-m
Copy link
Member

arpad-m commented Aug 16, 2023

When going this route, reconsider whether we need blknum and pos as function-global mutable state, vs just computing them every time using self.size.

We might be able to compute it every time as LLVM makes just a & and a >> out of it: #5004 (comment)

@problame
Copy link
Contributor Author

Also (copy-pasting from #4994 (comment) )


Regarding the

std::io::Error::new(
                    ErrorKind::Other,
                    format!(

The current callers just convert into anyhow::Error anyway, but, I don't want to change write_blob and read_blob to anyhow::Result to avoid conflicts with @arpad-m 's #5015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a/tech_debt Area: related to tech debt c/storage/pageserver Component: storage: pageserver
Projects
None yet
Development

No branches or pull requests

2 participants