Commit
tokio-epoll-uring: use it on the layer-creating code paths (#6378)
part of #6663
See that epic for more context & related commits.

Problem
-------

Before this PR, the layer-file-creating code paths were using VirtualFile, but under the hood these were still blocking system calls.

Generally this meant we'd stall the executor thread, unless the caller "knew" and used the following pattern instead:

```
spawn_blocking(|| {
    Handle::block_on(async {
        VirtualFile::....().await;
    })
}).await
```

Solution
--------

This PR adopts `tokio-epoll-uring` on the layer-file-creating code paths in pageserver.

Note that on-demand downloads still use `tokio::fs`; these will be converted in a future PR.

Design: Avoiding Regressions With `std-fs`
------------------------------------------

If we make the VirtualFile write path truly async using `tokio-epoll-uring`, should we then remove the `spawn_blocking` + `Handle::block_on` usage upstack in the same commit?

No, because if we’re still using the `std-fs` io engine, we’d then block the executor in those places where previously we were protecting ourselves from that through the `spawn_blocking`.

So, if we want to see benefits from `tokio-epoll-uring` on the write path while also preserving the ability to switch between `tokio-epoll-uring` and `std-fs`, where `std-fs` will behave identically to what we have now, we need to ***conditionally* use `spawn_blocking + Handle::block_on`**.

I.e., in the places where we use that now, we’ll need to make that conditional based on the currently configured io engine.

It boils down to investigating all the places where we do `spawn_blocking(... block_on(... VirtualFile::...))`.

Detailed [write-up of that investigation in Notion](https://neondatabase.notion.site/Surveying-VirtualFile-write-path-usage-wrt-tokio-epoll-uring-integration-spawn_blocking-Handle-bl-5dc2270dbb764db7b2e60803f375e015?pvs=4), made publicly accessible.

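To illustrate what "conditional based on the currently configured io engine" means, here is a minimal std-only sketch. It is not the pageserver code: `IoEngineKind` and `maybe_offload` are hypothetical names, and `std::thread::spawn` is swapped in for tokio's `spawn_blocking` + `Handle::block_on` so that the example runs without external crates. The point is only the dispatch: offload when the engine would block, run in place when it would not.

```rust
use std::thread;

/// Hypothetical stand-in for the configured io engine (names are assumptions).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum IoEngineKind {
    /// Blocking system calls: the work must be moved off the calling thread.
    StdFs,
    /// Truly async IO: the work can run where the caller is.
    TokioEpollUring,
}

/// Run `work` either on a helper thread or in place, depending on the engine.
/// The helper thread is an analogue of `spawn_blocking`, not the real thing.
fn maybe_offload<T, F>(engine: IoEngineKind, work: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    match engine {
        // With the blocking engine, keep the old protective behavior.
        IoEngineKind::StdFs => thread::spawn(work)
            .join()
            .expect("offloaded work panicked"),
        // With the async engine, the extra hop is unnecessary.
        IoEngineKind::TokioEpollUring => work(),
    }
}

fn main() {
    let caller = thread::current().id();
    for engine in [IoEngineKind::StdFs, IoEngineKind::TokioEpollUring] {
        // The closure reports which thread it actually ran on.
        let ran_on = maybe_offload(engine, || thread::current().id());
        println!("{:?}: ran on caller thread = {}", engine, ran_on == caller);
    }
}
```

Keeping the decision in one helper means each call site stays the same regardless of the engine, which is what allows switching engines without changing `std-fs` behavior.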

tl;dr: Preceding PRs addressed the relevant call sites:

- `metadata` file: turns out we could simply remove it (#6777, #6769, #6775)
- `create_delta_layer()`: made sensitive to `virtual_file_io_engine` in #6986

NB: once we are switched over to `tokio-epoll-uring` everywhere in production, we can deprecate `std-fs`; to keep macOS support, we can use `tokio::fs` instead. That will remove this whole headache.

Code Changes In This PR
-----------------------

- VirtualFile API changes
  - `VirtualFile::write_at`
    - implement an `ioengine` operation and switch `VirtualFile::write_at` to it
  - `VirtualFile::metadata()`
    - curiously, we only use it from the layer writers' `finish()` methods
    - introduce a wrapper `Metadata` enum because `std::fs::Metadata` cannot be constructed by code outside rust std
- `VirtualFile::sync_all()` and, for completeness' sake, add `VirtualFile::sync_data()`

Testing & Rollout
-----------------

Before merging this PR, we ran the CI with both io engines.

Additionally, the changes will soak in staging.

We could have a feature gate / add a new io engine `tokio-epoll-uring-write-path` to do a gradual rollout. However, that's not part of this PR.

Future Work
-----------

There's still some use of `std::fs` and/or `tokio::fs` for directory namespace operations, e.g. `std::fs::rename`.

We're not addressing those in this PR, as we'll need to add the support in tokio-epoll-uring first. Note that rename itself is usually fast if the directory is in the kernel dentry cache, and only the fsync after rename is slow. These fsyncs are using tokio-epoll-uring, so the impact should be small.
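
The reason for the `Metadata` wrapper enum can be shown with a small std-only sketch. The variant names and the `Raw` shape below are assumptions for illustration, not the definitions from this PR: one variant carries a real `std::fs::Metadata`, the other carries fields obtained some other way (for example from an io engine that returns raw stat data), and callers only see the accessor.

```rust
use std::fs;
use std::io::Write;

/// Hypothetical wrapper: `std::fs::Metadata` has no public constructor, so a
/// second variant holds the fields when they do not come from std.
enum Metadata {
    /// Obtained through the standard library.
    StdFs(fs::Metadata),
    /// Assumed shape for metadata produced by another io engine.
    Raw { len: u64 },
}

impl Metadata {
    /// File size in bytes, regardless of where the metadata came from.
    fn len(&self) -> u64 {
        match self {
            Metadata::StdFs(m) => m.len(),
            Metadata::Raw { len } => *len,
        }
    }
}

impl From<fs::Metadata> for Metadata {
    fn from(m: fs::Metadata) -> Self {
        Metadata::StdFs(m)
    }
}

fn main() -> std::io::Result<()> {
    // Write a small scratch file so there is real metadata to wrap.
    let path = std::env::temp_dir().join(format!("metadata_sketch_{}", std::process::id()));
    let mut file = fs::File::create(&path)?;
    file.write_all(b"hello")?;
    file.sync_all()?;

    let from_std = Metadata::from(file.metadata()?);
    let from_raw = Metadata::Raw { len: 5 };
    println!("std: {} bytes, raw: {} bytes", from_std.len(), from_raw.len());

    fs::remove_file(&path)?;
    Ok(())
}
```

Since the layer writers' `finish()` methods are the only users of `metadata()`, a wrapper exposing just the accessors they need keeps the surface small.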
Showing 6 changed files with 246 additions and 78 deletions.
3da410c
2561 tests run: 2427 passed, 1 failed, 133 skipped (full report)
Failures on Postgres 14
- test_basebackup_with_high_slru_count[github-actions-selfhosted-sequential-10-13-30]: release

Code coverage* (full report)
- functions: 28.8% (6958 of 24199 functions)
- lines: 47.2% (42603 of 90201 lines)

* collected from Rust tests only
3da410c at 2024-03-05T10:02:08.423Z :recycle: