-
Notifications
You must be signed in to change notification settings - Fork 457
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make VirtualFile::{open, open_with_options, create,sync_all,with_file} async fn #5224
Conversation
1640 tests run: 1566 passed, 0 failed, 74 skipped (full report)Code coverage (full report)
The comment gets automatically updated with the latest test results
eb2dd71 at 2023-09-07T22:36:49.297Z :recycle: |
557544c
to
8fcbeb5
Compare
8fcbeb5
to
f64a2d7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was confused that even though this marks open_with_options
as async, it still uses the synchronous std::fs::OpenOptions::open() under the hood. But I guess that was intentional; all the read and write functions are similarly using synchronous functions under the hood. I gather the plan is to change the interface first, to force all the callers be async, and switch the implementation later. Please explain that in the commit message, so that future code historians won't be confused.
Yeah Christian requested it in #4743:
I will add an explanation tomorrow to the commit msg. |
Hmmm seems I'm running into async closure lifetime issues (they are well known):
Maybe one could use dyn futures, given that it is tests (and also, in there it's just done once for each file)? |
## Problem For #4743, we want to convert everything up to the actual I/O operations of `VirtualFile` to `async fn`. ## Summary of changes This PR is the last change in a series of changes to `VirtualFile`: #5189, #5190, #5195, #5203, and #5224. It does the last preparations before the I/O operations are actually made async. We are doing the following things: * First, we change the locks for the file descriptor cache to tokio's locks that support Send. This is important when one wants to hold locks across await points (which we want to do), otherwise the Future won't be Send. Also, one shouldn't generally block in async code as executors don't like that. * Due to the lock change, we now take an approach for the `VirtualFile` destructors similar to the one proposed by #5122 for the page cache, to use `try_write`. Similarly to the situation in the linked PR, one can make an argument that if we are in the destructor and the slot has not been reused yet, we are the only user accessing the slot due to owning the lock mutably. It is still possible that we are not obtaining the lock, but the only cause for that is the clock algorithm touching the slot, which should be quite an unlikely occurence. For the instance of `try_write` failing, we spawn an async task to destroy the lock. As just argued however, most of the time the code path where we spawn the task should not be visited. * Lastly, we split `with_file` into a macro part, and a function part that contains most of the logic. The function part returns a lock object, that the macro uses. The macro exists to perform the operation in a more compact fashion, saving code from putting the lock into a variable and then doing the operation while measuring the time to run it. We take the locks approach because Rust has no support for async closures. One can make normal closures return a future, but that approach gets into lifetime issues the moment you want to pass data to these closures via parameters that has a lifetime (captures work). For details, see [this](https://smallcultfollowing.com/babysteps/blog/2023/03/29/thoughts-on-async-closures/) and [this](https://users.rust-lang.org/t/function-that-takes-an-async-closure/61663) link. In #5224, we ran into a similar problem with the `test_files` function, and we ended up passing the path and the `OpenOptions` by-value instead of by-ref, at the expense of a few extra copies. This can be done as the data is cheaply copyable, and we are in test code. But here, we are not, and while `File::try_clone` exists, it [issues system calls internally](https://github.com/rust-lang/rust/blob/1e746d7741d44551e9378daf13b8797322aa0b74/library/std/src/os/fd/owned.rs#L94-L111). Also, it would allocate an entirely new file descriptor, something that the fd cache was built to prevent. * We change the `STORAGE_IO_TIME` metrics to support async. Part of #4743.
Problem
Once we use async file system APIs for
VirtualFile
, these functions will also need to be async fn.Summary of changes
Makes the functions
open, open_with_options, create,sync_all,with_file
ofVirtualFile
async fn, including all functions that call it. Like in the prior PRs, the actual I/O operations are not using async APIs yet, as per request in the #4743 epic.We switch towards not using
VirtualFile
in the par_fsync module, hopefully this is only temporary until we can actually do fully async I/O inVirtualFile
. This might cause us to exhaust fd limits in the tests, but it should only be an issue for the local developer as we have high ulimits in prod.This PR is a follow-up of #5189, #5190, #5195, and #5203. Part of #4743.
Checklist before requesting a review
Checklist before merging