Update FilePrefetchBuffer::Read to reuse file system buffer when possible #13118
@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@archang19 has updated the pull request. You must reimport the pull request before landing.
@@ -379,12 +389,21 @@ class FilePrefetchBuffer {
   void PrefetchAsyncCallback(FSReadRequest& req, void* cb_arg);

   void TEST_GetBufferOffsetandSize(
-      std::vector<std::pair<uint64_t, size_t>>& buffer_info) {
+      std::vector<std::tuple<uint64_t, size_t, bool>>& buffer_info) {
I added this third field to validate async prefetching, since otherwise I cannot distinguish between async_req_len and CurrentSize().
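To illustrate why a third field helps, here is a minimal sketch. ToyBuf, its members, and GetBufferInfo are hypothetical stand-ins, not the actual FilePrefetchBuffer internals, and the rule for choosing the reported size is an assumption. With only (offset, size), a test cannot tell whether the size is the length of an in-flight async request or the number of bytes already buffered; the bool makes that explicit.

```cpp
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the state of one prefetch buffer.
struct ToyBuf {
  uint64_t offset = 0;
  size_t current_size = 0;         // bytes already present in the buffer
  size_t async_req_len = 0;        // bytes requested by an in-flight async read
  bool async_in_progress = false;  // true while the async read is pending
};

// Returns one (offset, size, async_in_progress) tuple per buffer.
// Assumption: while an async read is pending, the reported size is the
// requested length; otherwise it is the number of bytes already buffered.
std::vector<std::tuple<uint64_t, size_t, bool>> GetBufferInfo(
    const std::vector<ToyBuf>& bufs) {
  std::vector<std::tuple<uint64_t, size_t, bool>> info;
  info.reserve(bufs.size());
  for (const ToyBuf& b : bufs) {
    const size_t size = b.async_in_progress ? b.async_req_len : b.current_size;
    info.emplace_back(b.offset, size, b.async_in_progress);
  }
  return info;
}
```

A test can then assert on std::get<2> to check that a given buffer is (or is not) waiting on an async read before interpreting the size.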
@anand1976 here is a summary of the changes since the last review, since the old comment thread seems to be hidden:
LGTM. Can you add an entry under unreleased_history/performance_improvements so the release history gets updated?
@archang19 merged this pull request in 26b4806.
Summary

This PR adds support for reusing the file system provided buffer to avoid an extra memcpy into RocksDB's buffer. This optimization has already been implemented for point lookups, as well as for compaction and scan reads when prefetching is disabled.

This PR extends the optimization to synchronous prefetching (num_buffers == 1). Asynchronous prefetching can be addressed in a future PR (and probably should be, to keep this PR from growing too large).

Remarks

- Uses overlap_buf_ (currently used in the async prefetching case) instead of defining a separate buffer. This was discussed in #13118 (comment).
- Uses MultiRead with a single request to take advantage of the file system buffer. This is consistent with previous work (e.g. #12266, "Provide support for FSBuffer for point lookups").
- DBIOCorruptionTest.IterReadCorruptionRetry, since those tests were failing before I addressed a bug in my code for this PR. Run with failed test.

Test Plan

I wrote thorough unit tests that cover synchronous prefetching with file system buffer reuse. The flows for partial hits, complete hits, and complete misses are tested. I also parametrized the test to make sure that async prefetching (without file system buffer reuse) still works as expected.

Once we agree on the changes, I will run a long stress test before merging.
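
The idea behind the change can be sketched with a toy model. Everything below (ToyReadRequest, ToyFS, ToyPrefetchBuffer, Hit) is hypothetical and only mirrors the shape of the technique; it is not the RocksDB API, and the hit classification is a simplified assumption. The sketch shows a single-request MultiRead where the prefetch buffer either supplies scratch space, which costs one extra memcpy inside the file system, or adopts the buffer the file system already filled.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical request: the FS either fills `scratch` or returns `fs_buf`.
struct ToyReadRequest {
  uint64_t offset = 0;
  size_t len = 0;
  char* scratch = nullptr;         // caller-owned destination, if any
  std::unique_ptr<char[]> fs_buf;  // FS-owned buffer handed to the caller
  size_t result_len = 0;
};

// Hypothetical in-memory file system that counts copies into scratch.
class ToyFS {
 public:
  explicit ToyFS(std::string data) : data_(std::move(data)) {}

  void MultiRead(std::vector<ToyReadRequest>& reqs, bool hand_over_buffer) {
    for (ToyReadRequest& r : reqs) {
      const size_t start = std::min<size_t>(r.offset, data_.size());
      r.result_len = std::min(r.len, data_.size() - start);
      // Stand-in for device I/O into a buffer the FS allocates itself.
      std::unique_ptr<char[]> internal(new char[r.result_len]);
      std::memcpy(internal.get(), data_.data() + start, r.result_len);
      if (hand_over_buffer) {
        r.fs_buf = std::move(internal);  // ownership transfer, no second copy
      } else {
        std::memcpy(r.scratch, internal.get(), r.result_len);
        ++scratch_copies_;  // the extra memcpy that buffer reuse avoids
      }
    }
  }

  int scratch_copies() const { return scratch_copies_; }

 private:
  std::string data_;
  int scratch_copies_ = 0;
};

enum class Hit { kComplete, kPartial, kMiss };

// Hypothetical single-buffer (synchronous) prefetcher.
class ToyPrefetchBuffer {
 public:
  void Prefetch(ToyFS& fs, uint64_t offset, size_t len, bool reuse_fs_buffer) {
    std::vector<ToyReadRequest> reqs(1);  // MultiRead with a single request
    ToyReadRequest& r = reqs[0];
    r.offset = offset;
    r.len = len;
    if (!reuse_fs_buffer) {
      buf_.reset(new char[len]);  // our own buffer serves as scratch
      r.scratch = buf_.get();
    }
    fs.MultiRead(reqs, reuse_fs_buffer);
    if (reuse_fs_buffer) {
      buf_ = std::move(r.fs_buf);  // adopt the FS buffer as-is
    }
    offset_ = offset;
    size_ = r.result_len;
  }

  // Classifies a request [offset, offset + len) against the buffered range.
  // Simplification: a request that starts before the buffer counts as a miss.
  Hit Classify(uint64_t offset, size_t len) const {
    const uint64_t end = offset_ + size_;
    if (offset < offset_ || offset >= end) return Hit::kMiss;
    return offset + len <= end ? Hit::kComplete : Hit::kPartial;
  }

  // Returns buffered bytes; only valid for complete hits.
  std::string Data(uint64_t offset, size_t len) const {
    return std::string(buf_.get() + (offset - offset_), len);
  }

 private:
  std::unique_ptr<char[]> buf_;
  uint64_t offset_ = 0;
  size_t size_ = 0;
};
```

In this model, calling Prefetch with reuse_fs_buffer set to false increments the copy counter once per read, while calling it with true leaves the counter unchanged and returns the same bytes. A partial hit is the case where the remaining bytes must be read and stitched together with the buffered prefix, which is the role the PR assigns to overlap_buf_.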