-
Notifications
You must be signed in to change notification settings - Fork 463
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(walredo): use posix_spawn by moving close_fds() work to walredo C code #6574
feat(walredo): use posix_spawn by moving close_fds() work to walredo C code #6574
Conversation
9ad51a2
to
b524d37
Compare
… code The rust stdlib uses the efficient `posix_spawn` by default. However, before this PR, pageserver used `pre_exec()` in our `close_fds()` ext trait. This PR moves the work that `close_fds()` did to the walredo C code. I verified manually that we're now forking out the walredo process using `posix_spawn`. refs #6565
b524d37
to
36f3d87
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM -- if this gets us the performance boost without needing further changes, that's a huge win.
Not sure if we need to ping someone who compiles on a Mac to check if some of the syscall stuff needs #ifdef'ing, but that's secondary to getting this change into staging to see the latency change.
I manually tested compilation on my Mac. |
2376 tests run: 2263 passed, 0 failed, 113 skipped (full report)Flaky tests (8)Postgres 16
Postgres 15
Postgres 14Code coverage (full report)
The comment gets automatically updated with the latest test results
36f3d87 at 2024-02-01T20:22:59.365Z :recycle: |
if (err = close_range(3, ~0U, 0)) { | ||
ereport(FATAL, (errcode(ERRCODE_SYSTEM_ERROR), errmsg("seccomp: could not close files >= fd 3"))); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't require seccomp; let's do this even when built without seccomp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I (ab)used the HAVE_SECCOMP as a proxy for that we're on Linux.
Proximity to the seccomp code makes sense IMO because both (and only both) are the essential security building block for walredo.
I'd appreciate if you could provide a follow-up fix that
- adds a configure check for close_range, so we use the syscall wrapper if it's available in glibc, otherwise fall back to my wrapper & fail to compile if NR_close_range is not defined
- implements a "status report" walredo command as I suggested on Slack
Also, I'd like to suggest we add additional robustness checks to walredo, e.g., a message where PS can ask a freshly spawned walredo whether it thinks it correctly set up seccomp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I'd like to suggest we add additional robustness checks to walredo, e.g., a message where PS can ask a freshly spawned walredo whether it thinks it correctly set up seccomp.
Is that the "status report" command you meant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #6580 to track this
To speed this up further, the pageserver could keep a small pool of pre-spawned WAL redo processes. As soon as you use a WAL redo process on a specific tenant/timeline, it must be considered as "tainted" with that tenant, and mustn't be used to replay WAL belonging to any other tenant. But until that first use, there's nothing tenant-specific about them. |
Yeah, that's what I've been proposing in the slack thread, I'm working on it. |
The rust stdlib uses the efficient
posix_spawn
by default.However, before this PR, pageserver used
pre_exec()
in ourclose_fds()
ext trait.This PR moves the work that
close_fds()
did to the walredo C code.I verified manually using
gdb
that we're now forking out the walredo process usingposix_spawn
.refs #6565