pageserver: assert that keys belong to shard #9943
7066 tests run: 6757 passed, 0 failed, 309 skipped (full report)
Flaky tests (7)
Postgres 17
Postgres 16
Postgres 15
Postgres 14
Code coverage* (full report)
* collected from Rust tests only
The comment gets automatically updated with the latest test results
c340914 at 2024-12-06T10:34:36.698Z |
This keeps timing out on the v17 debug without-lfc regression tests. I wouldn't expect the added assertions here to have a significant performance impact, but I'll look into it. |
Idk, this barely shows up on CPU profiles. On a successful run over on #10000, this job completed in 40 minutes. Here, it times out after an hour. Maybe the test fails in some other way, e.g. a node crash that manifests as a timeout. Having a look at the test artifacts. |
I'm not seeing any other issues except spurious timeouts, and I don't see how this change would be expensive enough to increase runtime from 40 to 60 minutes. Rerunning. |
I don't know why this wasn't included in any of the CI artifacts, but I found the issue with
The test just hangs at this point, since the compute is waiting for the pageserver. Let's see what this is. |
It's an FSM key:
Forking off #10027. |
We've seen cases where stray keys end up on the wrong shard. This shouldn't happen. Add debug assertions to prevent this. In release builds, we should be lenient in order to handle changing key ownership policies.
Touches #9914.
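
For illustration, the debug-only check described above could look roughly like the following. This is a minimal sketch, not the code in this PR: `ShardIdentity`, `is_key_local`, `check_key_ownership`, and the modulo-based ownership rule are all hypothetical stand-ins, and keys are simplified to plain `u64` values.

```rust
/// Hypothetical stand-in for a shard's identity: its index and the total shard count.
#[derive(Debug, Clone, Copy)]
struct ShardIdentity {
    number: u32,
    count: u32,
}

impl ShardIdentity {
    /// Assumed ownership policy, for illustration only: key modulo shard count.
    /// An unsharded tenant (count <= 1) owns every key.
    fn is_key_local(&self, key: u64) -> bool {
        self.count <= 1 || key % u64::from(self.count) == u64::from(self.number)
    }
}

/// Reports whether `key` belongs to `shard`.
/// Debug builds panic on a stray key via `debug_assert!`.
/// Release builds compile the assertion out and just return `false`,
/// leaving the caller free to handle the stray key leniently.
fn check_key_ownership(shard: &ShardIdentity, key: u64) -> bool {
    let local = shard.is_key_local(key);
    debug_assert!(local, "key {} does not belong to shard {:?}", key, shard);
    local
}
```

Because `debug_assert!` is only active when debug assertions are enabled, a stray key fails loudly in debug test runs, while a release build keeps running, which leaves room for changing key ownership policies.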