storcon: automatically clear Pause/Stop scheduling policies to enable detaches #10011
Conversation
CI: 7051 tests run: 6736 passed, 0 failed, 315 skipped (full report). Flaky tests (8) across Postgres 17/16/15/14. Code coverage collected from Rust tests only (full report). This comment is automatically updated with the latest test results: 294a941 at 2024-12-06T14:45:06.167Z.
Force-pushed from 519c449 to d97286e
Code itself looks fine.
Is this the behaviour we want though? We lose the shard scheduling policy whenever cplane detaches (e.g. idle tenant detach). This is also confusing.
Would this alternative work?
When detaching or downgrading to secondary due to an external location conf, the storage controller proxies it to the pageserver irrespective of the shard policy. The pageserver handles it, and we preserve the policy on the storcon.
In common cases, yes... this is perhaps a symptom of the scheduling policies being a bit over-general as a concept. What we're using them for in practice is to use Pause to pin a tenant to a pageserver, and once we are detached, it usually makes sense to forget that pin.
This could be fragile: it's okay if the pageserver is available, or if the pageserver is marked offline (because when it becomes available it'll get trued up). But if the pageserver is unavailable and not yet marked offline, then it would prevent us from changing the configuration of a tenant. That's arguably tolerable, but it sort of breaks the model that we can change configs in the foreground and reconcile in the background. Then again, in the pure reconciliation-loop model, it's correct to simply not detach a tenant if someone sets its placement policy to detach it while its scheduling policy is paused. For me this just points to the scheduling policies being defined in a way that doesn't quite fit how we want to use them. Maybe it should be like an … I do generally agree that there's a bit of "code smell" here, but fudging the scheduling policy is probably a less bad smell than changing the config functions to do pageserver work in the foreground instead of going via reconciliation.
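The approach being settled on here can be sketched as follows. This is a hedged simplification, not the actual storage controller code: the enum and function names (`ShardSchedulingPolicy`, `PlacementPolicy`, `on_placement_change`) are illustrative, and the real `PlacementPolicy::Attached` carries more state.

```rust
// Hypothetical simplification of the concepts in this discussion;
// names mirror the ideas, not the real storage controller API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ShardSchedulingPolicy {
    Active,
    Pause, // used in practice to pin a tenant to a pageserver
    Stop,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum PlacementPolicy {
    Attached,
    Secondary,
    Detached,
}

/// When the placement policy changes to something non-attached, clear a
/// Pause/Stop pin so the detach can actually proceed via reconciliation.
fn on_placement_change(
    placement: PlacementPolicy,
    scheduling: ShardSchedulingPolicy,
) -> ShardSchedulingPolicy {
    match (placement, scheduling) {
        (
            PlacementPolicy::Detached | PlacementPolicy::Secondary,
            ShardSchedulingPolicy::Pause | ShardSchedulingPolicy::Stop,
        ) => ShardSchedulingPolicy::Active,
        (_, s) => s,
    }
}

fn main() {
    // A paused (pinned) tenant being detached has its pin cleared...
    assert_eq!(
        on_placement_change(PlacementPolicy::Detached, ShardSchedulingPolicy::Pause),
        ShardSchedulingPolicy::Active
    );
    // ...while a tenant that stays attached keeps its policy.
    assert_eq!(
        on_placement_change(PlacementPolicy::Attached, ShardSchedulingPolicy::Pause),
        ShardSchedulingPolicy::Pause
    );
}
```

The key property is that the "pin" is treated as meaningful only while the tenant is attached; once the placement no longer calls for attachment, keeping the pin would only block the detach.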
OK, that's fair. Could you please update the comment for the scheduling policies to mention when they get reset?
I don't see it. We update the db and in-memory state and will reconcile when the pageserver comes back online.
Added comment about usage of ShardSchedulingPolicy in 294a941
But the reconcile wouldn't do anything if the scheduling mode was still set to Pause (i.e. for that to work we'd also need the change in this PR), thinking specifically of the case of a brief unavailability where we don't do a whole-node reconcile.
Right, special-casing detach and downgrade-to-secondary was implied in my proposal.
I'm fine with this. I was trying to explore alternatives in the PR thread, but we don't have to block this PR.
Problem
We saw a tenant get stuck when it had been put into Pause scheduling mode to pin it to a pageserver, then it was left idle for a while and the control plane tried to detach it.
Closes #9957
Summary of changes
- Automatically clear Pause/Stop shard scheduling policies when a detach is requested, so that detaches are not blocked by a stale pin.
- Set generation_pageserver to null if the placement policy is not Attached (this enables consistency checks to work, and avoids leaving state in the DB that could be confusing/misleading in the future).
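The second change can be sketched roughly as below. This is a hedged illustration, assuming a row-like struct: `TenantShardRow`, `placement_is_attached`, and `normalize_row` are hypothetical names, not the real schema or code.

```rust
// Hypothetical sketch: when a tenant shard's placement policy is no longer
// Attached, clear generation_pageserver so stale attachment state does not
// linger in the database and confuse later consistency checks.
#[derive(Debug, PartialEq)]
struct TenantShardRow {
    placement_is_attached: bool,
    generation_pageserver: Option<i64>,
}

fn normalize_row(mut row: TenantShardRow) -> TenantShardRow {
    if !row.placement_is_attached {
        // Detached/secondary shards have no attached pageserver, so a
        // recorded generation_pageserver would be misleading.
        row.generation_pageserver = None;
    }
    row
}

fn main() {
    let detached = normalize_row(TenantShardRow {
        placement_is_attached: false,
        generation_pageserver: Some(42),
    });
    assert_eq!(detached.generation_pageserver, None);

    let attached = normalize_row(TenantShardRow {
        placement_is_attached: true,
        generation_pageserver: Some(42),
    });
    assert_eq!(attached.generation_pageserver, Some(42));
}
```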