
storcon: automatically clear Pause/Stop scheduling policies to enable detaches #10011

Merged

5 commits merged into main from jcsp/issue-9957-detach-paused on Dec 7, 2024

Conversation

jcsp
Collaborator

@jcsp jcsp commented Dec 4, 2024

Problem

We saw a tenant get stuck: it had been put into Pause scheduling mode to pin it to a pageserver, then it was left idle for a while, and the control plane tried to detach it.

Close: #9957

Summary of changes

  • When changing policy to Detached or Secondary, set the scheduling policy to Active.
  • Add a test that exercises this
  • When persisting tenant shards, set their generation_pageserver to null if the placement policy is not Attached (this enables consistency checks to work, and avoids leaving state in the DB that could be confusing/misleading in future)
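The first and third bullets could be sketched roughly as follows. This is a minimal illustration with invented types (`TenantShard`, `set_placement`, and the enum variants here are simplified stand-ins, not the actual storage controller structures):

```rust
// Hedged sketch of the rule described above: moving a shard to Detached or
// Secondary clears any Pause/Stop scheduling policy, and non-attached shards
// persist no attachment generation. Names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PlacementPolicy {
    Attached,
    Secondary,
    Detached,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum ShardSchedulingPolicy {
    Active,
    Pause,
    Stop,
}

struct TenantShard {
    placement: PlacementPolicy,
    scheduling: ShardSchedulingPolicy,
    // In the DB this would be the generation_pageserver column.
    generation_pageserver: Option<u64>,
}

impl TenantShard {
    fn set_placement(&mut self, new: PlacementPolicy) {
        if matches!(new, PlacementPolicy::Detached | PlacementPolicy::Secondary) {
            // Clear Pause/Stop so the detach/downgrade can actually reconcile.
            self.scheduling = ShardSchedulingPolicy::Active;
            // A non-attached shard has no attachment generation; persisting
            // null keeps consistency checks meaningful.
            self.generation_pageserver = None;
        }
        self.placement = new;
    }
}
```

With this shape, a shard pinned via Pause no longer blocks a detach: the policy change itself resets scheduling to Active before reconciliation runs.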

@jcsp jcsp added t/bug Issue Type: Bug c/storage/controller Component: Storage Controller labels Dec 4, 2024
@jcsp jcsp requested a review from VladLazar December 4, 2024 14:34
@jcsp jcsp marked this pull request as ready for review December 4, 2024 14:34
@jcsp jcsp requested a review from a team as a code owner December 4, 2024 14:34

github-actions bot commented Dec 4, 2024

7051 tests run: 6736 passed, 0 failed, 315 skipped (full report)

Flaky tests (8) across Postgres 14, 15, 16, and 17
Code coverage* (full report)

  • functions: 31.4% (8329 of 26510 functions)
  • lines: 47.8% (65507 of 137138 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results.
294a941 at 2024-12-06T14:45:06.167Z

@jcsp jcsp force-pushed the jcsp/issue-9957-detach-paused branch from 519c449 to d97286e on December 4, 2024 18:42
Contributor

@VladLazar VladLazar left a comment


Code itself looks fine.


Is this the behaviour we want though? We lose the shard scheduling policy whenever cplane detaches (e.g. idle tenant detach). This is also confusing.

Would this alternative work?
When detaching or downgrading to secondary due to an external location conf, the storage controller proxies it to the pageserver irrespective of the shard policy. The pageserver handles it and we preserve the policy on the storcon.

@jcsp
Collaborator Author

jcsp commented Dec 5, 2024

> Is this the behaviour we want though? We lose the shard scheduling policy whenever cplane detaches (e.g. idle tenant detach). This is also confusing.

In common cases, yes... this is perhaps a symptom of the scheduling policies being a bit over-general as a concept. What we're using them for in practice is to use Pause to pin a tenant to a pageserver, and once we are detached, it usually makes sense to forget that pin.

> When detaching or downgrading to secondary due to external location conf, the storage controller proxies it to the pageserver irrespective of the shard policy. The pageserver handles it and we preserve the policy on the storcon.

This could be fragile: it's okay if the pageserver is available, or if the pageserver is marked offline (because when it becomes available it'll get true'd up), but if the pageserver is unavailable but not yet marked offline, then it would prevent us changing the configuration of a tenant. That's arguably tolerable, but it sort of breaks the model that we can change configs in the foreground and reconcile in the background.

Then again, in the pure reconciliation loop model, it's correct to simply not detach a tenant if someone sets its placement policy to detach it but its scheduling policy is paused. For me this just points to the scheduling policies being defined in a way that doesn't quite fit how we want to use them. Maybe it should be a `pin: Option<NodeId>` instead, and then it intuitively makes sense that it gets cleared when you detach.

I do generally agree that there's a bit of "code smell" here... but fudging the scheduling policy is probably a less bad smell than changing the config functions to do pageserver work in the foreground instead of going via reconciliation.
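The `pin: Option<NodeId>` alternative floated in this comment could look roughly like this (a hypothetical sketch with invented names, not the actual storcon types):

```rust
// Hypothetical alternative: model the pin directly rather than reusing the
// scheduling policy for pinning. Names here are illustrative only.
type NodeId = u64;

#[derive(Debug, Clone, Copy, PartialEq)]
enum PlacementPolicy {
    Attached,
    Secondary,
    Detached,
}

struct TenantShard {
    placement: PlacementPolicy,
    // Some(node) means "keep this shard on `node`".
    pin: Option<NodeId>,
}

impl TenantShard {
    fn set_placement(&mut self, new: PlacementPolicy) {
        if new == PlacementPolicy::Detached {
            // A detached shard has no pageserver to be pinned to, so the
            // pin intuitively disappears instead of blocking the detach.
            self.pin = None;
        }
        self.placement = new;
    }
}
```

Under this model there is no separate scheduling-policy state to fudge: clearing the pin on detach is the natural semantics rather than a special case.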

@VladLazar
Contributor

> > Is this the behaviour we want though? We lose the shard scheduling policy whenever cplane detaches (e.g. idle tenant detach). This is also confusing.

> In common cases, yes... this is perhaps a symptom of the scheduling policies being a bit over-general as a concept. What we're using them for in practice is to use Pause to pin a tenant to a pageserver, and once we are detached, it usually makes sense to forget that pin.

Okay, that's fair. Could you please update the comment for the scheduling policies to mention when they get reset?

> but if the pageserver is unavailable but not yet marked offline, then it would prevent us changing the configuration of a tenant

I don't see it. We update the db and in-memory state and will reconcile when the pageserver comes back online.

@jcsp
Collaborator Author

jcsp commented Dec 6, 2024

Added comment about usage of ShardSchedulingPolicy in 294a941

> > but if the pageserver is unavailable but not yet marked offline, then it would prevent us changing the configuration of a tenant

> I don't see it. We update the db and in-memory state and will reconcile when the pageserver comes back online.

But the reconcile wouldn't do anything if scheduling mode was still set to Pause (i.e. for that to work we'd have to also have the change in this PR), thinking specifically of the case of a brief unavailability where we don't do a whole-node reconcile.

@VladLazar
Contributor

> Added comment about usage of ShardSchedulingPolicy in 294a941

> > > but if the pageserver is unavailable but not yet marked offline, then it would prevent us changing the configuration of a tenant

> > I don't see it. We update the db and in-memory state and will reconcile when the pageserver comes back online.

> But the reconcile wouldn't do anything if scheduling mode was still set to Pause (i.e. for that to work we'd have to also have the change in this PR), thinking specifically of the case of a brief unavailability where we don't do a whole-node reconcile.

Right, special-casing detach and downgrade to secondary was implied in my proposal.

Contributor

@VladLazar VladLazar left a comment


I'm fine with this. I was trying to explore alternatives in the PR thread, but we don't have to block this PR.

@jcsp jcsp added this pull request to the merge queue Dec 7, 2024
Merged via the queue into main with commit ec79087 Dec 7, 2024
82 checks passed
@jcsp jcsp deleted the jcsp/issue-9957-detach-paused branch December 7, 2024 13:07
Labels: c/storage/controller (Component: Storage Controller), t/bug (Issue Type: Bug)

Closes: storcon: requests to detach should automatically set scheduling policy to Active