Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(pageserver): consider partial compaction layer map in layer check #10044

Merged
merged 4 commits into from
Dec 19, 2024

Conversation

skyzh
Copy link
Member

@skyzh skyzh commented Dec 6, 2024

Problem

In #9897 we temporarily disabled the layer valid check because the current one only considers the end result of all compaction algorithms, but partial gc-compaction would temporarily produce an "invalid" layer map.

part of #9114

Summary of changes

Allow LSN splits to overlap in the slow path check. Currently, the valid check is only used in storage scrubber (background job) and during gc-compaction (without taking layer lock). Therefore, it's fine for such checks to be a little bit inefficient but more accurate.

@skyzh skyzh requested review from problame and arpad-m December 6, 2024 21:11
@skyzh skyzh marked this pull request as ready for review December 6, 2024 21:11
@skyzh skyzh requested a review from a team as a code owner December 6, 2024 21:11
Copy link

github-actions bot commented Dec 6, 2024

7095 tests run: 6798 passed, 0 failed, 297 skipped (full report)


Flaky tests (1)

Postgres 16

  • test_physical_replication_config_mismatch_max_locks_per_transaction: release-x86-64

Code coverage* (full report)

  • functions: 31.3% (8398 of 26866 functions)
  • lines: 48.0% (66671 of 138978 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
90c5649 at 2024-12-18T22:21:37.718Z :recycle:

Copy link
Contributor

@problame problame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we run this check also at Step 3, while we hold the layer map lock, after updating layer map in finish_gc_compaction?

(And ideally roll back the change, or panic the tenant?)

pageserver/src/tenant/checks.rs Outdated Show resolved Hide resolved
@skyzh skyzh requested a review from a team as a code owner December 16, 2024 18:35
@skyzh skyzh requested a review from kelvich December 16, 2024 18:35
@skyzh skyzh force-pushed the skyzh/layer-check-slow-path branch from 3a29119 to a16a91f Compare December 16, 2024 18:35
@skyzh
Copy link
Member Author

skyzh commented Dec 16, 2024

a16a91f adds a final check for the layer map before updating the compaction result back to the layer map.

@skyzh skyzh requested review from problame and removed request for kelvich and a team December 16, 2024 18:36
pageserver/src/tenant/checks.rs Outdated Show resolved Hide resolved
pageserver/src/tenant/checks.rs Outdated Show resolved Hide resolved
pageserver/src/tenant/checks.rs Outdated Show resolved Hide resolved
@skyzh skyzh force-pushed the skyzh/layer-check-slow-path branch from a16a91f to 90c5649 Compare December 18, 2024 21:23
@skyzh skyzh added this pull request to the merge queue Dec 19, 2024
Merged via the queue into main with commit b89e02f Dec 19, 2024
82 checks passed
@skyzh skyzh deleted the skyzh/layer-check-slow-path branch December 19, 2024 18:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants