Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use read lock for compaction #4910

Closed

Conversation

skyzh
Copy link
Member

@skyzh skyzh commented Aug 4, 2023

Problem

Based on the immutable storage state refactor, this is the first optimization we can do.

ref #4766 #4754

Summary of changes

We do not actually need to hold the write lock when removing the file. As long as no thread will be able to read the layer file from the layer map, we can safely remove it without taking any locks. This is done by (1) having a lock that protects that, currently pseudo_lock (2) ensure all read threads are taking the read lock (3) if we want to ensure all threads use the new version of the layer map, try acquire the write lock, and if it succeeds, all read threads are blocked and will switch to the new version.

An optimal solution would be using watermark to track lowest use version when dropping the layer object, but it's still too far so we go with this simple approach for now.

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

skyzh added 3 commits August 4, 2023 14:50
Signed-off-by: Alex Chi Z <[email protected]>
Signed-off-by: Alex Chi Z <[email protected]>
@github-actions
Copy link

github-actions bot commented Aug 4, 2023

1040 tests run: 990 passed, 0 failed, 50 skipped (full report)


Flaky tests (1)

Postgres 15

  • test_ondemand_download_failure_to_replace[local_fs]: debug

@skyzh skyzh marked this pull request as ready for review August 4, 2023 19:19
@skyzh skyzh requested review from a team as code owners August 4, 2023 19:19
@skyzh skyzh requested review from awestover and LizardWizzard and removed request for a team August 4, 2023 19:19
@skyzh skyzh self-assigned this Aug 4, 2023
Copy link
Contributor

@awestover awestover left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems good.

Copy link
Contributor

@LizardWizzard LizardWizzard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasnt closely following the whole patch set, so I'm not familiar with previous changes in the stack.

The PR looks good to me, but I'd ask for second pair of eyes from somebody familiar with compaction code cc @koivunej, @bojanserafimov

.unwrap() // unwrap the first level error, which is always Ok.
.await??;
drop(self.pseudo_lock);
// acquire the write lock so that all read threads are blocked, and once this lock is acquired,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concept looks quite similar to rcu family primitives. Ref https://github.com/neondatabase/neon/blob/main/libs/utils/src/simple_rcu.rs

Not asking to use it, just for reference

@skyzh skyzh closed this Mar 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants