Add support for PS sharding in compute #5837
2184 tests run: 2099 passed, 0 failed, 85 skipped (full report)
Code coverage (full report)
The comment gets automatically updated with the latest test results: 8de5b63 at 2023-12-20T07:53:46.499Z
@knizhnik please can you resolve the merge conflicts.
Is there a single prefetch ring for all shards? The responses might come in different order across the pageservers, so I would assume there to be a separate prefetch ring for each shard. Connections to different pageservers will also be lost and re-established at different times.
prfh_hash *prf_hash;
int max_shard_no;
uint8 shard_bitmap[(MAX_SHARDS + 7)/8];
What is shard_bitmap?
We have a number of buffered prefetch requests for different shards. Before waiting for a response we need to flush all pending requests, and to do that we need to know which shards are affected. This is why I am using this bitmap: the corresponding bit is set when we send a request to a shard and cleared by the flush.
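The set-on-send, clear-on-flush bookkeeping described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `PrefetchStateSketch`, `mark_shard_pending`, `flush_pending_shards`, the `flush_count` array and the value of `MAX_SHARDS` are all assumptions made for the example, and the real flush of a pageserver connection is replaced by a counter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SHARDS 128          /* assumed value, for illustration only */

typedef struct
{
    int     max_shard_no;                       /* highest shard number seen so far */
    uint8_t shard_bitmap[(MAX_SHARDS + 7) / 8]; /* bit set = shard has unflushed requests */
    int     flush_count[MAX_SHARDS];            /* hypothetical stand-in for real flushes */
} PrefetchStateSketch;

static void
prefetch_state_init(PrefetchStateSketch *st)
{
    memset(st, 0, sizeof(*st));
}

/* Record that a request was buffered for shard_no. */
static void
mark_shard_pending(PrefetchStateSketch *st, int shard_no)
{
    assert(shard_no >= 0 && shard_no < MAX_SHARDS);
    st->shard_bitmap[shard_no >> 3] |= (uint8_t) (1 << (shard_no & 7));
    if (shard_no > st->max_shard_no)
        st->max_shard_no = shard_no;
}

static int
shard_is_pending(const PrefetchStateSketch *st, int shard_no)
{
    return (st->shard_bitmap[shard_no >> 3] >> (shard_no & 7)) & 1;
}

/* Flush every shard whose bit is set, clear the bit, return how many were flushed. */
static int
flush_pending_shards(PrefetchStateSketch *st)
{
    int flushed = 0;

    for (int shard_no = 0; shard_no <= st->max_shard_no; shard_no++)
    {
        if (!shard_is_pending(st, shard_no))
            continue;
        st->flush_count[shard_no]++;    /* real code would flush the connection here */
        st->shard_bitmap[shard_no >> 3] &= (uint8_t) ~(1 << (shard_no & 7));
        flushed++;
    }
    return flushed;
}
```

With this layout, buffering several requests for the same shard costs one flush, and shards that received nothing since the last flush are skipped.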
Is there a single prefetch ring for all shards? The responses might come in different order across the pageservers, so I would assume there to be a separate prefetch ring for each shard. Connections to different pageservers will also be lost and re-established at different times.
No, there is still a single prefetch ring for all shards, but each prefetch slot has a shard number. When we wait for a slot, we wait for a response from that particular shard. @jcsp said that we can have up to a hundred shards in the future. Having a separate ring for each shard would be very inefficient in that case.
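A rough sketch of that design: one ring, with each slot tagged by shard, and waiting on a slot consuming responses only from that slot's shard. Everything here is hypothetical (`PrefetchRingSketch`, `ring_enqueue`, `ring_receive_from_shard`, `ring_wait_for_slot`, `RING_SIZE`). It assumes responses arrive in request order within one shard but in arbitrary order across shards, and it simulates "reading one response" by marking the oldest outstanding slot of that shard as received.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8             /* assumed ring capacity, for illustration only */

typedef struct
{
    uint32_t blkno;
    int      shard_no;          /* shard this request was sent to */
    int      received;          /* nonzero once the response has arrived */
} PrefetchSlotSketch;

typedef struct
{
    PrefetchSlotSketch slots[RING_SIZE];
    uint64_t head;              /* index of the next slot to fill */
    uint64_t tail;              /* index of the oldest slot still in the ring */
} PrefetchRingSketch;

/* Add a request to the shared ring and return its ring index. */
static uint64_t
ring_enqueue(PrefetchRingSketch *ring, int shard_no, uint32_t blkno)
{
    PrefetchSlotSketch *slot;

    assert(ring->head - ring->tail < RING_SIZE);
    slot = &ring->slots[ring->head % RING_SIZE];
    slot->blkno = blkno;
    slot->shard_no = shard_no;
    slot->received = 0;
    return ring->head++;
}

/* Simulate one response from shard_no: it belongs to that shard's oldest
 * outstanding slot. Returns the slot index, or -1 if nothing is outstanding. */
static int64_t
ring_receive_from_shard(PrefetchRingSketch *ring, int shard_no)
{
    for (uint64_t i = ring->tail; i < ring->head; i++)
    {
        PrefetchSlotSketch *slot = &ring->slots[i % RING_SIZE];

        if (slot->shard_no == shard_no && !slot->received)
        {
            slot->received = 1;
            return (int64_t) i;
        }
    }
    return -1;
}

/* Wait for one slot by reading only from its shard; returns responses consumed. */
static int
ring_wait_for_slot(PrefetchRingSketch *ring, uint64_t idx)
{
    PrefetchSlotSketch *slot;
    int consumed = 0;

    assert(idx >= ring->tail && idx < ring->head);
    slot = &ring->slots[idx % RING_SIZE];
    while (!slot->received)
    {
        int64_t got = ring_receive_from_shard(ring, slot->shard_no);

        assert(got >= 0);
        consumed++;
    }
    return consumed;
}
```

Waiting on a slot for shard 0 leaves slots for other shards untouched, which is the property the reply relies on to avoid one ring per shard.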
@knizhnik found a couple more log lines that need to be made shard-aware:
(either in this PR or a followup)
Fixed
Co-authored-by: Sasha Krassovsky <[email protected]>
I don't have any open threads, but I'm not the right person to green tick this -- I think @hlinnaka is the right approver for this.
@hlinnaka, only you are left 🙏
This still isn't running CI:
The commit to tolerate empty pageserver strings is not making the test pass: it still hits an assertion:
Please use the test branch that I DM'd you earlier today to test your changes.
I failed to rebase this PR, so I had to replace it with a new one: #6205
refer #5508

replaces #5837

## Problem

This PR implements sharding support on the compute side. Relations are split into stripes, and `get_page` requests are redirected to the particular shard where the stripe is located. All other requests (such as getting the relation or database size) are always sent to shard 0.

## Summary of changes

Sharding support on the compute side includes three things:

1. Make it possible to specify, and change at runtime, connections to more than one page server
2. Send each `get_page` request to the particular shard (determined by a hash of the page key)
3. Support multiple servers in prefetch ring requests

## Checklist before requesting a review

- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

## Checklist before merging

- [ ] Do not forget to reformat commit message to not include the above checklist

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
Co-authored-by: John Spray <[email protected]>
Co-authored-by: Heikki Linnakangas <[email protected]>
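The routing rule from the commit message (stripes of a relation mapped to shards by a hash of the page key, everything else sent to shard 0) might look roughly like the sketch below. The struct `PageKeySketch`, the functions, the choice of hashed fields and the multiplicative hash are all placeholders for illustration; the hash actually used by compute and the pageserver is not shown in this conversation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical page key; field names are assumptions for this example. */
typedef struct
{
    uint32_t spc_oid;
    uint32_t db_oid;
    uint32_t rel_number;
    uint32_t fork_num;
    uint32_t blkno;
} PageKeySketch;

/* Placeholder mixing step (FNV-style); NOT the hash used by the real code. */
static uint32_t
mix32(uint32_t h, uint32_t v)
{
    h ^= v;
    h *= 16777619u;
    return h;
}

/* Hash the relation identity plus the stripe number, so every block in the
 * same stripe of the same relation produces the same value. */
static uint32_t
page_key_hash(const PageKeySketch *key, uint32_t stripe_size)
{
    uint32_t h = 2166136261u;

    h = mix32(h, key->spc_oid);
    h = mix32(h, key->db_oid);
    h = mix32(h, key->rel_number);
    h = mix32(h, key->fork_num);
    h = mix32(h, key->blkno / stripe_size);
    return h;
}

/* Shard that should serve a get_page request for this key. */
static uint32_t
get_page_shard(const PageKeySketch *key, uint32_t n_shards, uint32_t stripe_size)
{
    assert(n_shards > 0 && stripe_size > 0);
    return page_key_hash(key, stripe_size) % n_shards;
}

/* Relation-size and database-size requests always go to shard 0. */
static uint32_t
metadata_request_shard(void)
{
    return 0;
}
```

Hashing the stripe number rather than the raw block number keeps neighbouring blocks on one shard, so sequential reads within a stripe stay on a single connection.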
Problem
This is a preliminary implementation of #5508
Summary of changes
Compute now calculates a hash of the buffer tag and sends the request to the corresponding shard.
Open issues:
The hash is not consistent with how it is currently calculated in Rust.
I think we should change the Rust implementation to make it Postgres-compatible.
A separate question is how the block number should be hashed.
There was discussion about it: #5432 (comment)
So in any case the Rust and C hash implementations should be made consistent...
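One common way to keep two implementations in agreement is to hash a byte sequence with a fixed, documented layout instead of hashing in-memory structs. The sketch below illustrates that idea only: it serializes two fields in big-endian order and hashes the bytes with 32-bit FNV-1a, a technique swapped in here because it is easy to reproduce in any language. It is not the hash chosen in this PR or in the linked discussion, and `put_be32`, `fnv1a32` and `hash_rel_and_block` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write v into out[0..3] in big-endian order, independent of host endianness. */
static void
put_be32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t) (v >> 24);
    out[1] = (uint8_t) (v >> 16);
    out[2] = (uint8_t) (v >> 8);
    out[3] = (uint8_t) v;
}

/* 32-bit FNV-1a over a byte buffer. */
static uint32_t
fnv1a32(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;   /* FNV offset basis */

    for (size_t i = 0; i < len; i++)
    {
        h ^= data[i];
        h *= 16777619u;         /* FNV prime */
    }
    return h;
}

/* Hash a (relation number, block number) pair via a fixed 8-byte encoding. */
static uint32_t
hash_rel_and_block(uint32_t rel_number, uint32_t blkno)
{
    uint8_t buf[8];

    put_be32(buf, rel_number);
    put_be32(buf + 4, blkno);
    return fnv1a32(buf, sizeof(buf));
}
```

A second implementation in another language then only has to produce the same eight bytes and run the same loop, and a handful of shared test vectors can confirm that both sides agree.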