
Add support for PS sharding in compute #5837

Closed
wants to merge 26 commits
Conversation

Contributor

@knizhnik knizhnik commented Nov 9, 2023

Problem

This is preliminary implementation of #5508

Summary of changes

Compute now calculates a hash of the buffer tag and sends the request to the corresponding shard.

Open issues:

  1. I currently implement the hash in this way:
#if PG_MAJORVERSION_NUM < 16
	hash = murmurhash32(tag->rnode.spcNode);
	hash = hash_combine(hash, murmurhash32(tag->rnode.dbNode));
	hash = hash_combine(hash, murmurhash32(tag->rnode.relNode));
	hash = hash_combine(hash, murmurhash32(tag->blockNum/STRIPE_SIZE));
#else
	hash = murmurhash32(tag->spcOid);
	hash = hash_combine(hash, murmurhash32(tag->dbOid));
	hash = hash_combine(hash, murmurhash32(tag->relNumber));
	hash = hash_combine(hash, murmurhash32(tag->blockNum/STRIPE_SIZE));
#endif

It is not consistent with how the hash is currently calculated in Rust.
I think we should change the Rust implementation to make it compatible with Postgres.

A separate question is how the block number should be hashed.
There was a discussion about it: #5432 (comment)

So in any case the Rust and C hash implementations should be made consistent... (a sketch of the resulting page-to-shard mapping follows this list)

  2. Stripe size is currently hardcoded; should it be taken from the control plane?
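
For illustration, here is a minimal sketch of how the hash above could be turned into a shard number. The function name tag_to_shard, the n_shards parameter, and the STRIPE_SIZE value are assumptions for the example, not the final code of this PR:

#include "postgres.h"
#include "common/hashfn.h"            /* murmurhash32, hash_combine */
#include "storage/buf_internals.h"    /* BufferTag (PG16 field names) */

#define STRIPE_SIZE 32768             /* hypothetical value; hardcoded in this PR */

/* Sketch only: map a buffer tag to a shard by hashing the relation identity plus the stripe index. */
static uint32
tag_to_shard(const BufferTag *tag, uint32 n_shards)
{
	uint32		hash;

	hash = murmurhash32(tag->spcOid);
	hash = hash_combine(hash, murmurhash32(tag->dbOid));
	hash = hash_combine(hash, murmurhash32(tag->relNumber));
	hash = hash_combine(hash, murmurhash32(tag->blockNum / STRIPE_SIZE));

	return hash % n_shards;
}

Because blockNum is divided by STRIPE_SIZE before hashing, all blocks of one stripe hash to the same shard; this is the property the Rust side would need to reproduce for compatibility.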

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@knizhnik knizhnik requested a review from a team as a code owner November 9, 2023 06:18
@knizhnik knizhnik requested review from conradludgate and removed request for a team November 9, 2023 06:18
@vadim2404 vadim2404 requested a review from jcsp November 9, 2023 07:36
@jcsp jcsp changed the title from "Add support for PS shardoing in compute" to "Add support for PS sharding in compute" Nov 9, 2023
@jcsp jcsp added c/cloud/compute t/feature Issue type: feature, for new features or requests labels Nov 9, 2023
@knizhnik knizhnik force-pushed the compute_sharding_support branch from 0aeba9f to 59ec475 Compare November 9, 2023 17:33

github-actions bot commented Nov 9, 2023

2184 tests run: 2099 passed, 0 failed, 85 skipped (full report)


Flaky tests (3)

Postgres 16

Postgres 15

  • test_statvfs_pressure_usage: debug
  • test_delete_timeline_client_hangup: release

Code coverage (full report)

  • functions: 55.0% (9711 of 17672 functions)
  • lines: 82.2% (55813 of 67887 lines)

The comment gets automatically updated with the latest test results.
8de5b63 at 2023-12-20T07:53:46.499Z

@knizhnik knizhnik force-pushed the compute_sharding_support branch from eedc4d1 to c6ac0a1 Compare November 29, 2023 10:07
knizhnik pushed a commit that referenced this pull request Dec 2, 2023
@knizhnik knizhnik force-pushed the compute_sharding_support branch from 11367b2 to 1e26621 Compare December 3, 2023 06:36
jcsp pushed a commit that referenced this pull request Dec 4, 2023
jcsp pushed a commit that referenced this pull request Dec 5, 2023
Collaborator

jcsp commented Dec 8, 2023

@knizhnik please can you resolve the merge conflicts.

@knizhnik knizhnik force-pushed the compute_sharding_support branch from 8a6655a to 54a0b6a Compare December 8, 2023 15:01
Contributor

@hlinnaka hlinnaka left a comment


Is there a single prefetch ring for all shards? The responses might come in different order across the pageservers, so I would assume there to be a separate prefetch ring for each shard. Connections to different pageservers will also be lost and re-established at different times.

prfh_hash *prf_hash;
int max_shard_no;
uint8 shard_bitmap[(MAX_SHARDS + 7)/8];
Contributor


What is shard_bitmap?

Contributor Author


We have a number of buffered prefetch requests for different shards. Before waiting for a response we need to flush all pending requests, and to do that we need to know which shards are affected. That is why I am using this bitmap: the corresponding bit is set when we send a request to a shard and cleared by the flush.
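
For illustration, a minimal sketch of how such a bitmap could be maintained; the helper names and the MAX_SHARDS value are hypothetical, and only the shard_bitmap declaration comes from the actual diff:

#define MAX_SHARDS 128	/* hypothetical cap; the diff only declares shard_bitmap[(MAX_SHARDS + 7)/8] */

/* One bit per shard: set when a prefetch request is sent, cleared on flush. */
static uint8 shard_bitmap[(MAX_SHARDS + 7) / 8];

static inline void
shard_mark_pending(int shard_no)
{
	shard_bitmap[shard_no / 8] |= (uint8) (1 << (shard_no % 8));
}

static inline bool
shard_has_pending(int shard_no)
{
	return (shard_bitmap[shard_no / 8] & (1 << (shard_no % 8))) != 0;
}

static inline void
shard_clear_pending(int shard_no)
{
	shard_bitmap[shard_no / 8] &= (uint8) ~(1 << (shard_no % 8));
}

Before waiting on any prefetch response, the code would iterate over the set bits and flush the connection to each marked shard.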

Contributor Author


Is there a single prefetch ring for all shards? The responses might come in different order across the pageservers, so I would assume there to be a separate prefetch ring for each shard. Connections to different pageservers will also be lost and re-established at different times.

No, there is still a single prefetch ring for all shards, but each prefetch slot stores a shard number. When we wait on a slot, we wait for the response from that particular shard. @jcsp said that we may have up to a hundred shards in the future; having a separate ring for each shard would be very inefficient in that case.
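
As an illustration of that design (field names here are hypothetical, not the exact struct in pagestore_smgr.c), each slot in the single ring simply remembers which shard it is waiting on:

#include "storage/buf_internals.h"	/* BufferTag */
#include "access/xlogdefs.h"		/* XLogRecPtr */

/* Sketch only: one slot of the shared prefetch ring. */
typedef struct PrefetchSlot
{
	BufferTag	buftag;		/* page being prefetched */
	uint16		shard_no;	/* shard the request was sent to */
	XLogRecPtr	request_lsn;	/* LSN at which the request was issued */
	bool		received;	/* response already consumed? */
} PrefetchSlot;

Waiting on a slot then means waiting on the connection to slot->shard_no only.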

Collaborator

jcsp commented Dec 12, 2023

@knizhnik found a couple more log lines that need to be made shard-aware:

2023-12-12 12:07:14.042 GMT [231054] ERROR:  could not read block 0 in rel 1664/0/1262.0 from page server at lsn 0/014FEDE8
2023-12-12 12:07:14.042 GMT [231054] DETAIL:  page server returned error: Request routed to wrong shard

(either in this PR or a followup)

Collaborator

jcsp commented Dec 12, 2023

@knizhnik please cherry pick 2f2a865 to update the hashing to match what's in the pageserver. This is the change that enables initdb with a template to work.

@knizhnik
Contributor Author

@knizhnik please cherry pick 2f2a865 to update the hashing to match what's in the pageserver. This is the change that enables initdb with a template to work.

Done

@knizhnik
Contributor Author

@knizhnik found a couple more log lines that need to be made shard-aware:

2023-12-12 12:07:14.042 GMT [231054] ERROR:  could not read block 0 in rel 1664/0/1262.0 from page server at lsn 0/014FEDE8
2023-12-12 12:07:14.042 GMT [231054] DETAIL:  page server returned error: Request routed to wrong shard

(either in this PR or a followup)

Fixed

@knizhnik knizhnik force-pushed the compute_sharding_support branch from f994cb1 to 322dd3c Compare December 12, 2023 13:55
@knizhnik knizhnik force-pushed the compute_sharding_support branch from 9458996 to b43ae6e Compare December 19, 2023 07:22
@vadim2404
Contributor

@hlinnaka, @jcsp, can you review it again? It seems that it's done.

Collaborator

jcsp commented Dec 19, 2023

I don't have any open threads, but I'm not the right person to green tick this -- I think @hlinnaka is the right approver for this.

@vadim2404
Contributor

@hlinnaka, only your review is left 🙏

Collaborator

jcsp commented Dec 19, 2023

This still isn't running CI:

Expected postgres-v14 rev to be at 'null', but it is at '03358bb0b5e0d33c238710139e768db9e75cfcc8'
Expected postgres-v15 rev to be at 'de8242c400f7870084861ac5796e0b5088b1898d', but it is at 'a2dc225ddfc8cae1849aa2316f435c58f0333d8c'
Please update vendors/revisions.json if these changes are intentional
Error: Process completed with exit code 1.

Collaborator

jcsp commented Dec 19, 2023

The commit to tolerate empty pageserver strings is not making the test pass: it still hits an assertion:

TRAP: failed Assert("i > 0"), File: "/home/neon/neon//pgxn/neon/libpagestore.c", Line: 642, PID: 1407128

Please use the test branch that I DM'd you earlier today to test your changes.

@knizhnik
Contributor Author

I failed to rebase this PR, so I have to replace it with a new one: #6205

@knizhnik knizhnik closed this Dec 20, 2023
knizhnik added a commit that referenced this pull request Jan 25, 2024
refer #5508

replaces #5837

## Problem

This PR implements sharding support on the compute side. Relations are
split into stripes, and `get_page` requests are redirected to the
particular shard where the stripe is located. All other requests (e.g. get
relation or database size) are always sent to shard 0.

## Summary of changes

Support for sharding on the compute side includes three things (an illustrative routing sketch follows this list):
1. Make it possible to specify, and change at runtime, connections to more
than one page server
2. Send `get_page` requests to the particular shard (determined by a hash
of the page key)
3. Support multiple servers in prefetch ring requests
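
For illustration, a sketch of the routing rule described above; the names choose_pageserver, page_servers, and n_shards are assumptions for the example (not the actual neon extension API), and tag_to_shard is the illustrative hash-to-shard helper sketched earlier on this page:

#include "storage/buf_internals.h"	/* BufferTag */

typedef struct PageServer PageServer;	/* one connection handle per shard */

extern PageServer page_servers[];
extern uint32	  n_shards;		/* assumed to come from the connection-string configuration */

/* Sketch only: pick the connection an outgoing request should use. */
static PageServer *
choose_pageserver(bool is_getpage, const BufferTag *tag)
{
	if (is_getpage)
		return &page_servers[tag_to_shard(tag, n_shards)];	/* shard owning this page's stripe */

	return &page_servers[0];	/* relation/database size requests always go to shard 0 */
}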

## Checklist before requesting a review

- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.

## Checklist before merging

- [ ] Do not forget to reformat commit message to not include the above
checklist

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
Co-authored-by: John Spray <[email protected]>
Co-authored-by: Heikki Linnakangas <[email protected]>
@stepashka stepashka added the c/compute Component: compute, excluding postgres itself label Jun 21, 2024