Make it possible to determine WAL format at runtime #304

knizhnik · 2023-08-14T08:16:54Z

See neondatabase/neon#4761 (comment)
and https://neondb.slack.com/archives/C02712LTZKQ/p1691761757752739?thread_ts=1691760329.116189&cid=C02712LTZKQ

Most significant changes are:
- `xlog.c` refactoring: some code was moved to `xlogreader.c` and `xlogprefetcher.c`.
- `ThisTimeLineID` refactoring (4a92a1c and e997a0c), which affects walproposer code.
- `XLogFileInit` refactoring: multiple commits changed the function signature.
- Resolve initdb and pg_waldump neon-specific options that conflicted with the ones from PostgreSQL.


* Move backpressure throttling implementation to neon extension and add function for monitoring throttling time
* Update src/include/miscadmin.h

Co-authored-by: Heikki Linnakangas <[email protected]>


Commit a703269 replaced $(INSTALL) with plain "cp" for installing the server header files. It sped up "make install" significantly, because the old logic called $(INSTALL) separately for every header file, whereas plain "cp" could copy all the files in one command. However, we have long since made it a requirement that $(INSTALL) can also install multiple files in one command, see commit f1c5247. Switch back to $(INSTALL).

Discussion: https://www.postgresql.org/message-id/200503252305.j2PN52m23610%40candle.pha.pa.us
Discussion: https://www.postgresql.org/message-id/2415283.1641852217%40sss.pgh.pa.us


…ion because spec_token is not wal logged (#223)

* Pin pages with speculative insert tuples to prevent their reconstruction, because spec_token is not WAL-logged (refer #2587)
* Update src/backend/access/heap/heapam.c

Co-authored-by: Heikki Linnakangas <[email protected]>


Without this patch, on bootstrap XLP_FIRST_IS_CONTRECORD was always set in the header of the page where WAL writing continues. This confuses WAL decoding on safekeepers, making it think decoding starts in the middle of a record, leading to

2022-08-12T17:48:13.816665Z ERROR {tid=37}: query handler for 'START_WAL_PUSH postgresql://no_user:@localhost:15050' failed: failed to run ReceiveWalConn
Caused by:
    0: failed to process ProposerAcceptorMessage
    1: invalid xlog page header: unexpected XLP_FIRST_IS_CONTRECORD at 0/2CF8000

Rebase of a1af529 for v14.
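The failure mode in this commit message can be illustrated with a minimal sketch of the check a WAL decoder might make at a page boundary. `XLP_FIRST_IS_CONTRECORD` and its value `0x0001` come from PostgreSQL's `xlog_internal.h`; the `PageHeaderSketch` struct and the `page_header_matches_expectation` function are hypothetical names for illustration only, not the safekeeper's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bit from PostgreSQL's xlog_internal.h: the page begins with the
 * continuation of a record that started on an earlier page. */
#define XLP_FIRST_IS_CONTRECORD 0x0001

/* Hypothetical, trimmed-down page header: only the field this sketch needs. */
typedef struct
{
    uint16_t xlp_info;          /* flag bits */
} PageHeaderSketch;

/*
 * Returns true when the page header agrees with the decoder's expectation.
 * expect_contrecord is true only when the decoder is in the middle of a
 * record that spills over onto this page.
 */
bool
page_header_matches_expectation(const PageHeaderSketch *hdr,
                                bool expect_contrecord)
{
    bool has_contrecord = (hdr->xlp_info & XLP_FIRST_IS_CONTRECORD) != 0;

    return has_contrecord == expect_contrecord;
}
```

A decoder that starts at a record boundary would pass `expect_contrecord = false`. A page whose header carries the flag anyway fails the check, which corresponds to the "unexpected XLP_FIRST_IS_CONTRECORD" error quoted in the log.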

- Refactor the way the WalProposerMain function is called when started with --sync-safekeepers. The postgres binary now explicitly loads the 'neon.so' library and calls the WalProposerMain in it. This is simpler than the global function callback "hook" we previously used.
- Move the WAL redo process code to a new library, neon_walredo.so, and use the same mechanism as for --sync-safekeepers to call the WalRedoMain function, when launched with --walredo argument.
- Also move the seccomp code to neon_walredo.so library. I kept the configure check in the postgres side for now, though.


Previously, we called PrefetchBuffer [NBlkScanned * seqscan_prefetch_buffers] times in each of those situations, but now only NBlkScanned. In addition, the prefetch mechanism for the vacuum scans is now based on blocks instead of tuples - improving the efficiency.

Parallel seqscans didn't take their parallelism into account when determining which block to prefetch, and vacuum's cleanup scan didn't correctly determine which blocks would need to be prefetched, and could get into an infinite loop.


* Update prefetch mechanisms:
  - **Enable enable_seqscan_prefetch by default**
  - Store prefetch distance in the relevant scan structs
  - Slow start sequential scan, to accommodate LIMIT clauses.
  - Replace seqscan_prefetch_buffer with the relations' tablespaces' *_io_concurrency; and drop seqscan_prefetch_buffer as a result.
  - Clarify enable_seqscan_prefetch GUC description
  - Fix prefetch in pg_prewarm
  - Add prefetching to autoprewarm worker
  - Fix an issue where we'd incorrectly not prefetch data when hitting a table wraparound. The same issue also resulted in assertion failures in debug builds.
  - Fix parallel scan prefetching - we didn't take into account that parallel scans have scan synchronization, too.
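The "slow start" item can be pictured with a small sketch: the scan keeps a prefetch distance in its own state and ramps it up towards a target, so a query that stops early (for example because of a LIMIT) does not pay for a full-depth prefetch. The doubling policy, the `PrefetchStateSketch` struct and the `advance_prefetch_distance` function are assumptions made for illustration; the commit message does not say how the ramp-up is actually computed.

```c
/* Hypothetical per-scan prefetch state, as it might live in a scan struct. */
typedef struct
{
    unsigned distance;          /* how many blocks ahead we currently prefetch */
    unsigned target;            /* upper bound, e.g. taken from *_io_concurrency */
} PrefetchStateSketch;

/*
 * Advance the prefetch distance by one ramp-up step and return the new value.
 * Assumed policy: start at 1, double on each call, never exceed the target.
 */
unsigned
advance_prefetch_distance(PrefetchStateSketch *st)
{
    if (st->distance >= st->target)
        st->distance = st->target;      /* already at (or above) the cap */
    else if (st->distance == 0)
        st->distance = 1;               /* first step of the slow start */
    else
    {
        unsigned doubled = st->distance * 2;

        st->distance = (doubled > st->target) ? st->target : doubled;
    }
    return st->distance;
}
```

With a target of 8 the distance grows 1, 2, 4, 8 and then stays at 8; with a target of 0 prefetching stays disabled.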


- Prefetch the pages in index vacuum's sequential scans

  Implemented in NBTREE, GIST and SP-GIST. BRIN does not have a 2nd phase of vacuum, and both GIN and HASH clean up their indexes in a non-seqscan fashion: GIN scans the btree from left to right, and HASH only scans the initial buckets sequentially.


They will be handled in pageserver, ref neondatabase/neon#3706

This reverts commit ad5e789
This reverts commit 46c44e8

This does *not* revert commit 285cd13. We likely should do that, but check_restored_datadir_content complains about some diff in init fork contents after test_pg_regress; this should be sorted out.


* Recovery requirements: Add condition variable for WAL recovery, allowing backends to wait for recovery up to some record pointer.
* Fix issues w.r.t. WAL when LwLsn is initiated and when recovery starts. This fixes some test failures that showed up after updating Neon code to do more precise handling of replica's get_page_at_lsn's request_lsn lsns.

---------

Co-authored-by: Matthias van de Meent <[email protected]>


lubennikovaav and others added 30 commits August 9, 2023 15:31

fix regression tests

8af27f3

rebase to the latest origin and resolve conflicts

c146655

Remove contrib neon and neon_test_utils.

a41058d

Prevent access to uninitialized shared memory in InstallXLogFileSegmen…

e06d8c7

…t, which is used for safekeepers-sync

Remove Dockerfile, it's now in the neon repo

9b2b574

Merge last written cache lsn with new main branch (#201)

f5cb05b

Local prefetch implementation for Postgres 15

80512f1

Disabled by default. The plan is to merge this now, so that we can do performance testing quickly, and if it helps, rewrite and review it properly. Author: Konstantin Knizhnik

Set last written LSN for the created relation (#212)

54f74ad

Co-authored-by: Konstantin Knizhnik <[email protected]>

Update expected output for sysviews test because of changed default v…

5079178

…alue of enable_seqscan_prefetch

Undo disarming VM check warning in vacuumlazy.c (#214)

95f6956

Set Neon-specific FMGR_ABI_EXTRA

561af18

to support only extensions that were built against Neon PostgreSQL

Don't use newline in PG_VERSION file.

4ac76c3

Neon generates PG_VERSION files in a single format: just the major version number, without a trailing newline. Be consistent with it.
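As a minimal sketch of the format described here, the helper below produces the file contents as the bare major version with no trailing newline. The function name `format_pg_version_contents` is hypothetical and is not taken from the patch; it only illustrates the byte layout.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write the PG_VERSION contents (major version only, no trailing newline)
 * into buf. Returns the number of characters written, excluding the
 * terminating NUL, or -1 if the buffer is too small.
 */
int
format_pg_version_contents(char *buf, size_t buflen, int major_version)
{
    int n = snprintf(buf, buflen, "%d", major_version);

    if (n < 0 || (size_t) n >= buflen)
        return -1;
    return n;
}
```

For major version 15 this yields the two bytes `15` and nothing else, so a reader that compares the file contents byte for byte sees the same result regardless of which component wrote the file.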

Unset ArchiveRecoveryRequested for Neon code path.

a771474

No need to perform WAL recovery in Neon

Co-authored-by: Konstantin Knizhnik <[email protected]>

Fix memory leak in ApplyRecord

faa68d0

Rebase to Stamp 15.0

808318e

Fix shared memory initialization for last written LSN cache (#226)

534b38a

* Fix shared memory initialization for last written LSN cache. Replace (from,till) with (from,n_blocks) for SetLastWrittenLSNForBlockRange function
* Fast exit from SetLastWrittenLSNForBlockRange for n_blocks == 0

Fix upper boundary calculation in the chunks loop in SetLastWrittenLSN…

887fd35

…ForBlockRange (#229)
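The two commits above touch the same loop: a (from, n_blocks) range is mapped onto chunks of the last written LSN cache, with an early exit for empty ranges and an upper bound that has to include the chunk holding the last block. The sketch below shows one way such a loop can be written. The chunk size, the fixed-size array and every name in it are hypothetical stand-ins; the real cache structure is not shown in this PR page.

```c
#include <stdint.h>

#define CHUNK_BLOCKS 128u       /* hypothetical number of blocks per chunk */
#define CACHE_SLOTS  1024u      /* hypothetical number of cache slots */

typedef uint64_t LsnSketch;

/* Hypothetical cache: one "last written LSN" per chunk slot. */
static LsnSketch chunk_lsn[CACHE_SLOTS];

/*
 * Record lsn for blocks [from, from + n_blocks). Returns the number of
 * chunks visited, which makes the boundary handling easy to check.
 */
unsigned
set_last_written_lsn_for_block_range(LsnSketch lsn, uint32_t from,
                                     uint32_t n_blocks)
{
    uint64_t first_chunk;
    uint64_t last_chunk;
    uint64_t chunk;
    unsigned touched = 0;

    if (n_blocks == 0)
        return 0;               /* fast exit: nothing to record */

    first_chunk = from / CHUNK_BLOCKS;
    /* Inclusive upper bound: the chunk that contains the last block.
     * 64-bit arithmetic avoids overflow near the top of the block range. */
    last_chunk = ((uint64_t) from + n_blocks - 1) / CHUNK_BLOCKS;

    for (chunk = first_chunk; chunk <= last_chunk; chunk++)
    {
        LsnSketch *slot = &chunk_lsn[chunk % CACHE_SLOTS];

        if (*slot < lsn)
            *slot = lsn;        /* keep the newest LSN per chunk */
        touched++;
    }
    return touched;
}
```

A range of two blocks starting at block 127 straddles a chunk boundary and must visit two chunks, while 128 blocks starting at block 0 fit in exactly one. Off-by-one errors in the upper bound show up in precisely these cases.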

Misc cleanup, mostly to reduce unnecessary differences with upstream.

f34afc9

Fix indentation, remove unused definitions, resolve some FIXMEs.

Fix expected results for regression tests (#238)

90c4f24

Use prefetch in pg_prewarm extension (#237)

890a699

* Use prefetch in pg_prewarm extension
* Change prefetch order as suggested in review

Drop unlogged table in regress test to avoid noise in tests

debb925

knizhnik and others added 24 commits August 9, 2023 15:31

Maintain last written LSN for each page to enable prefetch on vacuum,… (

6f72c84

#245)

* Maintain last written LSN for each page to enable prefetch on vacuum, delete and other massive update operations
* Move PageSetLSN in heap_xlog_visible before MarkBufferDirty

Set lsn fix v15 (#252)

fc754e5

* Show prefetch statistic in EXPLAIN (refer #2994)
* Update heap page LSN in case of VM changes even if wal_redo_hints=off (refer #2807)
* Undo occasional changes
* Undo occasional changes

Show prefetch statistic in EXPLAIN (#249)

7cb2db7

* Show prefetch statistic in EXPLAIN (refer #2994)
* Collect per-node prefetch statistics
* Show number of prefetch duplicates in explain

Implement efficient prefetch for parallel bitmap heap scan (#258)

a52fb5a

* Implement efficient prefetch for parallel bitmap heap scan
* Change MAX_IO_CONCURRENCY to be power of 2

Unlogged index fix v15 (#262)

9fc4107

* Avoid errors when accessing indexes of unlogged tables after compute restart
* Support unlogged sequences
* Extract sequence start value from pg_sequence
* Initialize unlogged index under exclusive lock

Fix bitmap scan prefetch (#261)

7b55f30

Allow external main functions to skip config load and make last

6da0174

written LSN cache optional.

Remove walredo-related hacks from InternalIpcMemoryCreate()

a029c23

Now similar kind of hack (using malloc() instead of shmem) is done in the wal-redo extension.

Adjust prefetch target for parallel bitmap scan (#274)

78481ee

* Adjust prefetch target for parallel bitmap scan
* More fixes for parallel bitmap scan prefetch

Copy iterator result in BitmapHeapNext (#276)

8a6fd67

Prefetch for index and index-only scans (#271)

0e771f9

* Prefetch for index and index-only scans
* Remove debug logging
* Move prefetch_blocks array to the end of BTScanOpaqueData struct

Fix entering hot standby mode for Neon

48ebdaf

Do not allow users with CREATEROLE privilege to manage system user

ac6ca4d

groups.

Fix regression tests after the patch with CREATEROLE restrictions

38e3bd2

Add startup logs (#293)

f93c725

Make it possible to grant self created roles (#298)

4834eb7

Co-authored-by: Konstantin Knizhnik <[email protected]>

Update expected file for create_role test (#301)

362e841

* Make it possible to grant self created roles
* Update expected file for create_role test

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>

Define NEON_SMGR in smgr.h to make it possible for extensions to use …

fcd0bde

…extended Neon SMGR API (#300)

Co-authored-by: Konstantin Knizhnik <[email protected]>

Request extension files and libraries from compute_ctl

026d6b0

Make it possible to determine WAL format at runtime

bd06d52

Make handling t_cid more clear and less error prone and move t_cid fro…

f6ea27b

…m xl_multi_insert_tuple to xl_multi_insert

knizhnik mentioned this pull request Aug 15, 2023

Handle both Vanilla and Neon WAL formats neondatabase/neon#4991

Open

5 tasks

Support work with both multi_insert record formats in compatibility mode

2c76abf

tristan957 force-pushed the REL_15_STABLE_neon branch from bc88f53 to 24333ab Compare December 11, 2023 16:12

tristan957 force-pushed the REL_15_STABLE_neon branch from 81e16cd to 6ee78a3 Compare February 7, 2024 16:39

tristan957 force-pushed the REL_15_STABLE_neon branch 2 times, most recently from 3be8940 to e2dbd63 Compare May 20, 2024 14:48
