Emit nbtree vacuum cycle id in nbtree xlog through FPIs
NBTree needs a vacuum cycle ID on pages whose split produced a new right
page located before the original page, and on pages that were split from
such pages during the current vacuum cycle. By WAL-logging the cycle ID and
restoring it during recovery, we ensure that vacuum does not fail to clean
up the earlier pages.

This fixes the PostgreSQL side of neondatabase/neon#9929.
MMeent committed Nov 29, 2024
1 parent 3c15b65 commit 31f850d
Showing 2 changed files with 44 additions and 1 deletion.
29 changes: 28 additions & 1 deletion src/backend/access/nbtree/nbtinsert.c
@@ -1494,6 +1494,7 @@ _bt_split(Relation rel, Relation heaprel, BTScanInsert itup_key, Buffer buf,
bool newitemonleft,
isleaf,
isrightmost;
uint16 old_cycleid;

/*
* origpage is the original page to be split. leftpage is a temporary
@@ -1559,6 +1560,8 @@ _bt_split(Relation rel, Relation heaprel, BTScanInsert itup_key, Buffer buf,
/* handle btpo_next after rightpage buffer acquired */
lopaque->btpo_level = oopaque->btpo_level;
/* handle btpo_cycleid after rightpage buffer acquired */
/* NEON: store the page's former cycle ID for FPI check later */
old_cycleid = oopaque->btpo_cycleid;

/*
* Copy the original page's LSN into leftpage, which will become the
@@ -1981,7 +1984,31 @@ _bt_split(Relation rel, Relation heaprel, BTScanInsert itup_key, Buffer buf,
XLogBeginInsert();
XLogRegisterData((char *) &xlrec, SizeOfBtreeSplit);

XLogRegisterBuffer(0, buf, REGBUF_STANDARD);
/*
* NEON: If we split to earlier pages during a btree vacuum cycle,
* then we have to include the cycle ID in the WAL record. The
* easiest method to do that is to force an image, which happens to
* be relatively cheap, as the data already contained in the record is
* enough to populate the new right page.
*
* We MUST log an FPI when:
* - The right page's blkno < the left page's blkno
* - The right page might be 'C' in a page split chain B > C > A after
* B split B > A => B > C > A; or B > C > D > A, etc. (as indicated
* by the presence of a cycle ID).
*/
if (lopaque->btpo_cycleid == 0 || (rightpagenumber > origpagenumber &&
lopaque->btpo_cycleid != old_cycleid))
{
/* no cycle ID is required */
XLogRegisterBuffer(0, buf, REGBUF_STANDARD);
}
else
{
/* cycle ID is required */
XLogRegisterBuffer(0, buf, REGBUF_FORCE_IMAGE | REGBUF_STANDARD);
}

XLogRegisterBuffer(1, rbuf, REGBUF_WILL_INIT);
/* Log original right sibling, since we've changed its prev-pointer */
if (!isrightmost)
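The decision made in the hunk above can be restated as a standalone predicate. The following is only a sketch under stated assumptions: `BlockNum`, `CycleId`, and `split_needs_fpi` are hypothetical names that do not appear in the patch, and the function simply negates the patch's "no cycle ID is required" condition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for PostgreSQL's BlockNumber and BTCycleId. */
typedef uint32_t BlockNum;
typedef uint16_t CycleId;

/*
 * Sketch: returns true when the split record should carry a full-page
 * image of the left (original) page so that its cycle ID survives replay.
 *
 * new_cycleid: cycle ID the left page has after the split
 * old_cycleid: cycle ID the original page had before the split
 */
static bool
split_needs_fpi(CycleId new_cycleid, CycleId old_cycleid,
                BlockNum origblk, BlockNum rightblk)
{
    /* No cycle ID set on the page: nothing needs to be preserved. */
    if (new_cycleid == 0)
        return false;

    /*
     * The right page lies after the original page and the original page
     * did not already carry this cycle ID: no image is required.
     */
    if (rightblk > origblk && new_cycleid != old_cycleid)
        return false;

    /* Right page is earlier, or the page is part of a split chain. */
    return true;
}
```

For example, under these assumptions `split_needs_fpi(7, 0, 10, 5)` is true because the new right page (block 5) precedes the original (block 10), while `split_needs_fpi(7, 0, 10, 11)` is false.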
16 changes: 16 additions & 0 deletions src/backend/access/nbtree/nbtxlog.c
@@ -431,6 +431,22 @@ btree_xlog_split(bool newitemonleft, XLogReaderState *record)
PageSetLSN(origpage, lsn);
MarkBufferDirty(buf);
}
else
{
/*
* btree split FPIs may contain important cycle IDs on the original
* page's FPI; make sure we correctly transfer this over
*/
Page opage;
BTPageOpaque oopaque;

Assert(BufferIsValid(buf));

opage = BufferGetPage(buf);
oopaque = BTPageGetOpaque(opage);

ropaque->btpo_cycleid = oopaque->btpo_cycleid;
}

/* Fix left-link of the page to the right of the new right sibling */
if (spagenumber != P_NONE)
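The redo-side change can be illustrated in isolation. This is a simplified sketch rather than the patch's code: `SketchOpaque` and `redo_transfer_cycleid` are hypothetical names, and the `needs_redo` flag is an assumed stand-in for whether replay took the branch that rebuilds the original page from the record, as opposed to the new `else` branch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced page trailer; the real type is BTPageOpaqueData. */
typedef struct SketchOpaque
{
    uint16_t cycleid;   /* stands in for btpo_cycleid */
} SketchOpaque;

/*
 * Sketch of the new else-branch: when the original page was not rebuilt
 * from the record contents (for instance because it came from a full-page
 * image), copy its cycle ID onto the newly built right page.
 */
static void
redo_transfer_cycleid(bool needs_redo,
                      const SketchOpaque *orig, SketchOpaque *right)
{
    if (!needs_redo)
        right->cycleid = orig->cycleid;
}
```

With an image that carries cycle ID 42, the right page ends up with cycle ID 42 as well; when the original page is rebuilt from the record, the right page's cycle ID is left untouched.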
