
layer file creation: fatal_err on timeline dir fsync #6985

Conversation


@problame problame commented Mar 1, 2024

As pointed out in the comments added in this PR:
the in-memory state of the filesystem already has the layer file in its final place.
If the fsync fails, but pageserver continues to execute, it's quite easy
for subsequent pageserver code to observe the file being there and
assume it's durable, when it really isn't.

It can happen that we get ENOSPC during the fsync.
However,

  1. the timeline dir is small (remember, the big layer file has already been synced).
    Small data means ENOSPC due to delayed allocation races etc. is less likely.
  2. what else are we going to do in that case?

If we decide to bubble up the error, the file remains on disk.
We could try to unlink it and fsync after the unlink.
If that fails, we would definitely need to error out.
Is it worth the trouble though?

Side note: all this logic about not carrying on after fsync failure
implies that we sync the filesystem successfully before we restart
the pageserver. We don't do that right now, but should (=> #6989)

part of #6663
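The rationale above can be illustrated with a minimal Rust sketch. The helper name and error-handling shape are hypothetical (the actual pageserver code differs); it only shows the pattern the PR adopts: after the layer file is in place, an fsync failure on the timeline directory must not be returned to callers, because continuing would let later code observe a file that may not be durable.

```rust
use std::fs::File;
use std::path::Path;

/// Hypothetical helper, not the actual pageserver API: fsync the
/// timeline directory and treat any failure as fatal, because at this
/// point the layer file is already visible in the directory and later
/// code would otherwise assume it is durable.
fn fsync_timeline_dir_fatal(timeline_dir: &Path) {
    let res = File::open(timeline_dir).and_then(|dir| dir.sync_all());
    if let Err(e) = res {
        // fatal_err semantics: do not bubble the error up; terminate so
        // the file cannot be mistaken for durable state.
        eprintln!("fatal: fsync of timeline dir {:?} failed: {}", timeline_dir, e);
        std::process::abort();
    }
}
```

Aborting rather than erroring sidesteps the unlink-and-fsync-again cleanup question discussed above: the process never runs with a possibly-non-durable file visible in the directory.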

problame added 4 commits March 1, 2024 11:56
The `writer.finish()` methods already fsync the inode,
using `VirtualFile::sync_all()`.

All that the callers need to do is fsync their directory,
i.e., the timeline directory.

Note that there's a call in the new compaction code that
is apparently dead at runtime, so I couldn't fix up
any fsyncs there [Link](https://github.com/neondatabase/neon/blob/502b69b33bbd4ad1b0647e921a9c665249a2cd62/pageserver/src/tenant/timeline/compaction.rs#L204-L211).

In the grand scheme of things, layer durability probably doesn't
matter anymore because the remote storage is authoritative at all times
as of #5198. But let's not break the discipline in this commit.

part of #6663
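The two-step durability discipline the commit message describes can be sketched as follows. This is an illustrative standalone version, not the pageserver's `VirtualFile`-based code: step 1 corresponds to what `writer.finish()` already does (fsync the file's contents and inode), step 2 is what the callers must add (fsync the containing timeline directory so the new directory entry is durable too).

```rust
use std::fs::{File, OpenOptions};
use std::io::Write;
use std::path::Path;

/// Illustrative sketch of the pattern: fsync the layer file itself,
/// then fsync its parent directory. Names are hypothetical.
fn write_layer_durably(dir: &Path, name: &str, bytes: &[u8]) -> std::io::Result<()> {
    let path = dir.join(name);
    let mut f = OpenOptions::new()
        .create(true)
        .write(true)
        .truncate(true)
        .open(&path)?;
    f.write_all(bytes)?;
    f.sync_all()?; // step 1: file contents + inode are durable (writer.finish())
    File::open(dir)?.sync_all()?; // step 2: directory entry is durable (caller's job)
    Ok(())
}
```

Without step 2, a crash can leave the file's data on disk but the directory entry missing, which is exactly why the directory fsync cannot be skipped even after the file itself has been synced.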
…kio-epoll-uring/layer-write-path-fsync-cleanups
…sync-cleanups' into problame/integrate-tokio-epoll-uring/create-layer-fatal-err-on-fsync
@problame problame requested review from jcsp and koivunej March 1, 2024 12:41
@problame problame requested a review from a team as a code owner March 1, 2024 12:41

@koivunej koivunej left a comment


This is looking fine. Because we haven't yet added these layers to the upload queue and then uploaded a new version of index_part.json, we will delete them on the next restart (and not fsync while doing that).


github-actions bot commented Mar 1, 2024

2484 tests run: 2362 passed, 0 failed, 122 skipped (full report)


Flaky tests (1)

Postgres 16

  • test_crafted_wal_end[last_wal_record_crossing_segment]: release

Code coverage* (full report)

  • functions: 28.7% (6936 of 24179 functions)
  • lines: 47.2% (42530 of 90144 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
00bf050 at 2024-03-04T12:26:25.853Z :recycle:

problame added 2 commits March 1, 2024 15:25
…kio-epoll-uring/layer-write-path-fsync-cleanups
…sync-cleanups' into problame/integrate-tokio-epoll-uring/create-layer-fatal-err-on-fsync
Base automatically changed from problame/integrate-tokio-epoll-uring/layer-write-path-fsync-cleanups to main March 4, 2024 11:33
@problame problame enabled auto-merge (squash) March 4, 2024 11:38
@problame problame merged commit c861d71 into main Mar 4, 2024
50 checks passed
@problame problame deleted the problame/integrate-tokio-epoll-uring/create-layer-fatal-err-on-fsync branch March 4, 2024 12:18