Skip to content

Commit

Permalink
Timeline offloading persistence (#9444)
Browse files Browse the repository at this point in the history
Persist timeline offloaded state to S3.

Right now, as of #8907, at each restart of the pageserver, all offloaded
state is lost, so we load the full timeline again. As it starts with an
empty local directory, we might potentially download some files again,
leading to downloads that are ultimately wasteful.

This patch adds support for persisting the offloaded state, allowing us
to never load offloaded timelines in the first place. The persistence
feature is facilitated via a new file in S3 that is tenant-global, which
contains a list of all offloaded timelines. It is updated each time we
offload or unoffload a timeline, and otherwise never touched.

This choice means that tenants where no offloading is happening will not
immediately get a manifest, keeping the change very minimal at the
start.

We leave generation support for future work. It is important to support
generations, as in the worst case, the manifest might be overwritten by
an older generation after a timeline has been unoffloaded (and
unarchived), so the next pageserver process instantiation might wrongly
believe that some timeline is still offloaded even though it should be
active.

Part of #9386, #8088
  • Loading branch information
arpad-m authored Oct 22, 2024
1 parent fcb55a2 commit 6f8fcdf
Show file tree
Hide file tree
Showing 11 changed files with 637 additions and 80 deletions.
2 changes: 2 additions & 0 deletions libs/pageserver_api/src/models.rs
Original file line number Diff line number Diff line change
Expand Up @@ -699,6 +699,8 @@ pub struct OffloadedTimelineInfo {
pub ancestor_timeline_id: Option<TimelineId>,
/// Whether to retain the branch lsn at the ancestor or not
pub ancestor_retain_lsn: Option<Lsn>,
/// The time point when the timeline was archived
pub archived_at: chrono::DateTime<chrono::Utc>,
}

/// This represents the output of the "timeline_detail" and "timeline_list" API calls.
Expand Down
2 changes: 2 additions & 0 deletions pageserver/src/http/routes.rs
Original file line number Diff line number Diff line change
Expand Up @@ -486,13 +486,15 @@ fn build_timeline_offloaded_info(offloaded: &Arc<OffloadedTimeline>) -> Offloade
timeline_id,
ancestor_retain_lsn,
ancestor_timeline_id,
archived_at,
..
} = offloaded.as_ref();
OffloadedTimelineInfo {
tenant_id: tenant_shard_id,
timeline_id,
ancestor_retain_lsn,
ancestor_timeline_id,
archived_at: archived_at.and_utc(),
}
}

Expand Down
Loading

1 comment on commit 6f8fcdf

@github-actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5238 tests run: 5023 passed, 1 failed, 214 skipped (full report)


Failures on Postgres 17

# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_deletion_queue_recovery[release-pg17-no-validate-lose]"
Flaky tests (1)

Postgres 17

Test coverage report is not available

The comment gets automatically updated with the latest test results
6f8fcdf at 2024-10-22T21:41:15.267Z :recycle:

Please sign in to comment.