It seems that neither time|space_cutoff nor latest_gc_cutoff is correct for determining the gc horizon for gc-compaction. Currently, it does not consider the case where the child branch has a retention period that is lower than the parent branch gc horizon. As a result, when gc-compaction is enabled, the child branch is not able to branch off within the retention period, and logical size computation fails.
main ------------
      |        |
child |        ^---------
      |                 ^now
      ^now-24hr is on the parent branch
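The situation in the diagram can be sketched as a small check: given the LSN that a child's retention window reaches back to and the LSN at which the child branched off, decide whether the cutoff still lies on the child or already on its ancestor. The names `CutoffLocation` and `locate_cutoff`, and the use of plain `u64` values for LSNs, are assumptions made for illustration and are not the pageserver's types. Treating a cutoff exactly at the branch point as being on the child is also an assumption.

```rust
/// Hypothetical classification of where a child's retention cutoff lands.
#[derive(Debug, PartialEq)]
enum CutoffLocation {
    /// The cutoff is at or after the branch point, so it lies on the child.
    OnChild,
    /// The cutoff is before the branch point, so it lies on the ancestor.
    OnAncestor,
}

/// Compare the child's cutoff LSN with its branch point LSN.
/// LSNs are modelled as plain u64 values here (an assumption).
fn locate_cutoff(child_cutoff_lsn: u64, branch_point_lsn: u64) -> CutoffLocation {
    if child_cutoff_lsn < branch_point_lsn {
        CutoffLocation::OnAncestor
    } else {
        CutoffLocation::OnChild
    }
}
```

In the diagram, now-24hr is earlier than the branch point, so this check would report `OnAncestor`: the data needed to serve the child's retention window is held by the parent branch.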
It seems that we have a tenant-level PiTR, but we still hit races where the gc_horizon obtained in gc-compaction causes logical size computation to get stuck. I assume this is because we are not using the same SystemTime::now() for all branches when computing the horizon.
Looking at the staging tenant, the race can be observed: the global cutoff is set to 86400s, so ideally we should get exactly the same time cutoff LSNs for all branches. But we get two different time cutoff LSNs on two branches:
2024-12-17T15:57:30.232055Z INFO gc_loop{tenant_id=12fd6e6d7a50bf7dd96154ec39b8b7c8 shard_id=0000}:run:gc_timeline{timeline_id=2365c8af38983a9c48e6e4df0c5ae767 cutoff=0/E4B979D0}: Nothing to GC: new_gc_cutoff_lsn 0/E4B979D0, latest_gc_cutoff_lsn 0/E4B979D0
2024-12-17T15:57:30.232041Z INFO gc_loop{tenant_id=12fd6e6d7a50bf7dd96154ec39b8b7c8 shard_id=0000}:run:gc_timeline{timeline_id=9136e295b2647dae2fc5e2a2abbb1dc6 cutoff=0/E4B96D18}: Nothing to GC: new_gc_cutoff_lsn 0/E4B96D18, latest_gc_cutoff_lsn 0/E4B96D18
This in turn causes a missing key error when computing logical size after running gc-horizon on these latest cutoff LSNs:
2024-12-18T21:35:46.442792Z ERROR synthetic_size_worker: failed to calculate synthetic size for tenant 12fd6e6d7a50bf7dd96154ec39b8b7c8: could not find data for key 010000000000000000000000000000000000 (shard ShardNumber(0)) at LSN 0/E4B839F1, request LSN 0/E4B839F0, ancestor 0/0
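If the assumption above about SystemTime::now() is right, one way to avoid the race is to take a single clock snapshot and derive every branch's time cutoff from it. The following is a minimal sketch of that idea, not the pageserver's actual code: `time_cutoffs`, the string timeline IDs, and the per-timeline PiTR map are hypothetical names introduced only for illustration.

```rust
use std::collections::HashMap;
use std::time::{Duration, SystemTime};

/// Hypothetical helper: derive one wall-clock cutoff per timeline from a
/// single `now` snapshot, so that timelines with the same PiTR interval
/// always get the same cutoff timestamp.
fn time_cutoffs(
    now: SystemTime,
    pitr_by_timeline: &HashMap<String, Duration>,
) -> HashMap<String, Option<SystemTime>> {
    pitr_by_timeline
        .iter()
        // `checked_sub` returns None if the result cannot be represented,
        // instead of panicking.
        .map(|(id, pitr)| (id.clone(), now.checked_sub(*pitr)))
        .collect()
}
```

The caller would read `SystemTime::now()` once per gc iteration and pass that value in. With the 86400s PiTR from the logs above, both timelines would then map to the same cutoff timestamp before it is translated into an LSN.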