Solve multi-versioning persistence problem #412

Open
miloszm opened this issue Dec 5, 2024 · 0 comments
miloszm commented Dec 5, 2024

Summary

Currently, each commit holds its own Merkle-tree positions file, which incurs storage overhead.
A scheme for eliminating this extra per-commit storage is needed.

Possible solution design or implementation

One option is to modify the "main" data section in such a way that the parts belonging to outstanding commits are preserved. When finalising, some memory files in the "main" data section will be overwritten even though they still belong to active commits. This is easy to alleviate by copying these "main" files to commit-specific data segments (only baseless commits and commits whose base is the current commit need to be considered). The caveat is that after such copying we need to refresh any "contract" objects that may be part of a session. This can only be done effectively between sessions (at session start-up, to be exact), not while a session is running; doing it earlier would invalidate sessions that are in progress. Hence, the changes need to be either delayed until session start-up or arranged (for example by duplicating the affected parts) so that they do not conflict with running sessions.

Another, much preferred, option is to leave the memory files in place and duplicate them to another location, which from then on is considered the "main". This requires a mechanism for a floating (or moving) "main" surface (or edge). It eliminates the need for delays and for dealing with sessions, at the cost of additional complexity in the storage scheme.
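The first option (copying "main" files into commit-specific segments before they are overwritten) can be sketched roughly as follows. This is an in-memory illustration only, not the project's storage code: the `Commit` class, the `segment` field and the `finalise` function are hypothetical names, files are modelled as dictionary entries, and "current commit" is assumed to mean the commit being finalised.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass
class Commit:
    commit_id: str
    base: Optional[str] = None  # None models a "baseless" commit (assumption)
    # Hypothetical commit-specific data segment: file name -> contents.
    segment: Dict[str, bytes] = field(default_factory=dict)


def finalise(main: Dict[str, bytes], finalising: Commit,
             active: Iterable[Commit]) -> None:
    """Write `finalising` into main, preserving overwritten files for others."""
    # Files of "main" that are about to be replaced.
    overwritten = [name for name in finalising.segment if name in main]
    for commit in active:
        if commit is finalising:
            continue
        # Only baseless commits, or commits based on the finalised commit,
        # are considered.
        if commit.base is not None and commit.base != finalising.commit_id:
            continue
        for name in overwritten:
            # Keep the commit's own version if it already has one.
            commit.segment.setdefault(name, main[name])
    # Only now is it safe to overwrite "main".
    main.update(finalising.segment)
```

The sketch deliberately leaves out the hard part named above: refreshing "contract" objects held by a running session after the copy.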

Conclusion

After consideration, the moving "main" edge seems to be the direction to take. It needs to be elaborated into an exact specification, which is the subject of this issue.

Additional explanation

Currently, the "main" edge is flat: it consists of the contents of the "main"/"memory" folder. We need to devise a scheme that allows the edge to live in temporary subfolders which are created on demand and exist for as long as the outstanding commits that need them are alive. When an outstanding commit is finalised, the edge can be collapsed, yet we need to be careful about other commits that may still rely on it.
It is probably much simpler to have an edge which never collapses, with periodic garbage collection clearing the old layers. The edge would then always "move on", making it a "moving" edge rather than a "floating" one.
Each commit could have an "edge level" assigned to it, and the commit would stick to this level for as long as it lives.
Once no commits use a given level, the level can be removed.
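A minimal sketch of the "moving" edge with per-commit levels and periodic garbage collection might look like this. All names (`MovingEdge`, `pin`, `advance`, `release`, `collect_garbage`) are hypothetical, and each level is modelled as a full in-memory copy of the files; an actual on-disk layout would more likely use subfolders and store only the files that changed.

```python
from typing import Dict, List, Set


class MovingEdge:
    """Hypothetical model: the "main" edge only ever moves to a new level."""

    def __init__(self) -> None:
        self.current_level = 0
        # level -> {file name: contents}; stands in for per-level subfolders.
        self.levels: Dict[int, Dict[str, bytes]] = {0: {}}
        # level -> ids of commits pinned to that level.
        self.users: Dict[int, Set[str]] = {}
        self.commit_level: Dict[str, int] = {}

    def pin(self, commit_id: str) -> int:
        """Assign the current level to a commit; it keeps it for life."""
        level = self.current_level
        self.users.setdefault(level, set()).add(commit_id)
        self.commit_level[commit_id] = level
        return level

    def advance(self, changed: Dict[str, bytes]) -> int:
        """Create a new level instead of overwriting files in place."""
        files = dict(self.levels[self.current_level])
        files.update(changed)
        self.current_level += 1
        self.levels[self.current_level] = files
        return self.current_level

    def read(self, commit_id: str, name: str) -> bytes:
        """Read a file as seen from the level the commit is pinned to."""
        return self.levels[self.commit_level[commit_id]][name]

    def release(self, commit_id: str) -> None:
        """Called when a commit stops living (finalised or dropped)."""
        level = self.commit_level.pop(commit_id)
        self.users[level].discard(commit_id)

    def collect_garbage(self) -> List[int]:
        """Remove every non-current level that no commit uses any more."""
        dead = [lvl for lvl in self.levels
                if lvl != self.current_level and not self.users.get(lvl)]
        for lvl in dead:
            del self.levels[lvl]
            self.users.pop(lvl, None)
        return dead
```

Because a level is never modified after `advance`, a running session pinned to an older level is unaffected, which is the property the moving edge is meant to provide.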

Additional context

This is a future enhancement

@miloszm miloszm self-assigned this Dec 5, 2024
@miloszm miloszm changed the title Detach commit id from commit root Solve multi-versioning persistence problem Dec 10, 2024