Incremental backup #22
@Jubilee101 If I have any questions during the process, I will post them here, along with any discussion.
Sounds good! Just let us know if you have ideas or problems.
I have researched the protocol. The implementation follows strict state management and error-handling practices, ensuring reliable communication between the client and server. It also supports asynchronous communication, allowing for complex multi-session interactions. Additionally, there are two packed interfaces. In summary, I will try to become familiar with our project's communication methods and explore using the protocol directly to send the manifest file, avoiding the need to create a table on the server side.
Hmm, are they using this protocol to send the manifest and receive backups? This is a bit weird, since I don't recall having to install pgbackrest on both sides. I don't think it's appropriate to set up connections on the extension side; it seems like a weird hacking of postgres extensions. Some other options would be: A. we send manifest entries one query at a time; B. the extension sends the manifest as a query response, one row per entry, and we do the comparison on main.

There is a chance that we don't need the manifest to know what changed. I'm looking at how postgres does it, and it seems that they use WAL purely. The idea is to get a WAL summary from the last backup's start LSN to the current backup's start LSN; files mentioned in WAL in that range are naturally subject to incremental backup. And here's a nice thing: we are going to have our WAL support within this GSoC, and we stream WAL segments to main all the time. So the whole "comparison" (it's not exactly a comparison now) could happen entirely on the main side.

Either way, we need to know what the start LSN (and end LSN as well, but one thing at a time) is for the current backup. And it is not easy, since I think postgres gets it by inducing a checkpoint, which I'm not sure we have the authority to do. I think you can put a pin on sending the manifest while Jesper and I think about the best way to address this, and look into getting the start LSN on the extension side. I think what pgbackrest does is wait until postgres does its next regular checkpoint, but it seems that it also has an option to start immediately (--start-fast). I'm interested to know what they are doing underneath.

We are still in the exploring phase of a hard problem, so I think it is OK that things are slow -- postgres itself waited until v17 🤷♂️. We can also get support for v17 going at the same time. Anyway, let us know if you have better ideas or problems, as always.
I think we should keep the last backup information in a deque (memory cached). That can be refreshed, and in most cases it is easier to make a "diff"... However, as @Jubilee101 pointed out, the LSN is the important factor, so that is the driver. The above is an optimization. I like B.
For B, we could create a similar function. The concern is that if the result is too large, we might encounter issues with this approach; it's uncertain whether it would still work effectively in such cases.
We should start with the core. This is a very big task, so we need to break it down, while keeping very large database clusters in mind.
Talk to @Jubilee101 to port the core data structures over such that building the JSON document is easy |
So, each time we perform a backup, we can read the latest backup's manifest.
Work on getting a GUC in place such that checksums are calculated based on the setting |
We can have "latest backup", and current checksums... we really need to know what has changed since the last backup |
But, core/ does the backup, so you need a way to receive the "official" manifest and keep it |
The core idea is that we can't afford to wait to calculate checksums on a PB-scale database cluster, so we need to keep the checksums current in memory. Most of the files won't change, and that is "guarded" by the file timestamp check, so each iteration should be "minimal".
The WAL summary sounds promising, but it needs some more time. So for now, maybe focus on porting the data structures (should be easy copy & paste) and getting the start/end LSN. As for the checksum cache, it's an optimization which is nice to have, so look at it if you have time, but it doesn't sound like a top priority to me.
Since the school year has started, I might not always be able to respond promptly. However, I'll do my best to stay on top of the project and complete it. I'm a little confused right now because I don't have a rough outline of the entire process. Could you clarify where we should define the data structures, and in which file? I did some research on retrieving the start/end LSN and found an internal PostgreSQL function; I think the value it returns might be the LSN we need if we use it before the backup. If we define a custom function, we can execute the logic on the main side:
Is this logic and my understanding correct?
That is quite OK; just keep us posted every week, and let us know if you get distracted by other business.
I'm thinking that in main it should be an option to backup, something like a flag. As for the major backup process, it should still happen in wf_backup.c, maybe with a new set of incrementalbackup_setup/execute/teardown functions. You don't have to worry about them now. Just have a rough picture in mind and focus on getting the APIs we are going to need in. We'll get to the assembly in the end.
Great! Have you checked internally how it works? According to https://www.interdb.jp/pg/pgsql10/01.html#1011-pg_backup_start-ver14-or-earlier-pg_start_backup, it needs to do a checkpoint first. That's what we worry about, since doing a checkpoint requires a pretty high level of privilege. I guess that's why pgbackrest by default waits for a regular checkpoint to happen naturally. But they do seem to have some way to start immediately.
Thank you for your explanation; it's much clearer now, and I have a rough outline. If there are any specific APIs you need, just create an issue and write down the requirements, and I'll do my best to implement them. As for the LSN, I believe the structure in PostgreSQL, XLogCtlData, is always maintained in shared memory and tracks various aspects of WAL processing. If we define our own function in the extension and use it without restricting the privilege, it should give us what we need.
Should we create the function for this?
Yes, we can try it and see if it works. We can always remove it as part of a cleanup before the release.
I'm starting to research and attempt to implement incremental backup. I plan to deliver this feature through three PRs: