Commit 72403ad: Add new approach
Signed-off-by: Shahryar Soltanpour <[email protected]>
sh-soltanpour committed Dec 28, 2023 (1 parent b09953e)
001-cdc/001---cdc.md: 18 additions, 3 deletions

This image shows a high-level design of the system implemented in this approach.

For the subscriber/consumer part, we can try using the [Confluent PostgreSQL sink connector](https://docs.confluent.io/cloud/current/connectors/cc-postgresql-sink.html), and if there are any problems in the scale with this connector, we can consider replacing it with our in-house consumer in the future.
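As a rough illustration, a Confluent Cloud sink connector is configured with a JSON payload along these lines. The field names and values below (connection host, topic, user, insert mode) are assumptions for illustration only and should be checked against the linked Confluent documentation before use:

```json
{
  "connector.class": "PostgresSink",
  "name": "cdc-target-sink",
  "topics": "cdc.public.users",
  "input.data.format": "JSON",
  "connection.host": "target-db.example.com",
  "connection.port": "5432",
  "connection.user": "cdc_writer",
  "db.name": "target",
  "insert.mode": "UPSERT",
  "tasks.max": "1"
}
```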

### Approach 3: Implement our own publisher and subscriber in Go, using PostgreSQL's logical replication
In this approach, we implement something similar to the Debezium server/engine ourselves, using tools such as [wal2json](https://github.com/eulerto/wal2json) (a PostgreSQL logical decoding output plugin) and [wal-g](https://github.com/wal-g/wal-g) (a backup/restore tool written in Go).
The steps in this approach would be:
1. Using wal-g, we create a base backup of the source database and restore it to the target database. From this point on, the target database must be read-only for all users other than our plugin user.
2. Using wal2json, we create a stream of transactions from the source database, starting from the point at which the base backup was taken.
3. A consumer reads the stream and applies the transactions to the target database.
4. At some point, the target database becomes nearly in sync with the source database, meaning that fewer than a threshold number of transactions remain in the queue. At this point, we can say the databases are 'ready' for migration.
5. When the databases are 'ready' and the user confirms, we put the source database in read-only mode, and the consumer processes the remaining transactions until none remain.

While the consumer is applying the remaining transactions and the source DB is in read-only mode, some write queries may still arrive at the database. I can think of two ways to handle these queries:
1. Reject the queries and return an error to the client.
This approach causes a brief glitch and downtime, but since we already know that very few transactions remain, the downtime will be very short.

2. Make the write queries wait until the migration is done.
Again, since the number of remaining transactions is very low, this approach ensures all queries are processed and responded to, though responses may take slightly longer than the regular response time.
6. At this point, we have completely migrated to the target database, and the source database can be removed from the system.
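Steps 2–4 above can be sketched in Go. This is a minimal, hypothetical illustration, not the plugin's actual code: it parses one wal2json (format version 1) `"change"` message, renders an insert as SQL for the target database, and checks the queue-size threshold from step 4. Real code would read from the replication slot, handle updates/deletes via `oldkeys`, and quote values properly.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Change mirrors one entry in wal2json's "change" array (format version 1).
type Change struct {
	Kind         string        `json:"kind"` // "insert", "update", or "delete"
	Schema       string        `json:"schema"`
	Table        string        `json:"table"`
	ColumnNames  []string      `json:"columnnames"`
	ColumnValues []interface{} `json:"columnvalues"`
}

// Message is the top-level wal2json payload for one transaction.
type Message struct {
	Change []Change `json:"change"`
}

// toSQL renders an insert change as a SQL statement for the target database.
// Values are interpolated with %v for brevity; real code must quote/escape.
func toSQL(c Change) string {
	vals := make([]string, len(c.ColumnValues))
	for i, v := range c.ColumnValues {
		vals[i] = fmt.Sprintf("%v", v)
	}
	return fmt.Sprintf("INSERT INTO %s.%s (%s) VALUES (%s);",
		c.Schema, c.Table,
		strings.Join(c.ColumnNames, ", "), strings.Join(vals, ", "))
}

// ready implements step 4: the databases are 'ready' for migration when
// fewer than threshold transactions remain in the queue.
func ready(queued, threshold int) bool {
	return queued < threshold
}

func main() {
	raw := `{"change":[{"kind":"insert","schema":"public","table":"users",
	           "columnnames":["id","name"],"columnvalues":[1,"alice"]}]}`
	var m Message
	if err := json.Unmarshal([]byte(raw), &m); err != nil {
		panic(err)
	}
	for _, c := range m.Change {
		fmt.Println(toSQL(c)) // INSERT INTO public.users (id, name) VALUES (1, alice);
	}
	fmt.Println(ready(3, 100)) // true: few enough transactions queued
}
```

The consumer in step 3 would run `toSQL` output against the target database inside a loop, and the migration controller would poll `ready` before prompting the user to confirm step 5.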

This proposal outlines the core features, implementation details, and potential paid offerings for the "CDC" plugin. The development and integration of this feature align with the goal of providing a seamless and efficient solution for database migration within the GatewayD framework.
## 8. Selected Approach
TBD
