Snapshot publishing is incorrect and inefficient #1

Open
cer opened this issue Jun 20, 2018 · 1 comment

Comments

@cer
Contributor

cer commented Jun 20, 2018

There are two problems with the current approach:

  • Publishing SnapshotOffsetEvent to Kafka doesn't handle the scenario where there are not-yet-published Customer/Order events in the database (binlog). Consequently, the SnapshotOffsetEvent might be followed by those events. The view service will then consume those older events after the snapshot offset, which will confuse it.

  • Publishing Customer/OrderSnapshot events via the database is inefficient. It would be much better to publish them directly to Kafka.

I propose the following algorithm:

  1. Lock tables
  2. Get the binlog offset from the database - can this be done for both MySQL and Postgres?
  3. Wait for the CDC to process up to that offset - the CDC has an API to retrieve its binlog position. At this point, because the MESSAGE table is locked, there are no more events to be published.
  4. Publish all messages directly to Kafka
  5. Unlock tables
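
A minimal sketch of the five steps above, in Python for brevity. Everything named here (`publish_snapshots`, `FakeDb`, `FakeCdc`, `poll`, `max_polls`) is hypothetical and not part of the existing code base. The database, the CDC and Kafka are replaced by in-memory stand-ins, and binlog offsets are modelled as plain integers, which is an assumption made only to keep the ordering logic visible.

```python
from typing import Callable, Iterable, List


def publish_snapshots(db, cdc, publish: Callable[[str], None],
                      snapshots: Iterable[str],
                      poll: Callable[[], None] = lambda: None,
                      max_polls: int = 1000) -> None:
    """Publish snapshots only after the CDC has drained the binlog."""
    db.lock_tables()                          # 1. Lock tables
    try:
        target = db.current_binlog_offset()   # 2. Get binlog offset
        for _ in range(max_polls):            # 3. Wait for the CDC to catch up
            if cdc.processed_offset() >= target:
                break
            poll()                            # e.g. sleep, or drive a fake CDC
        else:
            raise TimeoutError("CDC did not reach the target binlog offset")
        for message in snapshots:             # 4. Publish directly to Kafka
            publish(message)
    finally:
        db.unlock_tables()                    # 5. Unlock tables, even on failure


class FakeDb:
    """Hypothetical stand-in for the database; the offset is a plain int."""

    def __init__(self, offset: int) -> None:
        self.offset = offset
        self.locked = False

    def lock_tables(self) -> None:
        self.locked = True

    def unlock_tables(self) -> None:
        self.locked = False

    def current_binlog_offset(self) -> int:
        return self.offset


class FakeCdc:
    """Hypothetical stand-in for the CDC: each step publishes one pending event."""

    def __init__(self, topic: List[str], pending: Iterable[str]) -> None:
        self.topic = topic
        self.pending = list(pending)
        self.offset = 0

    def processed_offset(self) -> int:
        return self.offset

    def step(self) -> None:
        if self.pending:
            self.topic.append(self.pending.pop(0))
            self.offset += 1


topic: List[str] = []                         # stands in for the Kafka topic
db = FakeDb(offset=2)                         # two events not yet published
cdc = FakeCdc(topic, ["CustomerCreated", "OrderCreated"])
publish_snapshots(db, cdc, topic.append,
                  ["CustomerSnapshot", "OrderSnapshot"], poll=cdc.step)
print(topic)
```

In this sketch the two pending events reach the topic before either snapshot message, which is the ordering the proposal is after. The `try`/`finally` is a design choice of the sketch: it releases the lock even if the wait times out or publishing fails.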
@dartartem
Contributor

I will research this.
