feat: add an Arrow-based, columnar binlog buffer #91
link: vitess.io/vitess/go/hack: invalid reference to runtime.roundupsize
LGTM! This PR lays the foundation for seamless OLTP synchronization, making MyDuckServer as easy to use as Apple products. Cheers to Delta!
Thanks for your detailed review! The next step would be letting the delta appender buffer as much data as possible until one of the following occurs: 1) the memory usage grows too large; 2) one of the buffered tables is queried; or 3) a predefined ticker (e.g., every minute) fires.
Sounds great!
This PR introduces a columnar delta buffer to store the DML change logs replicated from the primary server. The change logs are transmitted via the ROWS_EVENT binlog events, which include INSERT, UPDATE, and DELETE operations.
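To illustrate the idea of a columnar delta buffer, here is a minimal sketch in which plain Go slices stand in for Arrow column builders. The `deltaBuffer` type, its two-column schema (`id`, `name`), and the choice to record an UPDATE as a delete of the before-image followed by an insert of the after-image are all assumptions made for the example, not a description of the code in this PR.

```go
package main

import "fmt"

// action tags each buffered row with the kind of change it represents.
type action uint8

const (
	actInsert action = iota
	actDelete
)

// deltaBuffer stores changes column by column instead of row by row.
// Index i across all three slices describes one buffered row image.
type deltaBuffer struct {
	actions []action
	ids     []int64
	names   []string
}

// appendRow adds one row image to every column.
func (b *deltaBuffer) appendRow(a action, id int64, name string) {
	b.actions = append(b.actions, a)
	b.ids = append(b.ids, id)
	b.names = append(b.names, name)
}

// Insert buffers the row carried by an INSERT rows event.
func (b *deltaBuffer) Insert(id int64, name string) { b.appendRow(actInsert, id, name) }

// Delete buffers the row carried by a DELETE rows event.
func (b *deltaBuffer) Delete(id int64, name string) { b.appendRow(actDelete, id, name) }

// Update buffers an UPDATE rows event as before-image delete plus
// after-image insert (an assumption of this sketch).
func (b *deltaBuffer) Update(oldID int64, oldName string, newID int64, newName string) {
	b.appendRow(actDelete, oldID, oldName)
	b.appendRow(actInsert, newID, newName)
}

// Len reports the number of buffered row images.
func (b *deltaBuffer) Len() int { return len(b.actions) }

func main() {
	var buf deltaBuffer
	buf.Insert(1, "alice")
	buf.Update(1, "alice", 1, "alicia")
	buf.Delete(1, "alicia")
	fmt.Println(buf.Len(), buf.actions, buf.ids, buf.names)
}
```

Because each column is contiguous, a whole batch can be handed to the analytical engine in one step instead of issuing one statement per row.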
This PR is the first step towards resolving issues #55 and #56. Currently, the buffer is flushed to DuckDB for each binlog event, which is not the intended usage. The primary goal of this PR is to ensure that the new components function as expected and pass the existing tests. I will improve the scheduling of buffer flushing in forthcoming PRs.
To enhance the performance of the columnar delta buffer, the following changes have been made:
- `binlog/rbr.go`: This modification minimizes copying and allocation, which is crucial for high performance.
- `replica/controller.go`: This is key to avoiding issue #55 (Row-by-Row INSERTs are very slow in DuckDB). This implementation can be further optimized to a zero-copy approach once PR #283 is merged.
- Switched from `dolthub/vitess` to the official `vitess.io/vitess` for binlog handling. The official repository has been refactored since the DoltHub fork and thus handles JSON data better and is easier to use. Since our project depends on `vitess.io/vitess` and `dolthub/go-mysql-server`, and the latter depends on `dolthub/vitess`, I have forked `go-mysql-server` and `dolthub/vitess` to our organization to resolve the conflicts.

This PR has passed all existing binlog replication tests.