[TiCDC MySQL DDLSink] reliable async add index when downstream is not TiDB #10267

Open
Tracked by #10343
zhangjinpeng87 opened this issue Dec 7, 2023 · 0 comments
Labels
type/feature Issues about a new feature

Comments

@zhangjinpeng87
Contributor

zhangjinpeng87 commented Dec 7, 2023

Is your feature request related to a problem?

Before #9701, every DDL event blocked DML events: when TiCDC received a DDL event, it executed the DDL downstream (MySQL/TiDB/Aurora) and waited for it to finish, which caused a large replication lag. #9701 introduced asyncAddIndex, which applies only when the downstream is TiDB. asyncAddIndex uses an asynchronous goroutine to execute ADD INDEX operations with retries. This works for TiDB because TiDB persists DDL jobs and keeps executing them even if TiCDC crashes. TiCDC currently has two issues when replicating an add index DDL event:

  • If the downstream is MySQL, TiCDC's replication is blocked by the time-consuming add index DDL event, which results in a large replication lag. MySQL also supports online DDL (https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-operations.html), which makes it possible to run DMLs while the index is being added and so reduce the replication lag.
  • Even when the downstream is TiDB, if asynchronously submitting the add index event to TiDB encounters an error and the checkpoint has already advanced, there is a risk of losing the add index DDL event if TiCDC crashes at that time.
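
To make the MySQL online DDL point concrete, here is a minimal Go sketch of how an add index statement could request a non-blocking index build explicitly. `withOnlineDDLHints` is a hypothetical helper for illustration only, not part of TiCDC; the `ALGORITHM=INPLACE, LOCK=NONE` clauses are standard MySQL `ALTER TABLE` syntax. The string matching is deliberately naive and only recognizes the `ALTER TABLE ... ADD INDEX` form.

```go
package main

import (
	"fmt"
	"strings"
)

// withOnlineDDLHints is a hypothetical helper (not a TiCDC function).
// It appends MySQL online-DDL clauses to an "ALTER TABLE ... ADD INDEX"
// statement so that MySQL either builds the index while allowing
// concurrent DML or rejects the statement outright.
// Any other statement is returned unchanged.
func withOnlineDDLHints(query string) string {
	q := strings.TrimRight(strings.TrimSpace(query), ";")
	upper := strings.ToUpper(q)
	if !strings.HasPrefix(upper, "ALTER TABLE") || !strings.Contains(upper, "ADD INDEX") {
		return query
	}
	// Respect clauses the upstream statement already carries.
	if strings.Contains(upper, "ALGORITHM=") || strings.Contains(upper, "LOCK=") {
		return query
	}
	return q + ", ALGORITHM=INPLACE, LOCK=NONE"
}

func main() {
	fmt.Println(withOnlineDDLHints("ALTER TABLE t ADD INDEX idx_a (a);"))
	fmt.Println(withOnlineDDLHints("DROP TABLE t"))
}
```

With explicit clauses, MySQL returns an error when it cannot satisfy the requested algorithm or lock level instead of silently using a more restrictive one. Note that the `ALTER TABLE` call itself still occupies its connection until the index is built, so DML replication would need to proceed on other connections.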

In disaster recovery scenarios, many users prefer MySQL/Aurora as the secondary database and use TiCDC to replicate data changes to it. Replication lag matters in this scenario, and the large lag caused by an add index DDL event is not acceptable.

Describe the feature you'd like

Introduce a reliable mechanism to persist add index and similar time-consuming DDL events and execute them asynchronously, whether the downstream is TiDB or MySQL/Aurora MySQL. This would eliminate the replication lag caused by add index events and the risk of losing an add index DDL event in some error cases.

Describe alternatives you've considered

No response

Teachability, Documentation, Adoption, Migration Strategy

No response
