fix: Avoid creation of workflows with non-empty tables in target keyspace #16874
…pace Signed-off-by: Noble Mittal <[email protected]>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #16874 +/- ##
==========================================
+ Coverage 69.34% 69.36% +0.02%
==========================================
Files 1571 1571
Lines 204179 204221 +42
==========================================
+ Hits 141586 141668 +82
+ Misses 62593 62553 -40
Looking good! Can you address the comments and also update the description with more detail on how you have implemented the PR?
@@ -480,8 +480,8 @@ func (tmc *fakeTMClient) VReplicationExec(ctx context.Context, tablet *topodatap
}
for qry, res := range tmc.vreQueries[int(tablet.Alias.Uid)] {
Why did you need to make this change from `regexp.MustCompile(qry)` to `regexp.MustCompile(qry[1:])`?
It is not correct afaik: this is supposed to allow regexp queries, so the `/` at the beginning should be required.
Your second fix for matching `query` instead of `qry` is a good catch.
We need to remove the `/` at the beginning: Go's `regexp` package does not need it for matching, so it should be stripped before the pattern is compiled. Leaving it in was probably a mistake. We already have similar logic in our codebase, e.g. https://github.com/vitessio/vitess/blob/main/go/vt/vtctl/workflow/framework_test.go#L450-L463
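The convention being discussed can be sketched roughly as follows. This is a minimal illustration, not the code from `fakeTMClient`: the helper name `matchExpected` and the sample queries are hypothetical. It assumes that an expected query starting with `/` is a regular expression, that the `/` is only a marker, and that the marker is stripped before `regexp.MustCompile` is called, with the match run against the incoming query.

```go
package main

import "regexp"

// matchExpected reports whether the incoming query satisfies the expected
// query. Hypothetical helper: an expected query that starts with '/' is
// treated as a regular expression, and the '/' itself is only a marker.
func matchExpected(expected, query string) bool {
	if len(expected) > 0 && expected[0] == '/' {
		// Strip the marker before compiling. Go's regexp syntax has no
		// delimiters, so a leading '/' would be matched as a literal slash.
		re := regexp.MustCompile(expected[1:])
		// Match against the incoming query, not the expected pattern.
		return re.MatchString(query)
	}
	// Without the marker, fall back to an exact string comparison.
	return expected == query
}
```

With this sketch, `matchExpected("/select .* from t1", "select 1 from t1")` matches, while compiling the pattern with the leading `/` left in would require the incoming query to contain a literal slash.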
go/vt/vtctl/workflow/utils.go
Outdated
// containing a list of non-empty tables.
func validateEmptyTables(ctx context.Context, ts *topo.Server, tmc tmclient.TabletManagerClient, shards []*topo.ShardInfo, tableSettings []*vtctldatapb.TableMaterializeSettings) error {
var mu sync.Mutex
isTableFaulty := map[string]bool{}
Change terminology from "Faulty" to "NonEmpty" like `hasTableData` or `isNonEmpty`. Similarly `faultyTables` => `nonEmptyTables` or `tablesWithData`...
Done.
This is great, @beingnoble03 ! I only had minor nits and suggestions, can you huddle up with @rohit-nayak-ps and decide what changes you'd like to make from my comments? They are not correctness issues so I will leave it to the two of you to decide.
This is the only one that I really think we should probably do: #16874 (comment)
Thanks! ❤️
@@ -290,6 +290,15 @@ func (mz *materializer) deploySchema() error {
}
}

// Check if any table being moved is already non-empty in the target keyspace.
// Skip this check for multi-tenant migrations.
if !mz.IsMultiTenantMigration() {
A nice future addition would be to check for any existing tenant data in each table on the target too.
go/vt/vtctl/workflow/materializer.go
Outdated
// Check if any table being moved is already non-empty in the target keyspace.
// Skip this check for multi-tenant migrations.
if !mz.IsMultiTenantMigration() {
err := validateEmptyTables(mz.ctx, mz.ts, mz.tmc, mz.targetShards, mz.ms.TableSettings)
We're passing in a lot of mz members, which tells me it's probably better to have mz as the method receiver, no?
mz.validateEmptyTables()
I thought about this too, but went with the other approach anyway. I have now refactored the function to use the receiver, and it looks much cleaner!
Also, if the method receiver is `materializer`, it makes sense to have `validateEmptyTables` in `materializer.go`. So I moved it from `utils.go`.
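The shape of that refactor can be sketched as below. This is only an illustration under stated assumptions: the `materializer` struct here is a simplified stand-in with hypothetical fields (`targetShards`, `tables`, `rowCounts`), not the real Vitess type, and the emptiness check is reduced to a map lookup. It shows how a free function that takes several members of one struct as arguments collapses into a method with no arguments.

```go
package main

import "fmt"

// materializer is a simplified stand-in for the real type. The fields are
// hypothetical and only mirror the kind of state discussed in the review.
type materializer struct {
	targetShards []string
	tables       []string
	rowCounts    map[string]int // keyed by "shard/table"
}

// Before: a free function that needs every piece of state passed in.
func validateEmptyTables(shards, tables []string, rowCounts map[string]int) error {
	for _, shard := range shards {
		for _, table := range tables {
			if rowCounts[shard+"/"+table] > 0 {
				return fmt.Errorf("table %s on shard %s is not empty", table, shard)
			}
		}
	}
	return nil
}

// After: the same check as a method. The receiver already carries the state,
// so the call site shrinks to mz.validateEmptyTables().
func (mz *materializer) validateEmptyTables() error {
	return validateEmptyTables(mz.targetShards, mz.tables, mz.rowCounts)
}
```

In a real refactor the free function would be removed and its body moved into the method; both are kept here only so the two call shapes can be compared side by side.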
go/vt/vtctl/workflow/utils.go
Outdated
eg, groupCtx := errgroup.WithContext(ctx)
eg.SetLimit(20)

for _, ts := range tableSettings {
It might be worth adding the shard and table to the errors in this loop. There could be 1,000 shards, and you would want to know which specific shard had a table with data in it, and which shard and table we were processing when an error occurred.
can we do this in a follow-up PR?
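For reference, one possible shape for that follow-up is sketched below. It is not the PR's code: to stay within the standard library it swaps `errgroup` and `SetLimit` for a `sync.WaitGroup` plus a buffered-channel semaphore, and the names `rowChecker` and `findNonEmptyTables` are hypothetical. The point it illustrates is that every error and every non-empty result names both the shard and the table.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// rowChecker is a hypothetical callback that reports whether a table on a
// shard holds any rows. In the PR this would be a query sent to the shard.
type rowChecker func(shard, table string) (hasRows bool, err error)

// findNonEmptyTables checks every shard/table pair with at most limit
// concurrent checks (limit must be positive) and names the shard and table
// in every error it returns.
func findNonEmptyTables(shards, tables []string, limit int, check rowChecker) error {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		firstErr error
		nonEmpty []string
	)
	sem := make(chan struct{}, limit) // standard-library stand-in for SetLimit

	for _, shard := range shards {
		for _, table := range tables {
			wg.Add(1)
			sem <- struct{}{} // blocks while limit checks are already running
			go func(shard, table string) {
				defer wg.Done()
				defer func() { <-sem }()
				hasRows, err := check(shard, table)
				mu.Lock()
				defer mu.Unlock()
				if err != nil {
					if firstErr == nil {
						// Wrap with shard and table so the failure is traceable.
						firstErr = fmt.Errorf("checking table %s on shard %s: %w", table, shard, err)
					}
					return
				}
				if hasRows {
					nonEmpty = append(nonEmpty, shard+"/"+table)
				}
			}(shard, table)
		}
	}
	wg.Wait()

	if firstErr != nil {
		return firstErr
	}
	if len(nonEmpty) > 0 {
		sort.Strings(nonEmpty) // stable message regardless of goroutine order
		return fmt.Errorf("non-empty tables in target keyspace (shard/table): %v", nonEmpty)
	}
	return nil
}
```

Like `errgroup`, this sketch reports only the first error it records, so which failure surfaces is not deterministic when several checks fail at once.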
Description
This PR adds a validation check for empty tables in the target keyspace while creating workflows, in `deploySchema()`. Addresses all the comments in #16826.
Related Issue(s)