placement-rule: it can't transfer leader to the target store #7992

Closed
TonsnakeLin opened this issue Mar 27, 2024 · 12 comments · Fixed by #8010
Labels
affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. report/customer Customers have encountered this bug. severity/major type/bug The issue is confirmed as a bug.

Comments

@TonsnakeLin
Contributor

Bug Report

What did you do?

There are two data centers, dc1 and dc2. At first we forced the leaders onto dc2 with a placement rule, and then we switched the leaders to dc1 with another placement rule. In the end, however, two region leaders were left in dc2.

What did you expect to see?

All region leaders should have been transferred to dc1.

What did you see instead?

Two region leaders were still in dc2 in the end.

What version of PD are you using (pd-server -V)?

Release Version: v6.5.4-20231107-65de8fc

@TonsnakeLin TonsnakeLin added the type/bug The issue is confirmed as a bug. label Mar 27, 2024
@TonsnakeLin
Contributor Author

/assign @TonsnakeLin

@TonsnakeLin
Contributor Author

The root cause of this issue is described below.
The transfer-leader operator wasn't generated for the two regions, so their leaders were never switched to dc1.
Why wasn't the operator generated? Because a merge-region operator was still executing, and a region can only have one operator at a time.
Why didn't the merge operator finish? The merge-region operator for regions A and B was generated successfully on pd-server, but it failed on tikv-server because region A had been split and its key range had changed. Unfortunately, the timeout of this merge-region operator is 1h8m, so in the meantime it could neither finish nor be canceled.

There are two unreasonable points.

  1. The timeout value of the merge-region operator is too long.
  2. The merge-region operator should detect that the region meta changed when region A was split.
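To make the two points above concrete, here is a toy model of the situation. It is only a sketch and is not taken from PD's source or from #8010: the names `Operator`, `OperatorController`, `prune`, and `check_epoch` are hypothetical, and the region "version" is a simplified stand-in for the region meta that changes on a split. It shows how a one-operator-per-region rule plus a long timeout blocks the transfer-leader operator, and how a staleness check would free the region.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Operator:
    kind: str            # e.g. "merge-region" or "transfer-leader"
    region_id: int
    region_version: int  # region version seen when the operator was created
    created_at: float    # creation time, in seconds
    timeout: float       # seconds before the operator is dropped


class OperatorController:
    """Hypothetical controller: at most one running operator per region."""

    def __init__(self) -> None:
        self.running: Dict[int, Operator] = {}

    def add(self, op: Operator) -> bool:
        # Reject the new operator if the region already has one.
        if op.region_id in self.running:
            return False
        self.running[op.region_id] = op
        return True

    def prune(self, now: float, current_versions: Dict[int, int],
              check_epoch: bool) -> List[Operator]:
        """Remove operators that timed out or, optionally, went stale."""
        removed = []
        for region_id, op in list(self.running.items()):
            expired = now - op.created_at >= op.timeout
            current = current_versions.get(region_id, op.region_version)
            stale = check_epoch and current != op.region_version
            if expired or stale:
                removed.append(self.running.pop(region_id))
        return removed


if __name__ == "__main__":
    ctl = OperatorController()
    # Merge operator for region 1 with the 1h8m timeout (4080 seconds).
    merge = Operator("merge-region", 1, 1, 0.0, 4080.0)
    ctl.add(merge)

    # Region 1 is split afterwards, so its version moves from 1 to 2.
    versions = {1: 2}
    transfer = Operator("transfer-leader", 1, 2, 60.0, 600.0)

    # Timeout-only pruning keeps the merge operator and blocks the transfer.
    ctl.prune(60.0, versions, check_epoch=False)
    print("without meta check:", ctl.add(transfer))

    # With the meta check the stale merge is dropped and the transfer fits.
    ctl.prune(60.0, versions, check_epoch=True)
    print("with meta check:", ctl.add(transfer))
```

In this model, either a shorter timeout (point 1) or the meta check (point 2) would let the transfer-leader operator be scheduled sooner.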

@TonsnakeLin
Contributor Author

/remove-label may-affects-7.1

@TonsnakeLin
Contributor Author

/remove-label may-affects-5.4, may-affects-6.1, may-affects-7.5

@TonsnakeLin
Contributor Author

/remove-label may-affects-5.4

@TonsnakeLin
Contributor Author

/remove-label may-affects-6.1

@TonsnakeLin
Contributor Author

/remove-label may-affects-7.5

@TonsnakeLin
Contributor Author

/label affects-5.4

@ti-chi-bot ti-chi-bot bot added the affects-5.4 This bug affects the 5.4.x(LTS) versions. label Apr 1, 2024
@TonsnakeLin
Contributor Author

/label affects-6.1

@ti-chi-bot ti-chi-bot bot added the affects-6.1 This bug affects the 6.1.x(LTS) versions. label Apr 1, 2024
@TonsnakeLin
Contributor Author

/label affects-7.1

@ti-chi-bot ti-chi-bot bot added the affects-7.1 This bug affects the 7.1.x(LTS) versions. label Apr 1, 2024
@TonsnakeLin
Contributor Author

/label affects-7.5

@ti-chi-bot ti-chi-bot bot added the affects-7.5 This bug affects the 7.5.x(LTS) versions. label Apr 1, 2024
@ti-chi-bot ti-chi-bot bot closed this as completed in #8010 Apr 3, 2024
ti-chi-bot bot added a commit that referenced this issue Apr 3, 2024
#8010)

close #7992

Signed-off-by: TonsnakeLin <[email protected]>

Co-authored-by: TonsnakeLin <[email protected]>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this issue Apr 15, 2024
ti-chi-bot bot pushed a commit that referenced this issue Apr 17, 2024
ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this issue May 13, 2024
TonsnakeLin added a commit to TonsnakeLin/pd that referenced this issue May 15, 2024
ti-chi-bot bot pushed a commit that referenced this issue May 17, 2024
ti-chi-bot bot pushed a commit that referenced this issue May 20, 2024
@seiya-annie

/found customer

@ti-chi-bot ti-chi-bot bot added the report/customer Customers have encountered this bug. label Jun 4, 2024