pdms: Choose a suitable pdms to transfer primary when upgrade (#5643) #5709

Merged

ti-chi-bot

This is an automated cherry-pick of #5643

What problem does this PR solve?

Ref #1235, Ref tikv/pd#8157

What is changed and how does it work?

Summary

Assume there are three scheduling nodes: scheduling-0, scheduling-1, and scheduling-2.
tidb-operator upgrades them in descending ordinal order: 2 -> 1 -> 0.
If scheduling-1 is the primary, upgrading scheduling-1 may transfer the primary to scheduling-0, and the primary must then be transferred again when scheduling-0 is upgraded.

  • This PR ensures that when scheduling-1 is upgraded, the primary is transferred to scheduling-2, which has already been upgraded, reducing the number of transfers.
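
The selection rule described above can be sketched in Go as follows. This is a minimal illustration, not the operator's actual code: `chooseTransferTarget`, `countTransfers`, and the `healthy` slice are hypothetical names, and the assumption that a member must be healthy to receive the primary is mine. The fallback to the lowest ordinal when no higher ordinal exists is inferred from the logs in this PR, where a primary on scheduling-2 is moved to scheduling-0.

```go
package main

import "fmt"

// chooseTransferTarget returns the ordinal that should take over the primary
// role before the pod at ordinal `primary` is restarted.
// healthy[i] reports whether member i can accept the primary (an assumption
// of this sketch; the operator's real eligibility check is not shown here).
func chooseTransferTarget(healthy []bool, primary int) (int, bool) {
	// Pods are upgraded from the highest ordinal down, so every ordinal above
	// the primary has already been upgraded: prefer the highest of those.
	for i := len(healthy) - 1; i > primary; i-- {
		if healthy[i] {
			return i, true
		}
	}
	// Otherwise fall back to the lowest ordinal, which is upgraded last.
	for i := 0; i < primary; i++ {
		if healthy[i] {
			return i, true
		}
	}
	return -1, false
}

// countTransfers simulates a rolling upgrade (highest ordinal first) and
// returns how many primary transfers it triggers.
func countTransfers(replicas, primary int) int {
	healthy := make([]bool, replicas)
	for i := range healthy {
		healthy[i] = true
	}
	transfers := 0
	for ordinal := replicas - 1; ordinal >= 0; ordinal-- {
		if ordinal != primary {
			continue // not the primary: the pod is upgraded directly
		}
		if target, ok := chooseTransferTarget(healthy, primary); ok {
			primary = target
			transfers++
		}
	}
	return transfers
}

func main() {
	for p := 0; p < 3; p++ {
		fmt.Printf("initial primary scheduling-%d: %d transfer(s)\n", p, countTransfers(3, p))
	}
}
```

Under these assumptions, a primary that starts on scheduling-1 is moved once (to scheduling-2), whereas moving it to scheduling-0 first would force a second transfer later in the same upgrade.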

Using API

With 3 scheduling pods created on PD v8.3.0:

$ kubectl exec -it basic-pd-0 -n pingcap sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
sh-5.1# curl --location --request GET 'http://127.0.0.1:2379/pd/api/v2/ms/members/scheduling'
[
    {
        "name": "basic-scheduling-0",
        "service-addr": "http://basic-scheduling-0.basic-scheduling-peer.pingcap.svc:2379",
        "version": "v8.3.0",
        "git-hash": "2d9a3b0e5da1a8e50251c4510368e5b3085394c7",
        "deploy-path": "/",
        "start-timestamp": 1723535895
    },
    {
        "name": "basic-scheduling-1",
        "service-addr": "http://basic-scheduling-1.basic-scheduling-peer.pingcap.svc:2379",
        "version": "v8.3.0",
        "git-hash": "2d9a3b0e5da1a8e50251c4510368e5b3085394c7",
        "deploy-path": "/",
        "start-timestamp": 1723535883
    },
    {
        "name": "basic-scheduling-2",
        "service-addr": "http://basic-scheduling-2.basic-scheduling-peer.pingcap.svc:2379",
        "version": "v8.3.0",
        "git-hash": "2d9a3b0e5da1a8e50251c4510368e5b3085394c7",
        "deploy-path": "/",
        "start-timestamp": 1723535831
    }
]

// get the current primary, which is `scheduling-1`
sh-5.1# curl --location --request GET 'http://127.0.0.1:2379/pd/api/v2/ms/primary/scheduling'
"http://basic-scheduling-1.basic-scheduling-peer.pingcap.svc:2379"

// log in to the `scheduling-1` pod,
// then transfer the primary to `scheduling-2`
$ kubectl exec -it basic-scheduling-1 -n pingcap sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
sh-5.1# curl --location --request POST 'http://127.0.0.1:2379/scheduling/api/v1/primary/transfer' \
--header 'Content-Type: application/json' \
--data-raw '{
    "new_primary": "basic-scheduling-2"
}'
"success"

// get the current primary, which is now `scheduling-2`
$ kubectl exec -it basic-pd-0 -n pingcap sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
sh-5.1# curl --location --request GET 'http://127.0.0.1:2379/pd/api/v2/ms/primary/scheduling'
"http://basic-scheduling-2.basic-scheduling-peer.pingcap.svc:2379"
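
The transfer call in the session above can also be issued from Go. The sketch below only builds the request; the endpoint path and the `new_primary` field come from the curl command, while `newTransferRequest` is a hypothetical helper and the idea that the service name can be substituted into the path for other microservices is an assumption (this PR only shows the `scheduling` path).

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newTransferRequest builds the POST request that asks a PD microservice
// to hand its primary role to newPrimary.
// Only the "scheduling" path appears in the PR; other service names are
// an assumption of this sketch.
func newTransferRequest(baseURL, service, newPrimary string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"new_primary": newPrimary})
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("%s/%s/api/v1/primary/transfer", baseURL, service)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newTransferRequest("http://127.0.0.1:2379", "scheduling", "basic-scheduling-2")
	if err != nil {
		panic(err)
	}
	// Sending it would be http.DefaultClient.Do(req), run from inside the
	// current primary's pod as in the session above.
	fmt.Println(req.Method, req.URL.String())
}
```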

Check log

Now upgrade the 3 scheduling pods. The primary is currently scheduling-2.

// `scheduling-2` is the primary and is upgraded first, so the primary should be transferred to `scheduling-0`
I0813 07:57:08.289779       1 pd_ms_upgrader.go:144] TidbCluster: [pingcap/basic]' pdms upgrader: check primary: http://basic-scheduling-2.basic-scheduling-peer.pingcap.svc:2379, upgradePDMSName: basic-scheduling-2, upgradePodName: basic-scheduling-2
I0813 07:57:08.289801       1 pd_ms_upgrader.go:180] Tidbcluster: [pingcap/basic]' pdms upgrader: start to choose pdms to transfer primary from members
I0813 07:57:08.289815       1 pd_ms_upgrader.go:205] Tidbcluster: [pingcap/basic]' pdms upgrader: choose pdms to transfer primary from members, targetName: basic-scheduling-0
I0813 07:57:08.289820       1 pd_ms_upgrader.go:155] TidbCluster: [pingcap/basic]' pdms upgrader: transfer pdms primary to: basic-scheduling-0
E0813 07:57:08.289834       1 pdms_api.go:67] only support TSO service, but got scheduling
I0813 07:57:08.296517       1 pd_ms_upgrader.go:161] TidbCluster: [pingcap/basic]' pdms upgrader: transfer pdms primary to: basic-scheduling-0 successfully


// `scheduling-1` is upgraded directly, because the primary is `scheduling-0`
I0813 07:57:57.924827       1 pd_ms_upgrader.go:144] TidbCluster: [pingcap/basic]' pdms upgrader: check primary: http://basic-scheduling-0.basic-scheduling-peer.pingcap.svc:2379, upgradePDMSName: basic-scheduling-1, upgradePodName: basic-scheduling-1

// when upgrading `scheduling-0`, the primary should be transferred to `scheduling-2`, because `scheduling-0` is the primary now
I0813 07:58:05.912621       1 statefulset.go:182] set pingcap/basic-scheduling partition to 1
I0813 07:58:05.912978       1 pd_ms_upgrader.go:144] TidbCluster: [pingcap/basic]' pdms upgrader: check primary: http://basic-scheduling-0.basic-scheduling-peer.pingcap.svc:2379, upgradePDMSName: basic-scheduling-0, upgradePodName: basic-scheduling-0
I0813 07:58:05.912998       1 pd_ms_upgrader.go:180] Tidbcluster: [pingcap/basic]' pdms upgrader: start to choose pdms to transfer primary from members
I0813 07:58:05.913011       1 pd_ms_upgrader.go:205] Tidbcluster: [pingcap/basic]' pdms upgrader: choose pdms to transfer primary from members, targetName: basic-scheduling-2
I0813 07:58:05.913022       1 pd_ms_upgrader.go:155] TidbCluster: [pingcap/basic]' pdms upgrader: transfer pdms primary to: basic-scheduling-2
E0813 07:58:05.913040       1 pdms_api.go:67] only support TSO service, but got scheduling
I0813 07:58:05.919682       1 pd_ms_upgrader.go:161] TidbCluster: [pingcap/basic]' pdms upgrader: transfer pdms primary to: basic-scheduling-2 successfully

Code changes

  • Has Go code change

Tests

  • Unit test
  • Manual test
  • No code

Side effects

  • Breaking backward compatibility
  • Other side effects:

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Release Notes

Please refer to Release Notes Language Style Guide before writing the release note.


Signed-off-by: husharp <[email protected]>
@ti-chi-bot ti-chi-bot bot requested a review from shonge August 14, 2024 02:42
@csuzhangxc csuzhangxc merged commit 15de814 into pingcap:release-1.6 Aug 14, 2024
5 of 6 checks passed