Scheduling service takes more than 5 minutes to recover after a network partition is injected on the scheduling primary #7854

Lily2025 · 2024-02-28T05:31:06Z

Enhancement

What did you do?

1. Run a workload.
2. Inject a network partition between the scheduling primary and all other pods.

What did you expect to see?

The scheduling service recovers in less than 5 minutes after a network partition is injected on the scheduling primary.

What did you see instead?

The scheduling service takes more than 5 minutes to recover after a network partition is injected on the scheduling primary.

What version of PD are you using (`pd-server -V`)?

./pd-server -V
Release Version: v8.0.0-alpha
Edition: Community
Git Commit Hash: e199866
Git Branch: heads/refs/tags/v8.0.0-alpha
UTC Build Time: 2024-02-26 11:38:17
2024-02-28T11:55:27.776+0800


Lily2025 · 2024-02-28T05:32:05Z

/assign rleungx

rleungx · 2024-02-28T07:42:29Z

The recovery time depends on the hibernate region tick interval because, currently, a switch of the scheduling primary does not wake all regions. As a result, the prepare checker cannot receive heartbeats from all regions in time.
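
To illustrate the explanation above, here is a minimal Go sketch of the timing effect. It is a toy model and not PD's actual prepare checker: the `region` type, `makeRegions`, `timeToPrepared`, the 90% threshold, and the 10 s / 300 s tick values are all hypothetical and chosen only for illustration. The model assumes each region sends its first heartbeat to the new primary one full tick after the switch, and that the checker is satisfied once a given percentage of regions has reported.

```go
package main

import (
	"fmt"
	"sort"
)

// region is a hypothetical, simplified stand-in for a TiKV region.
type region struct {
	id         int
	hibernated bool
}

// makeRegions builds `total` regions, the first `hibernated` of which
// are marked as hibernated.
func makeRegions(total, hibernated int) []region {
	regions := make([]region, total)
	for i := range regions {
		regions[i] = region{id: i, hibernated: i < hibernated}
	}
	return regions
}

// timeToPrepared returns the number of seconds after a primary switch
// until at least thresholdPercent of the regions have sent a heartbeat.
// Assumption: an awake region reports after activeTick seconds, a
// hibernated region after hibernateTick seconds. If wakeAll is true,
// the switch wakes every region, so all of them report after activeTick.
func timeToPrepared(regions []region, activeTick, hibernateTick, thresholdPercent int, wakeAll bool) int {
	if len(regions) == 0 {
		return 0
	}
	arrivals := make([]int, 0, len(regions))
	for _, r := range regions {
		if r.hibernated && !wakeAll {
			arrivals = append(arrivals, hibernateTick)
		} else {
			arrivals = append(arrivals, activeTick)
		}
	}
	sort.Ints(arrivals)

	// Number of heartbeats required, rounded up.
	need := (len(regions)*thresholdPercent + 99) / 100
	if need <= 0 {
		return 0
	}
	if need > len(arrivals) {
		need = len(arrivals)
	}
	// The checker is satisfied when the need-th heartbeat arrives.
	return arrivals[need-1]
}

func main() {
	regions := makeRegions(100, 80) // 80 of 100 regions are hibernated
	fmt.Println("without wake-up:", timeToPrepared(regions, 10, 300, 90, false), "s")
	fmt.Println("with wake-up:   ", timeToPrepared(regions, 10, 300, 90, true), "s")
}
```

In this model, when most regions are hibernated, the wait is dominated by the hibernate tick rather than by the normal heartbeat interval, which matches the behavior described in the comment. Waking all regions on a primary switch would remove that dependency.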

Lily2025 added the type/bug label (The issue is confirmed as a bug.) Feb 28, 2024

ti-chi-bot bot assigned rleungx Feb 28, 2024

rleungx added the type/enhancement label (The issue or PR belongs to an enhancement.) and removed the type/bug label (The issue is confirmed as a bug.) Feb 28, 2024

rleungx assigned lhy1024 and unassigned rleungx Mar 5, 2024

github-project-automation bot added this to Questions and Bug Reports Aug 29, 2024

github-project-automation bot moved this to Need Triage in Questions and Bug Reports Aug 29, 2024
