agent/goalcu: implement ZoneLoadAverage #1148

Merged
merged 7 commits into from
Nov 27, 2024

Conversation

Omrigan
Contributor

@Omrigan Omrigan commented Nov 18, 2024

ZoneLoadAverage compares abs(load1 - load5) against two thresholds to determine which of three zones the compute is in:

  • "stable zone", where load5 is used to inform the target CU value
  • "scaling zone", where load1 is used to inform the target CU value
  • "mixed zone", where load1 and load5 are blended to get an intermediate value

The thresholds are computed as percentages of load5. The percentages are set by two new config values: cpuStableZoneRatio and cpuMixedZoneRatio.

For example, with cpuStableZoneRatio = 0.25, cpuMixedZoneRatio = 0.25 and load5 = 8, we'd get:

  • stable zone when abs(load5 - load1) < 2 <=> 6 < load1 < 10
  • mixed zone when 2 < abs(load5 - load1) < 4 <=> 4 < load1 < 6 or 10 < load1 < 12
  • scaling zone otherwise
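
The zone selection above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `zoneLoadAverage`, the `Zone` type, and the parameter names are hypothetical. Two details are assumptions, since the description does not spell them out: the upper bound of the mixed zone is taken to be (cpuStableZoneRatio + cpuMixedZoneRatio) * load5, which matches the worked example (2 and 4 for load5 = 8), and the blend in the mixed zone is assumed to be a linear interpolation between load5 and load1.

```go
package main

import "math"

// Zone names follow the PR description; the type itself is hypothetical.
type Zone string

const (
	StableZone  Zone = "stable"
	MixedZone   Zone = "mixed"
	ScalingZone Zone = "scaling"
)

// zoneLoadAverage is a hypothetical helper that picks a zone and returns the
// load value used to inform the target CU.
//
// Assumptions (not confirmed by the PR text):
//   - the mixed zone ends at (stableRatio + mixedRatio) * load5
//   - the mixed zone blends load5 and load1 linearly
//   - a difference exactly equal to a threshold falls into the higher zone
func zoneLoadAverage(load1, load5, stableRatio, mixedRatio float64) (Zone, float64) {
	diff := math.Abs(load1 - load5)
	stableThreshold := stableRatio * load5
	mixedThreshold := (stableRatio + mixedRatio) * load5

	switch {
	case diff < stableThreshold:
		// Stable zone: load1 is close to load5, so use the smoother load5.
		return StableZone, load5
	case diff < mixedThreshold:
		// Mixed zone: w goes from 0 at the stable threshold to 1 at the
		// mixed threshold, shifting weight from load5 to load1.
		w := (diff - stableThreshold) / (mixedThreshold - stableThreshold)
		return MixedZone, (1-w)*load5 + w*load1
	default:
		// Scaling zone: load is changing quickly, so follow load1.
		return ScalingZone, load1
	}
}
```

With load5 = 8 and both ratios set to 0.25, load1 = 9 stays in the stable zone and yields 8, load1 = 13 lands in the scaling zone and yields 13, and load1 = 11 sits halfway through the mixed zone, so the assumed linear blend gives 9.5.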

A similar algorithm is described in https://www.notion.so/neondatabase/RFC-SwitchLoadAverage-13ef189e0047804f9297e8772739508e?pvs=4

Fixes #729


github-actions bot commented Nov 18, 2024

No changes to the coverage.

@Omrigan Omrigan changed the title Implement ZoneLoadAverage agent/goalcu: implement ZoneLoadAverage Nov 18, 2024
Signed-off-by: Oleg Vasilev <[email protected]>
Signed-off-by: Oleg Vasilev <[email protected]>
Signed-off-by: Oleg Vasilev <[email protected]>
@Omrigan Omrigan marked this pull request as ready for review November 25, 2024 21:01
Member

@sharnoff sharnoff left a comment

Looks great, awesome work on this 🚀

Left a few small comments; feel free to merge once addressed.

pkg/agent/core/goalcu.go (outdated, resolved)
pkg/agent/core/goalcu.go (resolved)
pkg/agent/core/goalcu.go (resolved)
autoscaler-agent/config_map.yaml (resolved)
Omrigan and others added 2 commits November 26, 2024 13:17
Signed-off-by: Oleg Vasilev <[email protected]>
@Omrigan Omrigan enabled auto-merge (squash) November 26, 2024 17:46
@Omrigan Omrigan merged commit cf514ed into main Nov 27, 2024
22 checks passed
@Omrigan Omrigan deleted the oleg/zone-la branch November 27, 2024 11:18
Development

Successfully merging this pull request may close these issues.

Bug: autoscaler-agent scaling algorithm is too volatile for larger computes
2 participants