agent: don't treat downscale denies as failed request (#927)
Fixes #926, a follow-up to #770.

The definition of VM stuckness was changed to include denied
downscale requests in the following commit:

    commit fdf0133
    Author: Shayan Hosseini <[email protected]>
    Date:   Sat Apr 6 09:25:01 2024 -0400

    agent: track more liveness in vm-stuck metrics (#855)

This caused the alert to fire consistently.

We should actually treat a denied downscale as part of normal
operation. It can happen when autoscaler-agent and vm_monitor have
mismatching policies for what counts as an acceptable level of memory usage.

Signed-off-by: Oleg Vasilev <[email protected]>
Omrigan authored May 7, 2024
1 parent 56a38a4 commit 669bb52
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions pkg/agent/execbridge.go
@@ -169,8 +169,10 @@ func (h *execMonitorHandle) Downscale(
 
 	result, err := doMonitorDownscale(ctx, logger, h.monitor.dispatcher, target)
 
-	if err == nil && result.Ok {
-		h.runner.recordResourceChange(current, target, h.runner.global.metrics.monitorApprovedChange)
+	if err == nil {
+		if result.Ok {
+			h.runner.recordResourceChange(current, target, h.runner.global.metrics.monitorApprovedChange)
+		}
 	} else {
 		h.runner.status.update(h.runner.global, func(ps podStatus) podStatus {
 			ps.failedMonitorRequestCounter.Inc()
