correct where to increment the clusterNotFound count and adjust quarantine log level #950

bohhyang · 2023-12-06T21:34:43Z

Summary
Currently, the clusterNotFound count is incremented both when the cluster config is not found and when the cluster has no URIs. This change moves the increment so that it happens only when the cluster config is not found.
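
For illustration, the distinction between the two cases can be sketched as below. This is a minimal sketch, not the code in this PR: the class `ClusterLookupSketch`, the `Result` enum, and the map-based lookup are all hypothetical names, and a missing map entry stands in for "cluster config not found".

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; none of these names come from the actual change. */
class ClusterLookupSketch {
    enum Result { OK, CLUSTER_NOT_FOUND, NO_URIS }

    // Counter that should reflect only "cluster config not found".
    final AtomicLong clusterNotFoundCount = new AtomicLong();

    // Assumption: a missing key models a cluster whose config was not found;
    // an empty list models a known cluster that currently has no URIs.
    private final Map<String, List<String>> urisByCluster;

    ClusterLookupSketch(Map<String, List<String>> urisByCluster) {
        this.urisByCluster = urisByCluster;
    }

    Result lookup(String cluster) {
        List<String> uris = urisByCluster.get(cluster);
        if (uris == null) {
            // Increment here only: the config itself is missing.
            clusterNotFoundCount.incrementAndGet();
            return Result.CLUSTER_NOT_FOUND;
        }
        if (uris.isEmpty()) {
            // Config exists but has no URIs: a different condition, not counted.
            return Result.NO_URIS;
        }
        return Result.OK;
    }

    public static void main(String[] args) {
        ClusterLookupSketch sketch = new ClusterLookupSketch(
            Map.of("withUris", List.of("http://host:1234"), "empty", List.of()));
        System.out.println(sketch.lookup("withUris"));
        System.out.println(sketch.lookup("empty"));
        System.out.println(sketch.lookup("missing"));
        System.out.println("clusterNotFoundCount=" + sketch.clusterNotFoundCount.get());
    }
}
```

Keeping the increment inside the "config missing" branch means the metric no longer conflates an unknown cluster with a known cluster that temporarily has no URIs.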

This change also lowers the log level for long-quarantined hosts from Error to Info, because 1) the condition is often just a small, temporary reduction in capacity that does not cause an availability issue, and 2) it floods the upstream app's log, triggering alerts and/or creating EXC tickets that the upstream app owner is not responsible for. The log message is also clarified to show that the issue is the downstream host's health.
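
As a rough sketch of the log-level adjustment, assuming `java.util.logging` only to keep the example self-contained (the project's real logging framework, class names, and message text are not shown in this PR description, so `QuarantineLogSketch`, `message`, and `report` are hypothetical):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch; the names and message wording are assumptions. */
class QuarantineLogSketch {
    // Info rather than an error-level severity, so the upstream app's
    // error-based alerting is not triggered by a downstream host's health.
    static final Level LONG_QUARANTINE_LEVEL = Level.INFO;

    // The message names the downstream host explicitly to make ownership clear.
    static String message(String hostUri, long quarantinedMs) {
        return "Downstream host " + hostUri
            + " is failing health checks and has been quarantined for "
            + quarantinedMs + " ms";
    }

    static void report(Logger log, String hostUri, long quarantinedMs) {
        log.log(LONG_QUARANTINE_LEVEL, message(hostUri, quarantinedMs));
    }

    public static void main(String[] args) {
        report(Logger.getLogger("quarantine-sketch"), "http://host:1234", 60_000L);
    }
}
```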

Test done
build and test


brycezhongqing

LGTM

correct where to increment the clusterNotFound count and adjust quarantine log level

7752d14

bohhyang requested review from nizarm, PapaCharlie, shivamgupta1 and brycezhongqing December 6, 2023 21:35

brycezhongqing approved these changes Dec 6, 2023


bohhyang merged commit e949cf1 into master Dec 7, 2023
2 checks passed

bohhyang deleted the bohan/clusterMetric branch December 7, 2023 17:17
