chore(charts/app): remove some bad alerts (#254)
Co-authored-by: Matt Wise <[email protected]>
diranged and diranged authored Dec 15, 2023
1 parent ed34754 commit 733b23d
Showing 12 changed files with 6 additions and 141 deletions.
2 changes: 1 addition & 1 deletion charts/prometheus-alerts/Chart.yaml
@@ -2,7 +2,7 @@ apiVersion: v2
name: prometheus-alerts
description: Helm Chart that provisions a series of common Prometheus Alerts
type: application
-version: 1.3.1
+version: 1.4.0
appVersion: 0.0.1
maintainers:
- name: diranged
4 changes: 1 addition & 3 deletions charts/prometheus-alerts/README.md
@@ -3,7 +3,7 @@

Helm Chart that provisions a series of common Prometheus Alerts

-![Version: 1.3.1](https://img.shields.io/badge/Version-1.3.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 0.0.1](https://img.shields.io/badge/AppVersion-0.0.1-informational?style=flat-square)
+![Version: 1.4.0](https://img.shields.io/badge/Version-1.4.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 0.0.1](https://img.shields.io/badge/AppVersion-0.0.1-informational?style=flat-square)

[deployments]: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
[hpa]: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
@@ -78,8 +78,6 @@ This behavior can be tuned via the `defaults.podNameSelector`,
| containerRules.KubeDaemonSetRolloutStuck.for | string | `"15m"` | |
| containerRules.KubeDaemonSetRolloutStuck.severity | string | `"warning"` | |
| containerRules.KubeDeploymentGenerationMismatch | object | `{"for":"15m","severity":"warning"}` | Deployment generation mismatch due to possible roll-back |
-| containerRules.KubeDeploymentReplicasMismatch.for | string | `"15m"` | |
-| containerRules.KubeDeploymentReplicasMismatch.severity | string | `"warning"` | |
| containerRules.KubeHpaMaxedOut.for | string | `"15m"` | |
| containerRules.KubeHpaMaxedOut.severity | string | `"warning"` | |
| containerRules.KubeHpaReplicasMismatch.for | string | `"15m"` | |
27 changes: 0 additions & 27 deletions charts/prometheus-alerts/templates/containers-prometheusrule.yaml
@@ -215,33 +215,6 @@ spec:
{{- end }}
{{- end }}

-{{ with .Values.containerRules.KubeDeploymentReplicasMismatch -}}
-- alert: KubeDeploymentReplicasMismatch
-annotations:
-summary: Deployment has not matched the expected number of replicas.
-runbook_url: {{ $values.defaults.runbookUrl }}#alert-name-kubedeploymentreplicasmismatch
-description: >-
-Deployment {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}}
-$labels.deployment {{`}}`}} has not matched the expected number of
-replicas for longer than {{ .for }}.
-expr: |-
-(
-kube_deployment_spec_replicas{ {{- $deploymentSelector -}} }
-!=
-kube_deployment_status_replicas_available{ {{- $deploymentSelector -}} }
-) and (
-changes(kube_deployment_status_replicas_updated{ {{- $deploymentSelector -}} }[5m])
-==
-0
-)
-for: {{ .for }}
-labels:
-severity: {{ .severity }}
-{{- if $values.defaults.additionalRuleLabels }}
-{{ toYaml $values.defaults.additionalRuleLabels | nindent 8 }}
-{{- end }}
-{{- end }}
-
{{ with .Values.containerRules.KubeStatefulSetReplicasMismatch -}}
- alert: KubeStatefulSetReplicasMismatch
annotations:
5 changes: 0 additions & 5 deletions charts/prometheus-alerts/values.yaml
@@ -170,11 +170,6 @@ containerRules:
severity: warning
for: 15m

-# Deployment has not matched the expected number of replicas
-KubeDeploymentReplicasMismatch:
-severity: warning
-for: 15m
-
# Deployment has not matched the expected number of replicas
KubeStatefulSetReplicasMismatch:
severity: warning
2 changes: 1 addition & 1 deletion charts/rollout-app/Chart.yaml
@@ -2,7 +2,7 @@ apiVersion: v2
name: rollout-app
description: Argo Rollout-based Application Helm Chart
type: application
-version: 0.6.0
+version: 0.7.0
appVersion: latest
maintainers:
- name: diranged
4 changes: 1 addition & 3 deletions charts/rollout-app/README.md
@@ -2,7 +2,7 @@

Argo Rollout-based Application Helm Chart

-![Version: 0.6.0](https://img.shields.io/badge/Version-0.6.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: latest](https://img.shields.io/badge/AppVersion-latest-informational?style=flat-square)
+![Version: 0.7.0](https://img.shields.io/badge/Version-0.7.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: latest](https://img.shields.io/badge/AppVersion-latest-informational?style=flat-square)

[analysistemplate]: https://argoproj.github.io/argo-rollouts/features/analysis/?query=AnalysisTemplate#background-analysis
[argo_rollouts]: https://argoproj.github.io/argo-rollouts/
@@ -271,8 +271,6 @@ kmsSecretsRegion: us-west-2 (AWS region where the KMS key is located)
| progressDeadlineSeconds | string | `nil` | https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#progress-deadline-seconds |
| prometheusRules.CPUThrottlingHigh | object | `{"for":"15m","severity":"warning","threshold":5}` | Container is being throttled by the CGroup - needs more resources. This value is appropriate for applications that are highly sensitive to request latency. Insensitive workloads might need to raise this percentage to avoid alert noise. |
| prometheusRules.ContainerWaiting | object | `{"for":"1h","severity":"warning"}` | Pod container waiting longer than threshold |
-| prometheusRules.DeploymentGenerationMismatch | object | `{"for":"15m","severity":"warning"}` | Deployment generation mismatch due to possible roll-back |
-| prometheusRules.DeploymentReplicasMismatch | object | `{"for":"15m","severity":"warning"}` | Deployment has not matched the expected number of replicas |
| prometheusRules.HpaMaxedOut | object | `{"for":"15m","severity":"warning"}` | HPA is running at max replicas |
| prometheusRules.HpaReplicasMismatch | object | `{"for":"15m","severity":"warning"}` | HPA has not matched descired number of replicas |
| prometheusRules.PodContainerTerminated | object | `{"for":"1m","over":"10m","reasons":["ContainerCannotRun","DeadlineExceeded"],"severity":"warning","threshold":0}` | Monitors Pods for Containers that are terminated either for unexpected reasons like ContainerCannotRun. If that number breaches the $threshold (1) for $for (1m), then it will alert. |
53 changes: 0 additions & 53 deletions charts/rollout-app/templates/prometheusrules.yaml
@@ -163,59 +163,6 @@ spec:
- name: {{ .Release.Namespace }}.{{ .Release.Name }}.{{ .Chart.Name }}.DeploymentRules
rules:

-{{- with .Values.prometheusRules.DeploymentGenerationMismatch }}
-- alert: DeploymentGenerationMismatch
-annotations:
-summary: >-
-{{`{{ $labels.deployment }}`}} deployment generation mismatch due to possible roll-back
-runbook_url: {{ $runbookUrl }}#deploymentgenerationmismatch
-description: >-
-Deployment generation for {{`{{ $labels.namespace }}`}}/{{`{{ $labels.deployment }}`}}
-does not match, this indicates that the Deployment has failed but has not been rolled
-back.
-expr: |-
-sum by(namespace, deployment) (
-kube_deployment_status_observed_generation{job="kube-state-metrics", namespace=~"{{ $targetNamespace }}", deployment="{{ $appName }}"}
-!=
-kube_deployment_metadata_generation{job="kube-state-metrics", namespace=~"{{ $targetNamespace }}", deployment="{{ $appName }}"}
-)
-for: {{ .for }}
-labels:
-severity: {{ .severity }}
-{{- with $values.prometheusRules.additionalRuleLabels }}
-{{ toYaml . | nindent 8 }}
-{{- end }}
-{{- end }}
-
-{{- with .Values.prometheusRules.DeploymentReplicasMismatch }}
-- alert: DeploymentReplicasMismatch
-annotations:
-summary: >-
-{{`{{ $labels.deployment }}`}} deployment has not matched the expected number of replicas.
-runbook_url: {{ $runbookUrl }}#deploymentreplicasmismatch
-description: >-
-Deployment {{`{{ $labels.namespace }}`}}/{{`{{ $labels.deployment }}`}}
-has not matched the expected number of replicas for longer than {{ .for }}.
-expr: |-
-sum by(namespace, deployment) (
-(
-kube_deployment_spec_replicas{job="kube-state-metrics", namespace=~"{{ $targetNamespace }}", deployment="{{ $appName }}"}
-!=
-kube_deployment_status_replicas_available{job="kube-state-metrics", namespace=~"{{ $targetNamespace }}", deployment="{{ $appName }}"}
-) and (
-changes(kube_deployment_status_replicas_updated{job="kube-state-metrics", namespace=~"{{ $targetNamespace }}", deployment="{{ $appName }}"}[5m])
-==
-0
-)
-)
-for: {{ .for }}
-labels:
-severity: {{ .severity }}
-{{- with $values.prometheusRules.additionalRuleLabels }}
-{{ toYaml . | nindent 8 }}
-{{- end }}
-{{- end }}
-
{{- if .Values.autoscaling.enabled }}
- name: {{ .Release.Namespace }}.{{ .Release.Name }}.{{ .Chart.Name }}.HorizontalPodAutoscalerRules
rules:
10 changes: 0 additions & 10 deletions charts/rollout-app/values.yaml
@@ -914,16 +914,6 @@ prometheusRules:
severity: warning
for: 1h

-# -- Deployment generation mismatch due to possible roll-back
-DeploymentGenerationMismatch:
-severity: warning
-for: 15m
-
-# -- Deployment has not matched the expected number of replicas
-DeploymentReplicasMismatch:
-severity: warning
-for: 15m
-
# -- HPA has not matched descired number of replicas
HpaReplicasMismatch:
severity: warning
2 changes: 1 addition & 1 deletion charts/simple-app/Chart.yaml
@@ -2,7 +2,7 @@ apiVersion: v2
name: simple-app
description: Default Microservice Helm Chart
type: application
-version: 1.5.0
+version: 1.6.0
appVersion: latest
maintainers:
- name: diranged
3 changes: 1 addition & 2 deletions charts/simple-app/README.md
@@ -2,7 +2,7 @@

Default Microservice Helm Chart

-![Version: 1.5.0](https://img.shields.io/badge/Version-1.5.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: latest](https://img.shields.io/badge/AppVersion-latest-informational?style=flat-square)
+![Version: 1.6.0](https://img.shields.io/badge/Version-1.6.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: latest](https://img.shields.io/badge/AppVersion-latest-informational?style=flat-square)

[deployments]: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
[hpa]: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
@@ -402,7 +402,6 @@ kmsSecretsRegion: us-west-2 (AWS region where the KMS key is located)
| prometheusRules.CPUThrottlingHigh | object | `{"for":"15m","severity":"warning","threshold":5}` | Container is being throttled by the CGroup - needs more resources. This value is appropriate for applications that are highly sensitive to request latency. Insensitive workloads might need to raise this percentage to avoid alert noise. |
| prometheusRules.ContainerWaiting | object | `{"for":"1h","severity":"warning"}` | Pod container waiting longer than threshold |
| prometheusRules.DeploymentGenerationMismatch | object | `{"for":"15m","severity":"warning"}` | Deployment generation mismatch due to possible roll-back |
-| prometheusRules.DeploymentReplicasMismatch | object | `{"for":"15m","severity":"warning"}` | Deployment has not matched the expected number of replicas |
| prometheusRules.HpaMaxedOut | object | `{"for":"15m","severity":"warning"}` | HPA is running at max replicas |
| prometheusRules.HpaReplicasMismatch | object | `{"for":"15m","severity":"warning"}` | HPA has not matched descired number of replicas |
| prometheusRules.PodContainerTerminated | object | `{"for":"1m","over":"10m","reasons":["ContainerCannotRun","DeadlineExceeded"],"severity":"warning","threshold":0}` | Monitors Pods for Containers that are terminated either for unexpected reasons like ContainerCannotRun. If that number breaches the $threshold (1) for $for (1m), then it will alert. |
30 changes: 0 additions & 30 deletions charts/simple-app/templates/prometheusrules.yaml
@@ -187,35 +187,6 @@ spec:
{{- end }}
{{- end }}

-{{- with .Values.prometheusRules.DeploymentReplicasMismatch }}
-- alert: DeploymentReplicasMismatch
-annotations:
-summary: >-
-{{`{{ $labels.deployment }}`}} deployment has not matched the expected number of replicas.
-runbook_url: {{ $runbookUrl }}#deploymentreplicasmismatch
-description: >-
-Deployment {{`{{ $labels.namespace }}`}}/{{`{{ $labels.deployment }}`}}
-has not matched the expected number of replicas for longer than {{ .for }}.
-expr: |-
-sum by(namespace, deployment) (
-(
-kube_deployment_spec_replicas{job="kube-state-metrics", namespace=~"{{ $targetNamespace }}", deployment="{{ $appName }}"}
-!=
-kube_deployment_status_replicas_available{job="kube-state-metrics", namespace=~"{{ $targetNamespace }}", deployment="{{ $appName }}"}
-) and (
-changes(kube_deployment_status_replicas_updated{job="kube-state-metrics", namespace=~"{{ $targetNamespace }}", deployment="{{ $appName }}"}[5m])
-==
-0
-)
-)
-for: {{ .for }}
-labels:
-severity: {{ .severity }}
-{{- with $values.prometheusRules.additionalRuleLabels }}
-{{ toYaml . | nindent 8 }}
-{{- end }}
-{{- end }}
-
{{- if .Values.autoscaling.enabled }}
- name: {{ .Release.Namespace }}.{{ .Release.Name }}.{{ .Chart.Name }}.HorizontalPodAutoscalerRules
rules:
@@ -279,4 +250,3 @@ spec:
{{- end }}

{{- end }}
-
5 changes: 0 additions & 5 deletions charts/simple-app/values.yaml
@@ -786,11 +786,6 @@ prometheusRules:
severity: warning
for: 15m

-# -- Deployment has not matched the expected number of replicas
-DeploymentReplicasMismatch:
-severity: warning
-for: 15m
-
# -- HPA has not matched descired number of replicas
HpaReplicasMismatch:
severity: warning
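
For readers unfamiliar with the expression behind the removed ReplicasMismatch alerts, its condition can be modelled roughly as follows. This is a minimal Python sketch and not part of the commit: the function names `count_changes` and `replicas_mismatch` are hypothetical, and the sketch ignores label selectors, the `sum by(namespace, deployment)` wrapper, and the `for:` duration. It only illustrates that the alert required both a spec/available mismatch and no change in the updated-replica count over the sampled window.

```python
from typing import Sequence


def count_changes(samples: Sequence[float]) -> int:
    """Approximate PromQL changes(): how many times the value differs
    between consecutive samples in the window."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def replicas_mismatch(
    spec_replicas: int,
    available_replicas: int,
    updated_window: Sequence[float],
) -> bool:
    """Model of the removed alert condition (hypothetical helper).

    True when the desired and available replica counts differ AND the
    updated-replica count did not change across the window, which stands
    in for the [5m] range in the original expression.
    """
    mismatch = spec_replicas != available_replicas
    stalled = count_changes(updated_window) == 0
    return mismatch and stalled


if __name__ == "__main__":
    # Mismatch with no rollout progress in the window: condition holds.
    print(replicas_mismatch(3, 2, [3, 3, 3]))
    # Mismatch while the rollout is still progressing: condition does not hold.
    print(replicas_mismatch(3, 2, [1, 2, 3]))
```

In Prometheus itself, the condition would additionally have to hold continuously for the configured `for` period (15m by default in these charts) before the alert fired.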