[tempo-distributed] fix: add autoscaling for tempo-distributed metrics-generator #3430

Status: Open · wants to merge 2 commits into `main`
2 changes: 1 addition & 1 deletion charts/tempo-distributed/Chart.yaml
@@ -2,7 +2,7 @@ apiVersion: v2
name: tempo-distributed
description: Grafana Tempo in MicroService mode
type: application
version: 1.22.1
version: 1.23.0
appVersion: 2.6.0
engine: gotpl
home: https://grafana.com/docs/tempo/latest/
12 changes: 11 additions & 1 deletion charts/tempo-distributed/README.md
@@ -1,6 +1,6 @@
# tempo-distributed

![Version: 1.22.1](https://img.shields.io/badge/Version-1.22.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 2.6.0](https://img.shields.io/badge/AppVersion-2.6.0-informational?style=flat-square)
![Version: 1.23.0](https://img.shields.io/badge/Version-1.23.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 2.6.0](https://img.shields.io/badge/AppVersion-2.6.0-informational?style=flat-square)

Grafana Tempo in MicroService mode

@@ -641,6 +641,16 @@ The memcached default args are removed and should be provided manually. The sett
| metricsGenerator.annotations | object | `{}` | Annotations for the metrics-generator StatefulSet |
| metricsGenerator.appProtocol | object | `{"grpc":null}` | Adds the appProtocol field to the metricsGenerator service. This allows metricsGenerator to work with istio protocol selection. |
| metricsGenerator.appProtocol.grpc | string | `nil` | Set the optional grpc service protocol. Ex: "grpc", "http2" or "https" |
| metricsGenerator.autoscaling | object | `{"enabled":false,"hpa":{"behavior":{},"enabled":false,"targetCPUUtilizationPercentage":100,"targetMemoryUtilizationPercentage":null},"keda":{"enabled":false,"triggers":[]},"maxReplicas":3,"minReplicas":1}` | Autoscaling configurations |
| metricsGenerator.autoscaling.enabled | bool | `false` | Enable autoscaling for the metrics-generator |
| metricsGenerator.autoscaling.hpa | object | `{"behavior":{},"enabled":false,"targetCPUUtilizationPercentage":100,"targetMemoryUtilizationPercentage":null}` | Autoscaling via HPA object |
| metricsGenerator.autoscaling.hpa.behavior | object | `{}` | Autoscaling behavior configuration for the metrics-generator |
| metricsGenerator.autoscaling.hpa.targetCPUUtilizationPercentage | int | `100` | Target CPU utilisation percentage for the metrics-generator |
| metricsGenerator.autoscaling.hpa.targetMemoryUtilizationPercentage | string | `nil` | Target memory utilisation percentage for the metrics-generator |
| metricsGenerator.autoscaling.keda | object | `{"enabled":false,"triggers":[]}` | Autoscaling via keda/ScaledObject |
| metricsGenerator.autoscaling.keda.triggers | list | `[]` | List of autoscaling triggers for the metrics-generator |
| metricsGenerator.autoscaling.maxReplicas | int | `3` | Maximum autoscaling replicas for the metrics-generator |
| metricsGenerator.autoscaling.minReplicas | int | `1` | Minimum autoscaling replicas for the metrics-generator |
| metricsGenerator.config | object | `{"metrics_ingestion_time_range_slack":"30s","processor":{"service_graphs":{"dimensions":[],"histogram_buckets":[0.1,0.2,0.4,0.8,1.6,3.2,6.4,12.8],"max_items":10000,"wait":"10s","workers":10},"span_metrics":{"dimensions":[],"histogram_buckets":[0.002,0.004,0.008,0.016,0.032,0.064,0.128,0.256,0.512,1.02,2.05,4.1]}},"registry":{"collection_interval":"15s","external_labels":{},"stale_duration":"15m"},"storage":{"path":"/var/tempo/wal","remote_write":[],"remote_write_add_org_id_header":true,"remote_write_flush_deadline":"1m","wal":null},"traces_storage":{"path":"/var/tempo/traces"}}` | More information on configuration: https://grafana.com/docs/tempo/latest/configuration/#metrics-generator |
| metricsGenerator.config.processor.service_graphs | object | `{"dimensions":[],"histogram_buckets":[0.1,0.2,0.4,0.8,1.6,3.2,6.4,12.8],"max_items":10000,"wait":"10s","workers":10}` | For processors to be enabled and generate metrics, pass the names of the processors to overrides.metrics_generator_processors value like [service-graphs, span-metrics] |
| metricsGenerator.config.processor.service_graphs.dimensions | list | `[]` | resource and span attributes and are added to the metrics if present. |
47 changes: 47 additions & 0 deletions charts/tempo-distributed/templates/metrics-generator/hpa.yaml
@@ -0,0 +1,47 @@
{{- if and .Values.metricsGenerator.autoscaling.enabled .Values.metricsGenerator.autoscaling.hpa.enabled }}
{{- $apiVersion := include "tempo.hpa.apiVersion" . -}}
{{ $dict := dict "ctx" . "component" "metrics-generator" }}
apiVersion: {{ $apiVersion }}
kind: HorizontalPodAutoscaler
metadata:
  name: {{ include "tempo.resourceName" $dict }}
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "tempo.labels" $dict | nindent 4 }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: {{ .Values.metricsGenerator.kind }}
    name: {{ include "tempo.resourceName" $dict }}
  minReplicas: {{ .Values.metricsGenerator.autoscaling.minReplicas }}
  maxReplicas: {{ .Values.metricsGenerator.autoscaling.maxReplicas }}
  {{- with .Values.metricsGenerator.autoscaling.hpa.behavior }}
  behavior:
    {{- toYaml . | nindent 4 }}
  {{- end }}
  metrics:
  {{- with .Values.metricsGenerator.autoscaling.hpa.targetMemoryUtilizationPercentage }}
  - type: Resource
    resource:
      name: memory
      {{- if (eq $apiVersion "autoscaling/v2") }}
      target:
        type: Utilization
        averageUtilization: {{ . }}
      {{- else }}
      targetAverageUtilization: {{ . }}
      {{- end }}
  {{- end }}
  {{- with .Values.metricsGenerator.autoscaling.hpa.targetCPUUtilizationPercentage }}
  - type: Resource
    resource:
      name: cpu
      {{- if (eq $apiVersion "autoscaling/v2") }}
      target:
        type: Utilization
        averageUtilization: {{ . }}
      {{- else }}
      targetAverageUtilization: {{ . }}
      {{- end }}
  {{- end }}
{{- end }}
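For reference, a minimal values override exercising the HPA path of this template could look like the sketch below. All keys come from the new `metricsGenerator.autoscaling` block in `values.yaml`; the replica counts and utilization targets are illustrative, not chart defaults.

```yaml
metricsGenerator:
  autoscaling:
    enabled: true        # master switch for metrics-generator autoscaling
    minReplicas: 2       # illustrative; chart default is 1
    maxReplicas: 6       # illustrative; chart default is 3
    hpa:
      enabled: true      # render the HorizontalPodAutoscaler above
      targetCPUUtilizationPercentage: 75
      targetMemoryUtilizationPercentage: 80
```

Both `autoscaling.enabled` and `hpa.enabled` must be true for the HPA to render, mirroring the `and` condition at the top of the template.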
26 additions & 0 deletions (new file: metrics-generator KEDA ScaledObject template)
@@ -0,0 +1,26 @@
{{- if and .Values.metricsGenerator.autoscaling.enabled .Values.metricsGenerator.autoscaling.keda.enabled }}
{{ $dict := dict "ctx" . "component" "metrics-generator" }}
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ include "tempo.resourceName" $dict }}
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "tempo.labels" $dict | nindent 4 }}
spec:
  minReplicaCount: {{ .Values.metricsGenerator.autoscaling.minReplicas }}
  maxReplicaCount: {{ .Values.metricsGenerator.autoscaling.maxReplicas }}
  scaleTargetRef:
    apiVersion: apps/v1
    kind: {{ .Values.metricsGenerator.kind }}
    name: {{ include "tempo.resourceName" $dict }}
  triggers:
  {{- range .Values.metricsGenerator.autoscaling.keda.triggers }}
  - type: {{ .type | quote }}
    metadata:
      serverAddress: {{ .metadata.serverAddress }}
      threshold: {{ .metadata.threshold | quote }}
      query: |
        {{- .metadata.query | nindent 8 }}
  {{- end }}
{{- end }}
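Similarly, a sketch of values that enable the KEDA path, reusing the Prometheus trigger that `values.yaml` ships as a commented example. The `serverAddress` is a placeholder, and a KEDA installation (https://keda.sh/) is assumed to be present in the cluster.

```yaml
metricsGenerator:
  autoscaling:
    enabled: true
    minReplicas: 1
    maxReplicas: 3
    keda:
      enabled: true
      triggers:
        - type: prometheus
          metadata:
            serverAddress: "http://<prometheus-host>:9090"   # placeholder address
            threshold: "250"
            query: |-
              sum(prometheus_remote_storage_shards_desired{job="default/metrics-generator"} /
              prometheus_remote_storage_shards_max{job="default/metrics-generator"})by(job)
```

Note that the template reads `.metadata.serverAddress`, `.metadata.threshold`, and `.metadata.query` from each trigger, so triggers of other types would need those fields populated as well.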
30 changes: 30 additions & 0 deletions charts/tempo-distributed/values.yaml
@@ -313,6 +313,36 @@ metricsGenerator:
    repository: null
    # -- Docker image tag for the metrics-generator image. Overrides `tempo.image.tag`
    tag: null
  # -- Autoscaling configurations
  autoscaling:
    # -- Enable autoscaling for the metrics-generator
    enabled: false
    # -- Minimum autoscaling replicas for the metrics-generator
    minReplicas: 1
    # -- Maximum autoscaling replicas for the metrics-generator
    maxReplicas: 3
    # -- Autoscaling via HPA object
    hpa:
      enabled: false
      # -- Autoscaling behavior configuration for the metrics-generator
      behavior: {}
      # -- Target CPU utilisation percentage for the metrics-generator
      targetCPUUtilizationPercentage: 100
      # -- Target memory utilisation percentage for the metrics-generator
      targetMemoryUtilizationPercentage:
    # -- Autoscaling via keda/ScaledObject
    keda:
      # requires https://keda.sh/
      enabled: false
      # -- List of autoscaling triggers for the metrics-generator
      triggers: []
      # - type: prometheus
      #   metadata:
      #     serverAddress: "http://<prometheus-host>:9090"
      #     threshold: "250"
      #     query: |-
      #       sum(prometheus_remote_storage_shards_desired{job="default/metrics-generator"} /
      #       prometheus_remote_storage_shards_max{job="default/metrics-generator"})by(job)
  # -- The name of the PriorityClass for metrics-generator pods
  priorityClassName: null
  # -- Labels for metrics-generator pods
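The `behavior: {}` default above leaves HPA scaling behavior unset. Operators who want a slower scale-down for the metrics-generator could pass a standard `autoscaling/v2` behavior block through this value; the following is a sketch with illustrative numbers, not a recommendation:

```yaml
metricsGenerator:
  autoscaling:
    hpa:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 600   # wait 10 minutes before scaling down
          policies:
            - type: Pods
              value: 1                      # remove at most one pod...
              periodSeconds: 120            # ...every two minutes
        scaleUp:
          policies:
            - type: Percent
              value: 100                    # allow doubling...
              periodSeconds: 60             # ...per minute
```

Whatever is supplied here is passed through verbatim via `toYaml` in the HPA template, so it must match the `HorizontalPodAutoscaler` behavior schema supported by the cluster's autoscaling API version.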