[UPGRADE] Metric monitoring components upgraded; dashboards migrated to Grafana 11+ (#656)
gsmith-sas authored Jul 10, 2024
1 parent fb1c059 commit 7666ebe
Showing 34 changed files with 47,246 additions and 32,652 deletions.
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,26 @@
# SAS Viya Monitoring for Kubernetes

## Unreleased
* **Metrics**
  * [CHANGE] Grafana dashboards for RabbitMQ upgraded to newer versions
  * [CHANGE] All Grafana dashboards (maintained as part of this project) migrated to Grafana 11
  * [CHANGE] Some Grafana dashboards inherited from the Kube-Prometheus Stack Helm chart do not
    work with Grafana 11.x due to Angular migration or other issues. As a **temporary** fix, we have
    removed these dashboards and replaced them with our versions of them. **This fix will be removed when these issues have been resolved.**
  * [UPGRADE] Kube-Prometheus Stack Helm chart has been upgraded from 56.6.2 to 61.1.1.
  * [UPGRADE] Grafana Helm Chart (for OpenShift deployments) has been upgraded from 7.3.0 to 8.2.1.
  * [UPGRADE] Prometheus Pushgateway Helm chart has been upgraded from 2.6.0 to 2.13.0.
  * [UPGRADE] Alertmanager has been upgraded from 0.26.0 to 0.27.0.
  * [UPGRADE] The config-reloader has been upgraded from 0.71.2 to 0.75.0.
  * [UPGRADE] Grafana has been upgraded from 10.3.3 to 11.1.0.
  * [UPGRADE] The k8s-sidecar has been upgraded from 1.25.4 to 1.26.1.
  * [UPGRADE] Kube-State-Metrics has been upgraded from 2.10.1 to 2.12.0.
  * [UPGRADE] Node-Exporter has been upgraded from 1.7.0 to 1.8.1.
  * [UPGRADE] Prometheus has been upgraded from 2.49.1 to 2.53.0.
  * [UPGRADE] Prometheus Operator has been upgraded from 0.71.2 to 0.75.0.
  * [UPGRADE] Prometheus Pushgateway has been upgraded from 1.7.0 to 1.8.0.


## Version 1.2.26 (18JUN2024)
* **Overall**
  * [CHANGE] Eliminated use of `--short` option (deprecated in Kubernetes 1.28) from `kubectl version` commands
26 changes: 13 additions & 13 deletions component_versions.env
@@ -36,33 +36,33 @@ OSD_FULL_IMAGE="docker.io/opensearchproject/opensearch-dashboards:2.12.0"
 #Grafana (when deployed on OpenShift)
 OPENSHIFT_GRAFANA_CHART_REPO=grafana
 OPENSHIFT_GRAFANA_CHART_NAME=grafana
-OPENSHIFT_GRAFANA_CHART_VERSION=7.3.0
+OPENSHIFT_GRAFANA_CHART_VERSION=8.2.1
 OPENSHIFT_OAUTHPROXY_FULL_IMAGE="registry.redhat.io/openshift4/ose-oauth-proxy:latest"

 #Grafana (everywhere)
-GRAFANA_FULL_IMAGE="docker.io/grafana/grafana:10.3.3"
-GRAFANA_SIDECAR_FULL_IMAGE="quay.io/kiwigrid/k8s-sidecar:1.25.4"
+GRAFANA_FULL_IMAGE="docker.io/grafana/grafana:11.1.0"
+GRAFANA_SIDECAR_FULL_IMAGE="quay.io/kiwigrid/k8s-sidecar:1.26.1"

 #Kube-Prometheus Stack
 KUBE_PROM_STACK_CHART_REPO=prometheus-community
 KUBE_PROM_STACK_CHART_NAME=kube-prometheus-stack
-KUBE_PROM_STACK_CHART_VERSION=56.6.2
-ALERTMANAGER_FULL_IMAGE="quay.io/prometheus/alertmanager:v0.26.0"
+KUBE_PROM_STACK_CHART_VERSION=61.1.1
+ALERTMANAGER_FULL_IMAGE="quay.io/prometheus/alertmanager:v0.27.0"
 ADMWEBHOOK_FULL_IMAGE="registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6"
-KSM_FULL_IMAGE="registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.10.1"
-NODEXPORT_FULL_IMAGE="quay.io/prometheus/node-exporter:v1.7.0"
-PROMETHEUS_FULL_IMAGE="quay.io/prometheus/prometheus:v2.49.1"
-PROMOP_FULL_IMAGE="quay.io/prometheus-operator/prometheus-operator:v0.71.2"
-CONFIGRELOAD_FULL_IMAGE="quay.io/prometheus-operator/prometheus-config-reloader:v0.71.2"
+KSM_FULL_IMAGE="registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.12.0"
+NODEXPORT_FULL_IMAGE="quay.io/prometheus/node-exporter:v1.8.1"
+PROMETHEUS_FULL_IMAGE="quay.io/prometheus/prometheus:v2.53.0"
+PROMOP_FULL_IMAGE="quay.io/prometheus-operator/prometheus-operator:v0.75.0"
+CONFIGRELOAD_FULL_IMAGE="quay.io/prometheus-operator/prometheus-config-reloader:v0.75.0"

 #Pushgateway
 PUSHGATEWAY_CHART_REPO=prometheus-community
 PUSHGATEWAY_CHART_NAME=prometheus-pushgateway
-PUSHGATEWAY_CHART_VERSION=2.6.0
-PUSHGATEWAY_FULL_IMAGE="quay.io/prometheus/pushgateway:v1.7.0"
+PUSHGATEWAY_CHART_VERSION=2.13.0
+PUSHGATEWAY_FULL_IMAGE="quay.io/prometheus/pushgateway:v1.8.0"

 #Prometheus Operator CRD
-PROM_OPERATOR_CRD_VERSION=v0.71.2
+PROM_OPERATOR_CRD_VERSION=v0.75.0

 #Tempo
 TEMPO_CHART_REPO=grafana
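
Review note on component_versions.env: in this hunk the Prometheus Operator image, the config-reloader image, and `PROM_OPERATOR_CRD_VERSION` all move together (0.71.2 before the change, 0.75.0 after). Assuming they are meant to stay in lockstep, which the diff suggests but does not state, a small check along these lines could catch a partial bump. The three values are copied from the updated file; `image_tag` is a hypothetical helper and not part of the project.

```shell
#!/bin/sh
# Values copied from the updated component_versions.env in this commit.
PROMOP_FULL_IMAGE="quay.io/prometheus-operator/prometheus-operator:v0.75.0"
CONFIGRELOAD_FULL_IMAGE="quay.io/prometheus-operator/prometheus-config-reloader:v0.75.0"
PROM_OPERATOR_CRD_VERSION=v0.75.0

# Hypothetical helper: return the tag, i.e. everything after the last ':'.
image_tag() {
  echo "${1##*:}"
}

promop_tag="$(image_tag "$PROMOP_FULL_IMAGE")"
reload_tag="$(image_tag "$CONFIGRELOAD_FULL_IMAGE")"

# Assumption: all three versions are expected to be identical.
if [ "$promop_tag" = "$PROM_OPERATOR_CRD_VERSION" ] &&
   [ "$reload_tag" = "$PROM_OPERATOR_CRD_VERSION" ]; then
  status="in sync"
else
  status="mismatch"
fi

echo "Prometheus Operator versions: $status ($promop_tag)"
```

In a real check the three assignments would presumably be replaced by sourcing component_versions.env rather than hard-coding the values.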
21 changes: 21 additions & 0 deletions monitoring/bin/deploy_monitoring_cluster.sh
@@ -324,6 +324,27 @@ fi
echo ""
monitoring/bin/deploy_dashboards.sh

# 01JUL24 Temporary Fix
# Some Grafana dashboards inherited from the Kube-Prometheus Stack Helm
# chart do not work with Grafana 11 due to Angular migration or other
# issues. As a **temporary** fix, we will remove these dashboards and
# replace them with our versions of them. This fix will be removed
# when these issues have been resolved.
V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS="${V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS:-true}"
if [ "$V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS" == "true" ]; then
  log_info "Replacing some Kube-Prometheus Stack-supplied Grafana dashboards with our own versions due to incompatibilities."

  # remove configMaps defining existing Grafana dashboards
  kubectl -n $MON_NS delete configmap v4m-cluster-total --ignore-not-found
  kubectl -n $MON_NS delete configmap v4m-namespace-by-pod --ignore-not-found
  kubectl -n $MON_NS delete configmap v4m-namespace-by-workload --ignore-not-found
  kubectl -n $MON_NS delete configmap v4m-prometheus --ignore-not-found

  # deploy our versions of these dashboards
  monitoring/bin/deploy_dashboards.sh monitoring/dashboards/mixinfixes

fi

set +e
# call function to get HTTP/HTTPS ports from ingress controller
get_ingress_ports
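
Review note on the temporary fix in deploy_monitoring_cluster.sh: the new block is gated by `V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS`, which falls back to `true` through `${VAR:-true}` and is then compared against the exact string `true`. The sketch below isolates that toggle so its behavior is easy to see. `resolve_flag` is a hypothetical helper written for illustration only; it runs no `kubectl` commands and is not part of the project.

```shell
#!/bin/sh
# Hypothetical helper mirroring the toggle logic from the hunk above.
resolve_flag() {
  # ${VAR:-true} yields "true" when the variable is unset or empty.
  flag_value="${V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS:-true}"
  if [ "$flag_value" = "true" ]; then
    echo "replace"
  else
    echo "keep"
  fi
}

# Case 1: variable not set, so the default applies.
unset V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS
default_result="$(resolve_flag)"

# Case 2: explicit opt-out.
V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS=false
optout_result="$(resolve_flag)"

# Case 3: the comparison is case-sensitive, so "TRUE" does not enable the fix.
V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS=TRUE
upper_result="$(resolve_flag)"

echo "unset -> $default_result; false -> $optout_result; TRUE -> $upper_result"
```

Because only the literal value `true` enables the replacement, any other value (including `TRUE` or `yes`) leaves the Kube-Prometheus Stack dashboards in place.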