Upgrademon #656

Merged
merged 29 commits into from
Jul 10, 2024
29 commits
2416623
Auto-migrated all deployed Grafana dashboards (except istio,nginx)
gsmith-sas Apr 26, 2024
7bbf23c
Alerts - manual fixes
gsmith-sas Apr 26, 2024
4974459
Kubernetes Cluster - manual fixes (revert to gauges)
gsmith-sas Apr 26, 2024
20bb7b0
Kubernetes Deployments - manual fixes (revert to gauges)
gsmith-sas Apr 26, 2024
127243e
Perf Container Utilization - manual fixes (data tables in secondary c…
gsmith-sas Apr 26, 2024
6deead5
Perf Kubernetes Headroom - manual fixes (drop POD col in data table)
gsmith-sas Apr 26, 2024
d7b3295
NGINX - Automigrated NGINX dashboard (from Grafana 10.4.1)
gsmith-sas Apr 27, 2024
988e9fc
NGINX - manual fixes (fix tables)
gsmith-sas Apr 27, 2024
cf3431f
Merge branch 'grafdash11' into upgrademon
gsmith-sas Jun 18, 2024
fbe73c9
Change plugin version from '11.0.0-preview' to '11.0.0' for some dash…
gsmith-sas Jun 21, 2024
cda031c
Revised Grafana dashboard JSON defs
gsmith-sas Jun 21, 2024
2af8c37
Added migrated log-enabled SAS Viya-specific Grafana dashboards
gsmith-sas Jun 21, 2024
6fc071a
Added more migrated Grafana dashboards
gsmith-sas Jun 21, 2024
8bc5841
Migrate OpenSearch Grafana dashboard
gsmith-sas Jun 21, 2024
d36bbfc
Removed migration notations from Grfana dashboard names
gsmith-sas Jun 21, 2024
302aac1
Removed migration notations from Grfana dashboard names(2)
gsmith-sas Jun 21, 2024
c914e6e
Further tweaks to Grafana dashboards
gsmith-sas Jun 26, 2024
0f49d9f
mend
gsmith-sas Jun 26, 2024
de86a68
Update/migrate RabbitMQ-related Grafana dashboards
gsmith-sas Jun 27, 2024
759f6e4
Grafana dashboard migration: further tweaks
gsmith-sas Jun 27, 2024
2c212fc
CHANGELOG.md: updated with Grafana dashboard changes
gsmith-sas Jun 27, 2024
5cffe85
Potential Fixes for problematic 'inherited' Grafana dashboards
gsmith-sas Jun 28, 2024
8bf0af5
Upgrade monitoring components: initial set of target versions
gsmith-sas Jun 28, 2024
1ac041e
TEMP FIX: Replace upstream versions of 4 Grafana dashboards with our …
gsmith-sas Jul 2, 2024
73395ec
Upgrade monitoring components
gsmith-sas Jul 2, 2024
8339307
Set V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS to 'true'
gsmith-sas Jul 2, 2024
5c62baf
Change logic for setting flag for mixin dashboard fix
gsmith-sas Jul 2, 2024
d7fbd17
Corrected CHANGELOG.md
gsmith-sas Jul 2, 2024
2a08f89
Corrected Grafana dashboard UIDs to match original values
gsmith-sas Jul 2, 2024
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,26 @@
 # SAS Viya Monitoring for Kubernetes

+## Unreleased
+* **Metrics**
+  * [CHANGE] Grafana dashboards for RabbitMQ upgraded to newer versions
+  * [CHANGE] All Grafana dashboards (maintained as part of this project) migrated to Grafana 11
+  * [CHANGE] Some Grafana dashboards inherited from the Kube-Prometheus Stack Helm chart do not
+    work with Grafana 11.x due to Angular migration or other issues. As a **temporary** fix, we have
+    removed these dashboards and replaced them with our versions of them. **This fix will be removed when these issues have been resolved.**
+  * [UPGRADE] Kube-Prometheus Stack Helm chart has been upgraded from 56.6.2 to 61.1.1.
+  * [UPGRADE] Grafana Helm Chart (for OpenShift deployments) has been upgraded from 7.3.0 to 8.2.1.
+  * [UPGRADE] Prometheus Pushgateway Helm chart has been upgraded from 2.6.0 to 2.13.0.
+  * [UPGRADE] Alertmanager has been upgraded from 0.26.0 to 0.27.0.
+  * [UPGRADE] The config-reloader has been upgraded from 0.71.2 to 0.75.0.
+  * [UPGRADE] Grafana has been upgraded from 10.3.3 to 11.1.0.
+  * [UPGRADE] The k8s-sidecar has been upgraded from 1.25.4 to 1.26.1.
+  * [UPGRADE] Kube-State-Metrics has been upgraded from 2.10.1 to 2.12.0.
+  * [UPGRADE] Node-Exporter has been upgraded from 1.7.0 to 1.8.1.
+  * [UPGRADE] Prometheus has been upgraded from 2.49.1 to 2.53.0.
+  * [UPGRADE] Prometheus Operator has been upgraded from 0.71.2 to 0.75.0.
+  * [UPGRADE] Prometheus Pushgateway has been upgraded from 1.7.0 to 1.8.0.
+
+
 ## Version 1.2.26 (18JUN2024)
 * **Overall**
   * [CHANGE] Eliminated use of `--short` option (deprecated in Kubernetes 1.28) from `kubectl version` commands
26 changes: 13 additions & 13 deletions component_versions.env
@@ -36,33 +36,33 @@ OSD_FULL_IMAGE="docker.io/opensearchproject/opensearch-dashboards:2.12.0"
 #Grafana (when deployed on OpenShift)
 OPENSHIFT_GRAFANA_CHART_REPO=grafana
 OPENSHIFT_GRAFANA_CHART_NAME=grafana
-OPENSHIFT_GRAFANA_CHART_VERSION=7.3.0
+OPENSHIFT_GRAFANA_CHART_VERSION=8.2.1
 OPENSHIFT_OAUTHPROXY_FULL_IMAGE="registry.redhat.io/openshift4/ose-oauth-proxy:latest"

 #Grafana (everywhere)
-GRAFANA_FULL_IMAGE="docker.io/grafana/grafana:10.3.3"
-GRAFANA_SIDECAR_FULL_IMAGE="quay.io/kiwigrid/k8s-sidecar:1.25.4"
+GRAFANA_FULL_IMAGE="docker.io/grafana/grafana:11.1.0"
+GRAFANA_SIDECAR_FULL_IMAGE="quay.io/kiwigrid/k8s-sidecar:1.26.1"

 #Kube-Prometheus Stack
 KUBE_PROM_STACK_CHART_REPO=prometheus-community
 KUBE_PROM_STACK_CHART_NAME=kube-prometheus-stack
-KUBE_PROM_STACK_CHART_VERSION=56.6.2
-ALERTMANAGER_FULL_IMAGE="quay.io/prometheus/alertmanager:v0.26.0"
+KUBE_PROM_STACK_CHART_VERSION=61.1.1
+ALERTMANAGER_FULL_IMAGE="quay.io/prometheus/alertmanager:v0.27.0"
 ADMWEBHOOK_FULL_IMAGE="registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6"
-KSM_FULL_IMAGE="registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.10.1"
-NODEXPORT_FULL_IMAGE="quay.io/prometheus/node-exporter:v1.7.0"
-PROMETHEUS_FULL_IMAGE="quay.io/prometheus/prometheus:v2.49.1"
-PROMOP_FULL_IMAGE="quay.io/prometheus-operator/prometheus-operator:v0.71.2"
-CONFIGRELOAD_FULL_IMAGE="quay.io/prometheus-operator/prometheus-config-reloader:v0.71.2"
+KSM_FULL_IMAGE="registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.12.0"
+NODEXPORT_FULL_IMAGE="quay.io/prometheus/node-exporter:v1.8.1"
+PROMETHEUS_FULL_IMAGE="quay.io/prometheus/prometheus:v2.53.0"
+PROMOP_FULL_IMAGE="quay.io/prometheus-operator/prometheus-operator:v0.75.0"
+CONFIGRELOAD_FULL_IMAGE="quay.io/prometheus-operator/prometheus-config-reloader:v0.75.0"

 #Pushgateway
 PUSHGATEWAY_CHART_REPO=prometheus-community
 PUSHGATEWAY_CHART_NAME=prometheus-pushgateway
-PUSHGATEWAY_CHART_VERSION=2.6.0
-PUSHGATEWAY_FULL_IMAGE="quay.io/prometheus/pushgateway:v1.7.0"
+PUSHGATEWAY_CHART_VERSION=2.13.0
+PUSHGATEWAY_FULL_IMAGE="quay.io/prometheus/pushgateway:v1.8.0"

 #Prometheus Operator CRD
-PROM_OPERATOR_CRD_VERSION=v0.71.2
+PROM_OPERATOR_CRD_VERSION=v0.75.0

 #Tempo
 TEMPO_CHART_REPO=grafana
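
The `*_FULL_IMAGE` entries in this file appear to follow a `registry/repository:tag` layout. As a minimal sketch (the `image_tag` helper is hypothetical and not part of the project), the tag can be extracted from such a value with plain parameter expansion, which is one way to print the versions a given checkout would deploy. It assumes every reference ends in `:<tag>` and carries no digest.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository: print the tag portion
# of a full image reference such as "docker.io/grafana/grafana:11.1.0".
# Assumes the reference ends in ":<tag>" (tag always present, no digest).
image_tag() {
  # Strip everything up to and including the last colon.
  printf '%s\n' "${1##*:}"
}

# Values copied from this hunk of component_versions.env.
GRAFANA_FULL_IMAGE="docker.io/grafana/grafana:11.1.0"
PROMETHEUS_FULL_IMAGE="quay.io/prometheus/prometheus:v2.53.0"

echo "Grafana tag:    $(image_tag "$GRAFANA_FULL_IMAGE")"
echo "Prometheus tag: $(image_tag "$PROMETHEUS_FULL_IMAGE")"
```

A reference without a tag would not be handled correctly by this sketch, so it is only suitable for entries shaped like the ones in this file.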
21 changes: 21 additions & 0 deletions monitoring/bin/deploy_monitoring_cluster.sh
@@ -324,6 +324,27 @@ fi
 echo ""
 monitoring/bin/deploy_dashboards.sh

+# 01JUL24 Temporary Fix
+# Some Grafana dashboards inherited from the Kube-Prometheus Stack Helm
+# chart do not work with Grafana 11 due to Angular migration or other
+# issues. As a **temporary** fix, we will remove these dashboards and
+# replace them with our versions of them. This fix will be removed
+# when these issues have been resolved.
+V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS="${V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS:-true}"
+if [ "$V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS" == "true" ]; then
+  log_info "Replacing some Kube-Prometheus Stack-supplied Grafana dashboards with our own versions due to incompatibilities."
+
+  # remove ConfigMaps defining existing Grafana dashboards
+  kubectl -n "$MON_NS" delete configmap v4m-cluster-total --ignore-not-found
+  kubectl -n "$MON_NS" delete configmap v4m-namespace-by-pod --ignore-not-found
+  kubectl -n "$MON_NS" delete configmap v4m-namespace-by-workload --ignore-not-found
+  kubectl -n "$MON_NS" delete configmap v4m-prometheus --ignore-not-found
+
+  # deploy our versions of these dashboards
+  monitoring/bin/deploy_dashboards.sh monitoring/dashboards/mixinfixes
+
+fi
+
 set +e
 # call function to get HTTP/HTTPS ports from ingress controller
 get_ingress_ports
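
The `${VAR:-true}` expansion in this hunk means the dashboard replacement runs unless the variable is already set to some other value in the environment. A minimal sketch of that default-or-override behavior, using a hypothetical `resolve_flag` helper that is not part of the script:

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating the "${VAR:-default}" pattern:
# an empty or unset value falls back to "true"; any other value is kept.
resolve_flag() {
  printf '%s\n' "${1:-true}"
}

echo "unset or empty -> $(resolve_flag "")"
echo "explicit false -> $(resolve_flag "false")"
```

Read this way, exporting `V4M_TEMP_REPLACE_PROBLEMATIC_MIXIN_DASHBOARDS=false` before running the deployment script would skip the replacement, since the script only acts when the value equals `true`. How the project expects users to set the variable is not shown in this diff.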