Render: fail if no oauth image is found #1669

Conversation

jacobbaungard
Contributor

For some unknown reason, during upgrades from ACM 2.11, the template image is briefly used instead of the image from the OpenShift imagestream. The template image is not available in disconnected environments, so the pods are unable to come up.

This is especially a problem for alertmanager. Because it is a statefulset, once it gets into the bad state with the wrong image, it doesn't automatically recover on the next reconcile, and fixing it requires manual intervention.

With this PR, we instead make the reconcile fail if we do not find the oauth image. The reconcile is then retried later, when the imagestream can be found.

Signed-off-by: Jacob Baungard Hansen <[email protected]>
@jacobbaungard force-pushed the ACM-15525-fail-when-imagestream-not-found branch from 69fdfc9 to 6f52027 on November 14, 2024 11:59
Instead of passing through the concrete `ImageV1Client`, we use the
`ImageV1Interface` interface so that tests can pass in a faked image
client. This is needed because, as of the previous commit, reconciles
actually fail if we cannot get the image from the imagestream.

Signed-off-by: Jacob Baungard Hansen <[email protected]>
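
The pattern in this commit, depending on an interface so a fake can be injected in tests, can be sketched as follows. This is an illustration only: `imageGetter` is a hypothetical, trimmed-down stand-in and not the real `ImageV1Interface`; `fakeImageGetter`, `renderer`, and the namespace, name, and tag values are likewise assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// imageGetter is a hypothetical, trimmed-down interface standing in for
// the image client interface that the renderer depends on.
type imageGetter interface {
	GetImage(namespace, name, tag string) (string, error)
}

// fakeImageGetter is an in-memory implementation for tests.
type fakeImageGetter struct {
	images map[string]string // keyed by "namespace/name:tag"
}

// GetImage returns the stored image or an error if the key is unknown.
func (f fakeImageGetter) GetImage(namespace, name, tag string) (string, error) {
	key := fmt.Sprintf("%s/%s:%s", namespace, name, tag)
	image, ok := f.images[key]
	if !ok {
		return "", errors.New("image not found: " + key)
	}
	return image, nil
}

// renderer depends on the interface, not on a concrete client type,
// so either a real client or a fake can be passed in.
type renderer struct {
	images imageGetter
}

// oauthImage looks up the oauth proxy image; the arguments are placeholders.
func (r renderer) oauthImage() (string, error) {
	return r.images.GetImage("openshift", "oauth-proxy", "latest")
}

func main() {
	fake := fakeImageGetter{images: map[string]string{
		"openshift/oauth-proxy:latest": "registry.example.com/oauth-proxy:test",
	}}
	image, err := renderer{images: fake}.oauthImage()
	fmt.Println(image, err)

	// An empty fake reproduces the failure path without a cluster.
	_, err = renderer{images: fakeImageGetter{}}.oauthImage()
	fmt.Println("empty fake:", err)
}
```

With the fake in place, tests can exercise both the success path and the new failure path without access to a cluster.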

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 70%)

See analysis details on SonarQube Cloud

@jacobbaungard
Contributor Author

/test test-e2e

@jacobbaungard
Contributor Author

Note that this causes reconciles to fail (intentionally) if we cannot find the imagestream. Not sure if there is a more graceful method? During an upgrade, it will basically keep all images on the older versions until we can get the correct image. It seems we keep trying to reconcile until it works, so this probably works OK.

Error:

2024-11-15T16:06:02.194Z ERROR controller_multiclustermonitoring Failed to render multiClusterMonitoring templates {"Request.Namespace": "open-cluster-management", "Request.Name": "mch-updated-request", "error": "failed to get OAuth image for Grafana"}
github.com/stolostron/multicluster-observability-operator/operators/multiclusterobservability/controllers/multiclusterobservability.(*MultiClusterObservabilityReconciler).Reconcile
/workspace/operators/multiclusterobservability/controllers/multiclusterobservability/multiclusterobservability_controller.go:280
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:261
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:222
2024-11-15T16:06:02.194Z ERROR Reconciler error {"controller": "multiclusterobservability", "controllerGroup": "observability.open-cluster-management.io", "controllerKind": "MultiClusterObservability", "MultiClusterObservability": {"name":"mch-updated-request","namespace":"open-cluster-management"}, "namespace": "open-cluster-management", "name": "mch-updated-request", "reconcileID": "8fc754de-955e-4748-8faa-233d48b97297", "error": "failed to get OAuth image for Grafana"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:324
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:261
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:222

@jacobbaungard
Contributor Author

/test test-e2e

openshift-ci bot commented Nov 18, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jacobbaungard, saswatamcode

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [jacobbaungard,saswatamcode]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jacobbaungard
Contributor Author

/cherrypick release-2.13

@openshift-cherrypick-robot
Collaborator

@jacobbaungard: once the present PR merges, I will cherry-pick it on top of release-2.13 in a new PR and assign it to you.

In response to this:

/cherrypick release-2.13

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jacobbaungard
Contributor Author

/override "SonarCloud Code Analysis"

openshift-ci bot commented Nov 18, 2024

@jacobbaungard: Overrode contexts on behalf of jacobbaungard: SonarCloud Code Analysis

In response to this:

/override "SonarCloud Code Analysis"


@jacobbaungard
Contributor Author

/override "ci/prow/sonarcloud"

openshift-ci bot commented Nov 18, 2024

@jacobbaungard: Overrode contexts on behalf of jacobbaungard: ci/prow/sonarcloud

In response to this:

/override "ci/prow/sonarcloud"


@openshift-merge-bot openshift-merge-bot bot merged commit d969023 into stolostron:release-2.12 Nov 18, 2024
10 of 11 checks passed
@openshift-cherrypick-robot
Collaborator

@jacobbaungard: new pull request created: #1671

In response to this:

/cherrypick release-2.13


@jacobbaungard
Contributor Author

/cherrypick main

@openshift-cherrypick-robot
Collaborator

@jacobbaungard: new pull request created: #1672

In response to this:

/cherrypick main


3 participants