Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to stop updating agent components temporarily #80

Open
JustinKuli opened this issue Mar 4, 2022 · 7 comments
Open

Add ability to stop updating agent components temporarily #80

JustinKuli opened this issue Mar 4, 2022 · 7 comments

Comments

@JustinKuli
Copy link
Member

Currently, any modifications to the ManifestWork will be reverted by the addon controller. So, the only modifications that can be made to addon agents on managed clusters are the ones specifically allowed by the addon.

In development or troubleshooting scenarios, it might be helpful to deploy a change to one addon on a cluster without updating the addon controller. For example, to add another argument to an addon agent container in order to enable a feature flag.

In stolostron, several controllers have "pause" annotations that can be placed on their resources, which prevent the controller from reconciling them. Something similar here could be useful.

@JustinKuli
Copy link
Member Author

JustinKuli commented Mar 8, 2022

We were able to implement this in our addon controller by overriding the Manifests function from the HelmAgentAddon, and returning an empty slice. See this commit for the example.

But now I'm worried that the empty slice is relying on the current implementation... specifically: https://github.com/open-cluster-management-io/addon-framework/blob/main/pkg/addonmanager/controllers/agentdeploy/controller.go. So right now, an empty slice results in the ManfestWork getting ignored, but it would also make sense if an empty slice resulted in deleting the existing ManifestWork.

@mprahl
Copy link
Member

mprahl commented Mar 8, 2022

Having a consistent way of doing this for all addons is appealing so we can define a consistent interface before 2.5 is released and support gets trained to use different pause annotations per addon.

@qiujian16
Copy link
Member

hrm, that is interesting. agree we should have a consistent way.

@qiujian16
Copy link
Member

/kind feature

@mprahl
Copy link
Member

mprahl commented Jul 26, 2022

It seems that this workaround no longer works due to this change:
0020a78#diff-b47b9c6da1d283ccb32074dd2dc9571541f1cc4cbcbbd7d6679f7f0acba17217R161

@JustinKuli
Copy link
Member Author

I'm thinking about a change to https://github.com/open-cluster-management-io/addon-framework/blob/main/pkg/addonmanager/controllers/agentdeploy/controller.go#L147-L167

So that a nil Manifests could be handled differently to an initialized list with length 0. Then nil could function as the pause, like it previously did.

Opinions?

@qiujian16
Copy link
Member

FYI @zhiweiyin318

openshift-merge-robot pushed a commit to open-cluster-management-io/governance-policy-addon-controller that referenced this issue Jul 27, 2022
The "pause" hack no longer works with the latest addon-framework
library. The alternative, which is in this commit, is to return an error
in the `Manifests` method when the pause annotation is set. It does spam
the logs but since this isn't a setting that should be used in
production, this seems okay until the following is addressed:
open-cluster-management-io/addon-framework#80

Relates:
stolostron/backlog#24362

Signed-off-by: mprahl <[email protected]>
openshift-merge-robot pushed a commit to stolostron/governance-policy-addon-controller that referenced this issue Jul 27, 2022
The "pause" hack no longer works with the latest addon-framework
library. The alternative, which is in this commit, is to return an error
in the `Manifests` method when the pause annotation is set. It does spam
the logs but since this isn't a setting that should be used in
production, this seems okay until the following is addressed:
open-cluster-management-io/addon-framework#80

Relates:
stolostron/backlog#24362

Signed-off-by: mprahl <[email protected]>
(cherry picked from commit 12cfe6e3f51534f0d10edab037bbaf49d7f298e2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants