1.8 upgrades: Juju Error on refresh of Podspec charm with deployment in metadata to Sidecar #732

Closed
NohaIhab opened this issue Oct 24, 2023 · 5 comments
Labels
bug Something isn't working Kubeflow 1.8 This issue affects the Charmed Kubeflow 1.8 release

Comments

@NohaIhab
Contributor

NohaIhab commented Oct 24, 2023

Bug Description

When doing an upgrade of Charmed Kubeflow 1.7 to 1.8, some charms cannot be refreshed with Juju due to the error:

ERROR Juju on containers does not support updating deployment info for services.
The new charm's metadata contains updated deployment info.
You'll need to deploy a new charm rather than upgrading if you need this change.
 not supported (not supported)

This applies to charms that were Podspec in 1.7 and had a deployment field specified in the metadata.yaml, then were rewritten to Sidecar in 1.8.

Currently, the only way to upgrade those charms is to remove and redeploy them. This is problematic because it causes loss of user data: the CRDs are removed, and with them the CRs created by users.

Affected charms

  • jupyter-controller
  • argo-controller
  • kfp-persistence
  • kfp-schedwf
  • kfp-viewer

Proposed solution:

Modify the affected charms in their corresponding 1.7 branches so that they never delete CRDs when the charm is removed, and in the upgrade guide do:

  1. refresh the charm to the new 1.7 revision by simply doing juju refresh
  2. refresh to 1.8 channel (currently we will test with latest/edge since we still haven't released to 1.8 channels)

To Reproduce

juju deploy jupyter-controller --channel=1.7/stable
juju refresh jupyter-controller --channel=latest/edge --trust

Environment

juju 3.1/stable
microk8s 1.25-strict/stable

@misohu
Member

misohu commented Oct 24, 2023

I did some digging into the technical solution, and there is a problem. These charms are implemented in PodSpec and none of them implements the on-remove hook. The CRDs are removed automatically, even though we never implemented that. Steps to reproduce:

juju deploy kfp-schedwf --channel=2.0/stable
# check that the CRD scheduledworkflows.kubeflow.org is present
juju remove-application kfp-schedwf
# check that the CRD scheduledworkflows.kubeflow.org is gone
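
The two "check" steps above can be scripted. This is a minimal sketch, assuming `kubectl` is configured for the cluster that hosts the model; `crd_exists` and `report_crd` are hypothetical helper names used only for illustration.

```shell
# Hypothetical helpers for this sketch; they are not part of Juju or kubectl.

# Succeeds if the named CRD exists in the cluster.
crd_exists() {
  kubectl get crd "$1" >/dev/null 2>&1
}

# Prints whether the named CRD is present or absent.
report_crd() {
  if crd_exists "$1"; then
    echo "$1: present"
  else
    echo "$1: absent"
  fi
}

# Usage (requires cluster access), run once before and once after
# `juju remove-application kfp-schedwf`:
#   report_crd scheduledworkflows.kubeflow.org
```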

I am not sure if there is a way for us to override this default PodSpec behavior.

I also thought about using lightkube to recreate the CRD in the remove hook of these charms, but that is not possible because lightkube needs --trust, which cannot be used with PodSpec charms.

@kimwnasptd
Contributor

@misohu the goal is to not let the CRD get a DELETE request at all. Even if we recreate the CRD, all user CRs will be lost since the CRD was deleted initially.

Is there no way we can control which resources get deleted when the PodSpec charm gets deleted?

@NohaIhab
Contributor Author

Final Workaround

Since Juju uses annotations and labels to decide what to clean up when a charm is removed, we can remove these annotations and labels from the CRDs to prevent the deletion of user resources.
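
To see which labels and annotations a CRD currently carries before touching anything, something like the following may help. It is a minimal sketch that assumes `kubectl` access to the cluster; `show_crd_metadata` is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper for this sketch: prints the labels and annotations
# currently set on a CRD, using kubectl's jsonpath output.
show_crd_metadata() {
  echo "labels:"
  kubectl get crd "$1" -o jsonpath='{.metadata.labels}'
  echo
  echo "annotations:"
  kubectl get crd "$1" -o jsonpath='{.metadata.annotations}'
  echo
}

# Usage (requires cluster access):
#   show_crd_metadata notebooks.kubeflow.org
```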

The upgrade would go as follows:

  1. make sure all your workflows and notebooks are completed before running the upgrade
  2. remove labels and annotations of CRDs, for example for notebooks:
kubectl annotate crd notebooks.kubeflow.org controller.juju.is/id-
kubectl annotate crd notebooks.kubeflow.org model.juju.is/id-
kubectl label crd notebooks.kubeflow.org app.juju.is/created-by-
kubectl label crd notebooks.kubeflow.org app.kubernetes.io/managed-by-
kubectl label crd notebooks.kubeflow.org app.kubernetes.io/name-
kubectl label crd notebooks.kubeflow.org model.juju.is/name-

Repeat this step for each of the following CRDs:

  • notebooks.kubeflow.org
  • workflows.argoproj.io
  • scheduledworkflows.kubeflow.org
  3. remove the old charm
  4. deploy the new charm

This way the CRDs are not deleted when the charm is removed, and therefore neither are the CRs.
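
The label and annotation removal for all three CRDs can be sketched as a loop over the same `kubectl` commands shown for notebooks. This is only an illustration of the workaround, not a tested upgrade script. `DRY_RUN` is a hypothetical switch introduced for this example: it defaults to `1`, so the commands are printed rather than executed until you set `DRY_RUN=0`.

```shell
#!/bin/sh
# Sketch: strip the Juju ownership labels and annotations from the CRDs
# listed above, so that removing the PodSpec charm does not delete them.
set -eu

# Hypothetical switch for this example: 1 = only print the commands.
DRY_RUN="${DRY_RUN:-1}"

CRDS="notebooks.kubeflow.org workflows.argoproj.io scheduledworkflows.kubeflow.org"
ANNOTATIONS="controller.juju.is/id model.juju.is/id"
LABELS="app.juju.is/created-by app.kubernetes.io/managed-by app.kubernetes.io/name model.juju.is/name"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# A trailing "-" after the key tells kubectl to remove that key.
strip_juju_metadata() {
  target="$1"
  for key in $ANNOTATIONS; do
    run kubectl annotate crd "$target" "${key}-"
  done
  for key in $LABELS; do
    run kubectl label crd "$target" "${key}-"
  done
}

for crd in $CRDS; do
  strip_juju_metadata "$crd"
done
```

Review the printed commands first, then rerun with `DRY_RUN=0` before removing the old charm.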

@kimwnasptd
Contributor

Closing this as it's a Juju bug and we have a workaround.


Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5152.

This message was autogenerated
