kube down should be able to remove "external" storage containers #20025

Closed

vrothberg opened this issue Sep 19, 2023 · Discussed in #20021 · 8 comments · Fixed by #20457

Labels: kube, locked - please file new issue/PR

Comments

@vrothberg (Member)

Discussed in #20021

Originally posted by poperigby September 18, 2023
I have a collection of containers that are orchestrated with `.kube` files. Seemingly at random, a service will fail to start on reboot. When I go to start it manually with `podman kube play service.yaml`, it will say something like this:

```
Error: creating container storage: the container name "authelia-pod-authelia" is already in use by b8bf71cb1c5029f2dfee4f48e2be50c4bdeff7af9c99d5cd09d48b25eb8cbe7d. You have to remove that container to be able to reuse that name: that name is already in use
```

I then have to run something like this to fix it:

```
podman rm --storage b8bf71cb1c5029f2dfee4f48e2be50c4bdeff7af9c99d5cd09d48b25eb8cbe7d
```

It just recently (about a month ago) started doing this, so I'm not sure what happened.

Here's one of my `.kube` files:

```
[Install]
WantedBy=default.target

[Unit]
Requires=caddy.service
After=caddy.service

[Service]
TimeoutSec=900

[Kube]
Yaml=services/authelia/service.yaml
Network=reverse-proxy.network
```

Here's one of my Kubernetes-style YAML files:

```
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: authelia-pod
  name: authelia-pod
spec:
  hostname: authelia-pod
  securityContext:
    seLinuxOptions:
      type: spc_t
  containers:
  - name: authelia
    image: ghcr.io/authelia/authelia:latest
    imagePullPolicy: Never
    #args:
    #- --config
    #- /config/configuration.yml
    env:
    - name: AUTHELIA_JWT_SECRET
      value: REDACTED
    - name: AUTHELIA_STORAGE_ENCRYPTION_KEY
      value: REDACTED
    - name: AUTHELIA_NOTIFIER_SMTP_PASSWORD
      value: REDACTED
    - name: AUTHELIA_IDENTITY_PROVIDERS_OIDC_HMAC_SECRET
      value: REDACTED
    - name: AUTHELIA_IDENTITY_PROVIDERS_OIDC_ISSUER_PRIVATE_KEY
      value: REDACTED
    - name: AUTHELIA_SESSION_SECRET
      value: REDACTED
    volumeMounts:
    - mountPath: /config
      name: authelia-config
  - name: redis
    image: docker.io/library/redis:latest
    #args:
    #- redis-server
    volumeMounts:
    - mountPath: /data
      name: redis-data
  volumes:
  - name: redis-data
    persistentVolumeClaim:
      claimName: redis
  - hostPath:
      path: /home/cassidy/.config/containers/systemd/services/authelia/authelia
      type: Directory
    name: authelia-config
```
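The name in the error message, authelia-pod-authelia, follows kube play's `<pod name>-<container name>` naming. A minimal sketch of the cleanup, assuming a reasonably recent Podman (flag spellings vary across versions; the ID is the one from the report above):

```
# List every container Podman knows about, including "external"
# entries that exist only in containers/storage with no matching
# Podman container; these show up with a "Storage" status.
podman ps --all --external

# Remove the leftover storage-only container by ID (the reporter's
# workaround); depending on the Podman version, a plain
# `podman rm --force <id>` may work as well.
podman rm --storage b8bf71cb1c5029f2dfee4f48e2be50c4bdeff7af9c99d5cd09d48b25eb8cbe7d
```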
@vrothberg (Member, Author)

Copying my other comment from the discussion:

The generated systemd units have an `ExecStart=/usr/bin/podman kube play --replace` (note the `--replace`). So to me it looks like the unit got shot down on stop (i.e., during `ExecStopPost=/usr/bin/podman kube down`).

Without a reproducer we cannot be sure, but I think it makes sense to have `kube down` and `--replace` remove "external" storage containers to account for such cases.
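For reference, the commands such a generated unit runs look roughly like this. This is a sketch, not verbatim Quadlet output; Quadlet adds more flags in practice, and the exact lines vary by version:

```
# ExecStart: (re)create the pod from the YAML, replacing an existing
# Podman container of the same name -- but, per this issue, not a
# bare storage-only container holding that name.
/usr/bin/podman kube play --replace services/authelia/service.yaml

# ExecStopPost: tear the pod down again. If the unit is killed
# before this completes, storage containers can be left behind.
/usr/bin/podman kube down services/authelia/service.yaml
```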

@Luap99 (Member) commented Sep 19, 2023

Note this isn't strictly related to kube; `podman run --replace` would be another case where this might be important.
With 4.6 I saw several instances of storage containers leaking, some of which couldn't be removed unless you manually mounted something in the storage dir (#19913 (comment))

Most likely this is the result of increased error reporting in c/storage: #18831 (comment)
I would expect the latest fix (not yet in v4.6) to help in many cases.

Regardless of the cause, I think removing the storage container by default makes sense; otherwise we could break all these non-interactive systemd units, which is very bad (and very much not self-healing if a user has to fix it by hand).
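A minimal sketch of that non-kube case, using a hypothetical container name `web` (this shows the failure mode described in this issue, before the fix it tracks):

```
# First run creates the container; suppose it later leaks a bare
# storage-only entry named "web" (e.g. the unit was killed
# mid-cleanup).
podman run -d --name web docker.io/library/nginx

# --replace removes an existing *Podman* container named "web",
# but a storage-only leftover with that name still fails with
# "the container name ... is already in use".
podman run -d --name web --replace docker.io/library/nginx
```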

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@poperigby

Bump

@rhatdan (Member) commented Oct 20, 2023

@umohnani8 PTAL

@rhatdan (Member) commented Oct 20, 2023

@vrothberg Did you ever fix this?

rhatdan added the kube label Oct 20, 2023
@vrothberg (Member, Author)

@rhatdan no, I only filed the issue.

@tmzullinger

> Note this isn't strictly related to kube; `podman run --replace` would be another case where this might be important. With 4.6 I saw several instances of storage containers leaking, some of which couldn't be removed unless you manually mounted something in the storage dir (#19913 (comment))
>
> Most likely this is the result of increased error reporting in c/storage: #18831 (comment) I would expect the latest fix (not yet in v4.6) to help in many cases.
>
> Regardless of the cause, I think removing the storage container by default makes sense; otherwise we could break all these non-interactive systemd units, which is very bad (and very much not self-healing if a user has to fix it by hand).

Agreed. I have systemd-generated units which call `podman pod start|stop` and which recently failed to start after a reboot; I had to manually `rm` the storage containers. There is no `--replace` option for `podman pod` AFAICT, so I'm not sure how that case should be fixed.
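One possible stopgap for the `podman pod` case, sketched with the container name from the original report as a stand-in: remove any leftover entry before starting the pod, so a stale storage container can't block the start.

```
# Hypothetical pre-start cleanup; --ignore makes a missing name a
# no-op. If run from a unit file as an ExecStartPre= line, a
# leading "-" keeps failures non-fatal. Depending on the Podman
# version, `podman rm --storage <id>` may be needed instead.
podman rm --force --ignore authelia-pod-authelia
```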

github-actions bot added the locked - please file new issue/PR label Jan 29, 2024
github-actions bot locked as resolved and limited conversation to collaborators Jan 29, 2024