Terminates under EKS with msg="killing slirp4netns" #328

Open
dg424 opened this issue Oct 4, 2022 · 2 comments
Labels
bug Something isn't working

Comments

dg424 commented Oct 4, 2022

We're using Rootless DinD running on EKS worker nodes. We're intermittently getting the following failure:

main.main.func2
	/tmp/tmp.ccni3BnQLU/pkg/mod/github.com/rootless-containers/[email protected]/cmd/rootlesskit/main.go:213
github.com/urfave/cli/v2.(*App).RunContext
	/tmp/tmp.ccni3BnQLU/pkg/mod/github.com/urfave/cli/[email protected]/app.go:322
github.com/urfave/cli/v2.(*App).Run
	/tmp/tmp.ccni3BnQLU/pkg/mod/github.com/urfave/cli/[email protected]/app.go:224
main.main
	/tmp/tmp.ccni3BnQLU/pkg/mod/github.com/rootless-containers/[email protected]/cmd/rootlesskit/main.go:222
runtime.main
	/usr/local/go/src/runtime/proc.go:250
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1571
time="2022-10-03T21:27:19Z" level=debug msg="killing slirp4netns"
time="2022-10-03T21:27:19Z" level=debug msg="killed slirp4netns: signal: killed"
[rootlesskit:parent] error: exit status 1
child exited
github.com/rootless-containers/rootlesskit/pkg/parent.Parent
	/tmp/tmp.ccni3BnQLU/pkg/mod/github.com/rootless-containers/[email protected]/pkg/parent/parent.go:275
main.main.func2
	/tmp/tmp.ccni3BnQLU/pkg/mod/github.com/rootless-containers/[email protected]/cmd/rootlesskit/main.go:220
github.com/urfave/cli/v2.(*App).RunContext
	/tmp/tmp.ccni3BnQLU/pkg/mod/github.com/urfave/cli/[email protected]/app.go:322
github.com/urfave/cli/v2.(*App).Run
	/tmp/tmp.ccni3BnQLU/pkg/mod/github.com/urfave/cli/[email protected]/app.go:224
main.main
	/tmp/tmp.ccni3BnQLU/pkg/mod/github.com/rootless-containers/[email protected]/cmd/rootlesskit/main.go:222
runtime.main
	/usr/local/go/src/runtime/proc.go:250
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1571

Note: We currently have a startupProbe (https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-startup-probes) to verify that the DinD container is up with a time limit of 30 seconds (which seems more than enough).

Any ideas or instructions for debugging this further?
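
For readers unfamiliar with startup probes, the 30-second limit mentioned above could be expressed roughly as follows. This is only a sketch: the reporter's actual manifest is not shown in the thread, so the container name, image, probe command, and the split between `periodSeconds` and `failureThreshold` are all assumptions chosen for illustration.

```yaml
# Hypothetical fragment of a Pod spec; none of these values come from the
# reporter's manifest.
containers:
  - name: dind                      # assumed container name
    image: docker:dind-rootless     # assumed image
    startupProbe:
      exec:
        # Placeholder check; the real probe command is not shown in the issue,
        # and the docker CLI may need DOCKER_HOST set for a rootless daemon.
        command: ["docker", "info"]
      periodSeconds: 5              # probe every 5 seconds
      failureThreshold: 6           # 6 failures x 5s = roughly 30s startup budget
      timeoutSeconds: 3             # each probe attempt may take up to 3s
```

The startup budget is approximately `failureThreshold` times `periodSeconds`, so raising either field lengthens the window before the kubelet kills and restarts the container.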

AkihiroSuda added the bug label Oct 5, 2022
AkihiroSuda (Member) commented

> a time limit of 30 seconds (which seems more than enough).

Does the error occur if you increase the limit?

> child exited

Do you see any log from the rootless dind daemon?

dg424 commented Oct 5, 2022

Hi @AkihiroSuda
Yes, it is still occurring after raising the limit from 5 seconds to 30 seconds, although less often than with 5 seconds. I think we might have to increase the timeout to 1 minute now, but I'm not sure whether it will still continue to happen.
The logs I pasted are what I get from k8s after the pod is terminated. Is there anything else I can do to find the root cause?
