gpu-manager run failed #168

Open
leofang94 opened this issue Aug 1, 2022 · 1 comment

Comments

@leofang94

I ran gpu-manager following the README and got these errors:
copy /usr/local/host/lib/libnvidia-opticalflow.so.1 to /usr/local/nvidia/lib
copy /usr/local/host/lib64/libnvidia-opticalflow.so to /usr/local/nvidia/lib64
copy /usr/local/host/lib64/libnvidia-opticalflow.so.470.82.01 to /usr/local/nvidia/lib64
copy /usr/local/host/lib64/libnvidia-opticalflow.so.1 to /usr/local/nvidia/lib64
copy /usr/local/host/bin/nvidia-cuda-mps-control to /usr/local/nvidia/bin/
copy /usr/local/host/bin/nvidia-cuda-mps-server to /usr/local/nvidia/bin/
copy /usr/local/host/bin/nvidia-debugdump to /usr/local/nvidia/bin/
copy /usr/local/host/bin/nvidia-persistenced to /usr/local/nvidia/bin/
copy /usr/local/host/bin/nvidia-smi to /usr/local/nvidia/bin/
rebuild ldcache
launch gpu manager
E0801 03:25:05.805557 412774 server.go:133] Unable to set Type=notify in systemd service file?
E0801 03:25:06.821877 412774 server.go:170] can't load container response data, &os.PathError{Op:"open", Path:"/var/lib/kubelet/device-plugins/kubelet_internal_checkpoint", Err:0x2}

System Env:
Kubernetes v1.21.5
kernel: 4.20.13-1.el7.elrepo.x86_64
OS: CentOS Linux release 7.5.1804 (Core)
docker: 19.03.15
runc:
Version: 1.0.2
GitCommit: v1.0.2-0-g52b36a2

How can I solve this, please?

@DennisYoung96

I had the same problem too.
So I created this file (/var/lib/kubelet/device-plugins/kubelet_internal_checkpoint) before starting gpu-manager.
It works now.
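
For anyone trying the same workaround, here is a rough sketch of it in Python. The `Err:0x2` in the log is errno 2 (ENOENT on Linux), meaning the checkpoint file did not exist when gpu-manager tried to open it. The helper name `ensure_file` is made up for illustration, and creating an *empty* file is an assumption: the comment above does not say what the file contained. The kubelet normally manages this file itself, so treat this as a stopgap and run it as a user that can write under /var/lib/kubelet.

```python
import os
from pathlib import Path

# Path copied from the error message in this issue.
CHECKPOINT = Path("/var/lib/kubelet/device-plugins/kubelet_internal_checkpoint")


def ensure_file(path: Path) -> bool:
    """Create an empty file at `path` if it is missing.

    Returns True if the file was created, False if it already existed.
    An existing file is never truncated or modified.
    """
    # Make sure the parent directory (device-plugins) exists first.
    path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # O_EXCL makes the call fail rather than overwrite an existing file.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return False
    os.close(fd)
    return True


# Hypothetical usage on the node, before launching gpu-manager:
#     ensure_file(CHECKPOINT)
```

Using an exclusive create means a checkpoint the kubelet has already written is left untouched, so the helper is safe to run on every start.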
