After deploying the gpumanager project, GPU memory control does not seem to work. I set the pod quotas as shown below, but the limits on GPU memory and computing power do not take effect. Stranger still, if I leave the same pod running for more than 60 minutes, the restrictions are likely to take effect. There are no obvious errors in the gpumanager logs.
resources:
  limits:
    tencent.com/vcuda-core: "50"
    tencent.com/vcuda-memory: "32"
  requests:
    tencent.com/vcuda-core: "50"
    tencent.com/vcuda-memory: "32"
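For anyone trying to reproduce this, it may help to spell out what these quotas should translate to before checking whether they are enforced. The sketch below assumes the unit conventions described in the gpu-manager README (one vcuda-memory unit is 256 MiB, and 100 vcuda-core units are one whole GPU). The helper names are hypothetical and not part of gpumanager. The observed value is whatever total GPU memory the workload reports from inside the container.

```python
# Assumption: unit sizes follow the gpu-manager README
# (1 vcuda-memory unit = 256 MiB, 100 vcuda-core units = 1 full GPU).
MIB_PER_VCUDA_MEMORY_UNIT = 256
VCUDA_CORE_UNITS_PER_GPU = 100


def vcuda_memory_to_mib(units: int) -> int:
    """Convert a tencent.com/vcuda-memory quota into MiB."""
    return units * MIB_PER_VCUDA_MEMORY_UNIT


def vcuda_core_to_gpu_fraction(units: int) -> float:
    """Convert a tencent.com/vcuda-core quota into a fraction of one GPU."""
    return units / VCUDA_CORE_UNITS_PER_GPU


def limit_looks_enforced(observed_total_mib: float, units: int,
                         tolerance: float = 0.05) -> bool:
    """Hypothetical check: is the memory total seen inside the container
    within `tolerance` (relative) of the requested quota?"""
    expected = vcuda_memory_to_mib(units)
    return abs(observed_total_mib - expected) <= expected * tolerance


if __name__ == "__main__":
    print(vcuda_memory_to_mib(32))         # 8192
    print(vcuda_core_to_gpu_fraction(50))  # 0.5
```

Under those assumptions, the quotas above ask for 8192 MiB of GPU memory and half of one GPU. If the container still sees the full memory of the physical card, the limit is not being applied.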
yangcheng-dev changed the title from "Gpumanager is unable to control computing power and GPU memory." to "Gpumanager is unable to control GPU threshold and GPU memory." on Jan 18, 2024
Just remove the cuGetProcessAddress implementation; it is what causes this problem.