Vram not clearing up and task stuck generating #179

Open
sentveremailbutemaildoesntrecognize2 opened this issue Nov 16, 2024 · 1 comment

Comments

@sentveremailbutemaildoesntrecognize2

Ubuntu 22.04.5
RX 5700 XT
I used this to install ComfyUI and generate a bunch of images, which worked out of the box, but occasionally the VRAM won't free up and the GPU stays under load as if it were still generating. I'm not tech savvy, so I tried using GPT to fix it, but I can't even manage to manually clear or reset the VRAM.
By the way, when running the script to build everything, it ran out of RAM about 3 times. (How do I run it to build only 1 specific package?)

When it gets 'stuck', I've tried:
ps aux | grep python
kill <PID>
This kills the command that is using the CPU at that moment, but I can still see that the VRAM isn't freed and the GPU temperature stays at its working level.
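
A leftover worker process can keep the GPU busy even after the visible python command has been killed. The following is a minimal sketch for finding such processes, assuming the ROCm stack on this machine exposes its compute device as /dev/kfd (the usual name, but not confirmed in this thread). The helper pids_using_device is hypothetical and not part of this project; it relies on the Linux /proc filesystem and on the -lname test of find.

```shell
# Hypothetical helper: print the PIDs of processes that hold the given
# device node open, by scanning the symlinks under /proc/<pid>/fd.
# Without root, only processes owned by the current user are visible.
pids_using_device() {
    local dev=$1
    find /proc/[0-9]*/fd -maxdepth 1 -lname "$dev" 2>/dev/null \
        | cut -d/ -f3 \
        | sort -un
}

# Assumption: /dev/kfd is the ROCm compute device on this system.
pids_using_device /dev/kfd
```

Any PID printed here that should no longer be running could then be stopped with kill, and with kill -9 only if it ignores the normal signal.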

I also tried running these commands:
python3
import torch
torch.cuda.empty_cache() (I think it was this one, not 100% sure right now)

What information can I provide to help find where the issue is?
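
One possible way to capture the state of the system while it is stuck is sketched below. It assumes rocm-smi is installed and on the PATH (it is skipped otherwise). The function name collect_gpu_report and the output file name gpu_report.txt are made up for illustration, and the maintainer may want different information.

```shell
# Hypothetical helper: write a small diagnostic report to the given file.
collect_gpu_report() {
    local out=$1
    {
        echo "== kernel =="
        uname -r
        echo "== python processes =="
        ps aux | grep -i '[p]ython' || true
        if command -v rocm-smi >/dev/null 2>&1; then
            echo "== VRAM usage =="
            rocm-smi --showmeminfo vram || true
            echo "== processes using the GPU =="
            rocm-smi --showpids || true
        else
            echo "rocm-smi not found in PATH"
        fi
        echo "== recent amdgpu kernel messages =="
        # dmesg may need root; errors are ignored here.
        dmesg 2>/dev/null | grep -i amdgpu | tail -n 50 || true
    } > "$out" 2>&1
}

collect_gpu_report gpu_report.txt
```

Running this once right after the GPU gets stuck, and once after a clean start, gives two reports that can be compared or attached to the issue.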

@lamikr
Owner

lamikr commented Nov 18, 2024

Thanks for the report. I am currently working on another gfx1010 issue and could try to look into this one as well.

  1. Would you be able to provide exact steps to trigger this? That would make it easier for me to investigate.

  2. About the out-of-memory errors during the build: how many CPU cores and how much memory does your system have?
    The process count used by each app during the build affects the amount of memory required, and I set default values for it in the file binfo/envsetup.sh. There I try to check the memory and CPU core count and then define 3 different environment variables that specify the CPU count used for building the applications. Most builds use the CPU count defined by the variable BUILD_CPU_COUNT_DEFAULT. In some binfo files where I have noticed the build to be more memory hungry, however, I override that and use the moderate or safe option.

BUILD_CPU_COUNT_DEFAULT
BUILD_CPU_COUNT_MODERATE
BUILD_CPU_COUNT_SAFE

If you copy the file envsetup_user_template.sh to envsetup_user.sh and then modify those variables in that file, you can override the values that envsetup.sh sets.
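
As an illustration of such an override, the sketch below shows values that could be placed in envsetup_user.sh. Only the three variable names come from the comment above. The helper clamp_jobs, the per-job memory budgets of 2, 4 and 8 GB, and the example machine size are assumptions made for illustration, and this is not the logic that binfo/envsetup.sh actually uses.

```shell
# Hypothetical helper: clamp_jobs CORES RAM_GB GB_PER_JOB
# Prints a parallel job count limited by both the core count and an
# assumed memory budget per job, and never lower than 1.
clamp_jobs() {
    local cores=$1 ram_gb=$2 gb_per_job=$3
    local jobs=$(( ram_gb / gb_per_job ))
    if [ "$jobs" -gt "$cores" ]; then jobs=$cores; fi
    if [ "$jobs" -lt 1 ]; then jobs=1; fi
    echo "$jobs"
}

# Example numbers for a machine with 16 cores and 16 GB of RAM.
cores=16
ram_gb=16

# The 2/4/8 GB-per-job budgets are assumptions, not project defaults.
BUILD_CPU_COUNT_DEFAULT=$(clamp_jobs "$cores" "$ram_gb" 2)
BUILD_CPU_COUNT_MODERATE=$(clamp_jobs "$cores" "$ram_gb" 4)
BUILD_CPU_COUNT_SAFE=$(clamp_jobs "$cores" "$ram_gb" 8)

# Whether export is required depends on how the build scripts read the
# file; it is harmless if they source it.
export BUILD_CPU_COUNT_DEFAULT BUILD_CPU_COUNT_MODERATE BUILD_CPU_COUNT_SAFE

echo "default=$BUILD_CPU_COUNT_DEFAULT moderate=$BUILD_CPU_COUNT_MODERATE safe=$BUILD_CPU_COUNT_SAFE"
```

If the build still runs out of memory with values like these, simply writing smaller fixed numbers (for example 2, 2 and 1) into envsetup_user.sh is the simpler option.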
