Uses 24 GB VRAM - what optimizations can be made? #104
Comments
I'm not a programmer, but I've experimented a bit with xformers and sequential CPU offload. Without xformers the demo didn't work at all on my card (RTX 4070). With xformers it works, but the pipeline allocates about 6 GB of shared memory, and generating an image at 1024 resolution takes about 60 sec. Sequential CPU offload gives me very small VRAM usage, but generating an image takes about 70 sec. https://huggingface.co/blog/simple_sdxl_optimizations
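For reference, a minimal sketch of the two options discussed above, using the standard diffusers switches (the InstantID demo wraps a diffusers SDXL pipeline, so these calls should carry over; the checkpoint name is a placeholder, and fp16 loading is shown since it also bears on the 16-bit question raised later in the thread):

```python
import torch
from diffusers import StableDiffusionXLPipeline

# fp16 weights halve memory versus fp32
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # placeholder; InstantID uses its own pipeline/checkpoint
    torch_dtype=torch.float16,
)

# Option 1: xformers memory-efficient attention; the model stays fully on the GPU
pipe.enable_xformers_memory_efficient_attention()
pipe.to("cuda")

# Option 2: sequential CPU offload; weights stay in system RAM and
# submodules are streamed to the GPU one at a time (lowest VRAM, slowest).
# Do not combine this with pipe.to("cuda").
# pipe.enable_sequential_cpu_offload()
```

Option 1 keeps everything resident on the GPU; option 2 trades speed for the very small VRAM footprint mentioned later in the thread.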
Sequential CPU offload is giving me this error; how did you fix it?
Without looking at your code, I can't tell you how to fix it. Just a note: I use Linux running on WSL 2, maybe that matters. I'll try to run it on Windows. I just checked; VRAM usage while generating an image is only about 1 GB with sequential CPU offload.
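For anyone who wants to reproduce such measurements, a small sketch using PyTorch's built-in peak-memory counter (`pipe` is assumed to be a pipeline configured as in the earlier sketch):

```python
import torch

# `pipe` is assumed to be set up as in the sketch above
torch.cuda.reset_peak_memory_stats()
image = pipe("a portrait photo").images[0]

# Peak CUDA memory held by tensors during the run
peak_gb = torch.cuda.max_memory_allocated() / 1024**3
print(f"peak VRAM: {peak_gb:.2f} GB")
```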
Yes, sequential CPU offload is amazing, but I am still getting the error.
Do you think that this can be added to https://github.com/ZHO-ZHO-ZHO/ComfyUI-InstantID/ so ComfyUI would run on a 4070 with InstantID? |
Sequential CPU offload moves the model out of VRAM, so for every generated image the weights have to be copied back to the GPU piece by piece. I don't know if this is a good solution. At the moment, the Gradio demo with xformers works best IMHO. I experimented with ZHO-ZHO-ZHO/ComfyUI-InstantID and xformers; it works on a 4070, but every second generation returns an OOM error.
It is not the same as running it on the CPU; the compute still happens on the GPU, the weights are just staged in from system RAM instead of being kept resident in VRAM.
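Note that diffusers offers two offload granularities, which matters for the reload-cost concern above; a sketch, again assuming a diffusers `pipe` as in the earlier examples:

```python
# Coarse offload: whole sub-models (text encoders, UNet, VAE) hop between
# system RAM and VRAM as each stage runs; nothing is reloaded from disk
# between images, and it is much faster than sequential offload.
pipe.enable_model_cpu_offload()

# Fine offload: individual submodules move one at a time; lowest VRAM
# (the ~1 GB reported above) but also the slowest.
# pipe.enable_sequential_cpu_offload()
```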
Is there some code we could add or change in their existing files that would allow us to use these changes in Comfy? It might not be the best solution, but I've got a 3060 12GB that can't run it, so anything is better than the current situation. |
This PR solves it for me: ZHO-ZHO-ZHO/ComfyUI-InstantID#87. Also, we should soon have the official support as well: cubiq/ComfyUI_IPAdapter_plus#242 (comment)
The official support has me very excited to try. As for the modifications you suggest in the other thread, exactly where were you making these changes and in which files? I'm not familiar at all with this code, but I'm not afraid to mess around in it given good directions. :) |
The modifications can be seen here per files per lines: https://github.com/ZHO-ZHO-ZHO/ComfyUI-InstantID/pull/87/files |
Well, we also fixed the CPU offload in a standalone Gradio app: a 1-click install that downloads the models, with on-the-fly model changing. VAE slicing and xformers are enabled too. Shared here so far: https://www.patreon.com/posts/1-click-very-gui-97769887
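The switches named here are one-liners in diffusers; a sketch, with `pipe` as in the earlier examples (`enable_vae_tiling` is a related option, not something the post claims to use):

```python
# Decode the latent batch one image at a time instead of all at once
pipe.enable_vae_slicing()

# Related option for very large resolutions: decode in tiles
# pipe.enable_vae_tiling()

# Memory-efficient attention, as in the sketches above
pipe.enable_xformers_memory_efficient_attention()
```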
Aww, behind a paywall :-( It looks neat. Well, at least with your GUI, are you able to use the normal XL safetensors from CivitAI? Or this odd format that Aitrepreneur has got us using: https://huggingface.co/stablediffusionapi
My GUI can use any CivitAI model; I made it support that. Put the model into the models folder and restart the app. Sorry for the late reply.
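The app itself is paywalled, but diffusers can load single-file CivitAI checkpoints directly, so the mechanism presumably looks something like this sketch (the path is hypothetical):

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Any single-file SDXL .safetensors checkpoint downloaded from CivitAI
pipe = StableDiffusionXLPipeline.from_single_file(
    "models/my_civitai_checkpoint.safetensors",  # hypothetical path
    torch_dtype=torch.float16,
)
pipe.to("cuda")
```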
Try this: https://github.com/cubiq/ComfyUI_InstantID - IMHO the best implementation so far. It runs on a 12 GB VRAM card without any problem and does not use all the VRAM.
I managed to run the code in ~30 seconds on an RTX 3060 (12 GB VRAM). Approach: I used WSL 2.
I have installed as described in the Gradio demo.
It works, but it literally fills the entire 24 GB of VRAM.
What VRAM optimizations can be made, such as loading the models in perhaps 16-bit, or 8-bit?
I am making a 1-click auto installer and an advanced Gradio app.
Below is the pipeline: