Replies: 3 comments
-
Are you trying ControlNet tile + Ultimate SD Upscale? Setup would be something like this:
Any luck using SDP (Scaled Dot Product) optimization instead of sub-quad attention? You can set it in the UI, or use command line of:
I've also noticed some issues that seem to cause crashes for me when upscaling if certain Textual Inversions are in the negative prompt. Baddream and Unrealistic Dream especially seem very hard on VRAM when doing Ultimate SD Upscale.
-
Yes, this is helpful! As long as Ultimate SD Upscale sticks to 512 tile sizes (which is fine even for arbitrarily large image sizes), ControlNet works great. I'm still figuring out the 'tile' control type, but I got good results with Canny + 0.4 denoise strength + upscale to add details to an image.
Using Scaled Dot Product: nope, this causes OOM on my system.
-
Did you try just adding upscaling and face restoration as postprocessing options? Hi-res fix never worked correctly, as you said. I was just upscaling in Extras after creating the image, and the face still wasn't very good unless I used roop or one of its derivatives to swap in some known face. Now I've enabled face restoration with CodeFormer (it wasn't working a few months ago) and also added upscaling in postprocessing. I create the image with two upscalers: the first is "TGHQFace8x_500k", mainly for the face, and the second is any other good upscaler you like, for example "4x_foolhardy_Remacri". "Upscaler 2 visibility" is at 0.5. This way, when I upscale to 4x, generation usually takes almost the same time as a standard generation, maybe a few seconds more. BUT beware: 8x almost doubles the generation time. I just tried a random prompt I found on the net. The results are somewhat NSFW-ish, but you can check. Here is a MediaFire folder with the PNGs: standard, 4x, 8x.
-
I'm out here on an 8 GB 5700 XT and it sucks. The lshqqytiger version works great for me at around 512x768, but anything above that tends to go OOM. I'm jealous of the people with newer hardware who can make nice crisp detailed images at high resolutions. I have found ways of getting to a high resolution, but they all have drawbacks.
So: Does anyone have a method they like for getting good results, beyond 1024x1024? I don't care if it's slow.
Methods I've tried:
Ultimate SD upscale -- this is the best option I've found. It works reliably, but it's not good for adding new detail to images. If denoising is too low, it just recreates the image exactly as it is. If denoising is too high, it produces a collage of unrelated tiles. Seems like it's only good for faithfully recreating images with a little extra sharpness. Maybe there's an upscaler that's better that I haven't tried? I'm mostly doing photoreal landscapes in testing, so I'm using ESRGAN_4x.
Tiled diffusion with multidiffusion -- bad. Unstable, and has a bug where it usually produces one solid gray tile. Quality of results also depends on getting exactly the right tile dimensions, overlap settings, etc., so there's a ton of trial and error to do for each image. Using tiled VAE as well can help avoid OOM, but it's incompatible with some of the command arguments I need to get this thing running. It tends to work for every single step and then go OOM right at the end, so this one is extra annoying.
Hires.Fix -- totally unusable on my system at all resolutions and scales.
Bringing a large image into img2img and manually inpainting small bits at a time -- this does technically work, but it's a lot of trial and error, and I have no good way of running img2img over the whole thing to smooth out the seams when it's done. It's possible to get nice big panoramic landscapes, for instance, but only by going over the image in Photoshop and painting over errors and seams.
SDXL -- no good. Even on lowvram, unusable beyond 768x768. I can only get like 10s/it at best, and the images tend to come out garbled because the resolution is lower than 1024x1024.
ControlNet -- could be useful in conjunction with one of the methods above, like using ControlNet + Ultimate SD upscale + high denoising to preserve overall composition while adding new details, but that's not viable because ControlNet also causes OOM in all cases except low resolution and lowvram.
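On the seam problem mentioned under manual inpainting: one common way to hide a seam between two pieces that share some overlap is a linear crossfade across the shared pixels. This is only a minimal sketch of that idea on a single row of grayscale values, not something taken from any of the extensions above; `feather_weights` and `blend_rows` are hypothetical helper names made up for illustration.

```python
def feather_weights(overlap: int) -> list[float]:
    """Weights for the right-hand piece across `overlap` shared pixels.

    The weights rise linearly from just above 0 to just below 1, so the
    left piece dominates at the start of the seam and the right at the end.
    """
    if overlap <= 0:
        raise ValueError("overlap must be positive")
    return [(i + 1) / (overlap + 1) for i in range(overlap)]


def blend_rows(left: list[float], right: list[float], overlap: int) -> list[float]:
    """Join two pixel rows whose last/first `overlap` pixels cover the same area."""
    if overlap > len(left) or overlap > len(right):
        raise ValueError("overlap is larger than one of the rows")
    weights = feather_weights(overlap)
    head = left[:-overlap]                 # only the left piece covers this part
    seam = [
        (1 - w) * left[len(left) - overlap + i] + w * right[i]
        for i, w in enumerate(weights)     # crossfade over the shared pixels
    ]
    tail = right[overlap:]                 # only the right piece covers this part
    return head + seam + tail


# Two flat rows of different brightness, sharing 2 pixels.
row = blend_rows([0, 0, 0, 0], [100, 100, 100, 100], 2)
print(row)
```

A real image would need the same weighting applied per channel and in both directions, and it only hides brightness or color steps. It will not fix tiles whose content disagrees.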
The arguments I've had best results with:
--medvram --no-half --precision full --always-batch-cond-uncond --opt-sub-quad-attention --sub-quad-q-chunk-size 512 --sub-quad-kv-chunk-size 512 --sub-quad-chunk-threshold 80 --disable-nan-check --upcast-sampling --autolaunch --use-cpu interrogate gfpgan scunet codeformer
(I copied these arguments from a discussion thread a long time ago and don't know what all of them do. At one point I definitely needed all of these just to get A1111 to launch, but maybe some of them are now deprecated?)
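One low-effort way to find out which of these arguments still matter is to drop them one at a time and relaunch. The sketch below just does the bookkeeping: it groups the argument string from above into flags and their values, then rebuilds the string with a single flag left out. `group_flags` and `without` are hypothetical helper names, and nothing here talks to the web UI itself. As an unverified starting point, the A1111 wiki (as I recall it) describes `--upcast-sampling` as having no effect when `--no-half` is set, so that pair may be worth testing first.

```python
import shlex

# The argument string exactly as posted above.
ARGS = (
    "--medvram --no-half --precision full --always-batch-cond-uncond "
    "--opt-sub-quad-attention --sub-quad-q-chunk-size 512 "
    "--sub-quad-kv-chunk-size 512 --sub-quad-chunk-threshold 80 "
    "--disable-nan-check --upcast-sampling --autolaunch "
    "--use-cpu interrogate gfpgan scunet codeformer"
)


def group_flags(arg_string: str) -> dict[str, list[str]]:
    """Map each --flag to the list of values that follow it (possibly empty)."""
    grouped: dict[str, list[str]] = {}
    current = None
    for token in shlex.split(arg_string):
        if token.startswith("--"):
            current = token
            grouped[current] = []
        elif current is not None:
            grouped[current].append(token)
    return grouped


def without(grouped: dict[str, list[str]], flag: str) -> str:
    """Rebuild the argument string with one flag (and its values) removed."""
    parts: list[str] = []
    for name, values in grouped.items():
        if name == flag:
            continue
        parts.append(name)
        parts.extend(values)
    return " ".join(parts)


flags = group_flags(ARGS)
for name in flags:
    # One candidate command line per flag, each missing exactly that flag.
    print(f"without {name}:\n  {without(flags, name)}\n")
```

Each printed line is a candidate value for the launch arguments. If the UI still starts and a 512x768 render still completes without that flag, it is probably safe to leave out on this setup.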
I also copied the setup from this comment and it didn't improve performance or avoid OOM, even though that person says they can run SDXL and Hires.fix. Bummer.
#223 (comment)
Methods I haven't tried yet:
I have not yet done exhaustive testing on trying the above options with the newer Negative Guidance Testing or Token Merging settings. I played with those a little and didn't see any difference, but maybe there are still fixes in those that I don't know about.
I have not tried the Latent Upscale extension yet.
I have not tried running this on Linux. Please do not make me install Linux; I do not have enough storage space to image my hard drive so I can make a partition safely.
I feel like there might be a way of sidestepping the problem by making a large image in Photoshop, cutting it into little pieces, using img2img + ControlNet to paint each piece separately, and then collaging the pieces back together. If it worked, I wouldn't mind doing it that way. But I haven't found a way of running individual tiles through img2img with any consistency.
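The "cut it into pieces" step at least is easy to make repeatable. Here is a minimal sketch that computes overlapping crop boxes for an image of a given size, assuming 512-pixel tiles and a 64-pixel overlap (both numbers are placeholders to tune, and `tile_starts` / `tile_boxes` are hypothetical names). It only does the geometry. Running each crop through img2img and pasting the results back is left out.

```python
def tile_starts(length: int, tile: int, overlap: int) -> list[int]:
    """Start offsets along one axis so tiles of size `tile` cover `length`."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    if length <= tile:
        return [0]                       # a single tile covers the whole axis
    step = tile - overlap
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)         # last tile sits flush with the edge
    return starts


def tile_boxes(width: int, height: int, tile: int = 512, overlap: int = 64):
    """Crop boxes as (left, upper, right, lower), row by row."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]


# Example: a 2048x1024 panorama.
boxes = tile_boxes(2048, 1024)
print(len(boxes), "tiles")
for box in boxes[:3]:
    print(box)
```

The boxes use the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so they could be fed to it directly. Every tile stays at or below 512x512, which is the size that reportedly fits in memory here, and neighbouring tiles always share at least the requested overlap, which leaves room for blending the seams afterwards.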
I hope this discussion can help someone who's working with similar constraints. The fact that Ultimate SD upscale is so robust on my system makes me believe that detailed HD images are possible if I can just figure out how to break down the task into small enough parts.
Thanks to the community here for getting me this far. If anybody just wants to commiserate about how frustrating it is to only sort of be able to run SD because you don't have a cutting edge GPU, that's fine too.