Suggestion: add new, less aggressive, sampler with ToMe #28
Hi, thanks for the suggestion! I actually tried something similar in the paper: it did improve the quantitative numbers somewhat, but I found the small gain in FID was not worth the headache it is to implement. We'd need to know the current diffusion step and the total number of diffusion steps inside the model, which is not really something we have access to (unless I was missing a simple way to query that information). Thus, I omitted it from this version of the code.

Now, it could be that the results are better when used on big images (I only tested 512x512 images for that experiment). So maybe that's something to test?

Another idea is to not apply the same amount of merging to every layer. For instance, there are 4 layers ToMe is applied to right now. What if we applied more on the first layers and less on the later layers?
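One way around the model not knowing the step would be to keep the schedule on the sampler side, where the step index is already known, and only derive the merge ratio there. This is a hypothetical sketch, not part of tomesd's API: `tome_ratio_for_step` and its default values are illustrative assumptions.

```python
def tome_ratio_for_step(step: int, total_steps: int,
                        start_ratio: float = 0.75,
                        end_ratio: float = 0.0) -> float:
    """Linearly anneal the ToMe merge ratio from start_ratio (first step)
    down to end_ratio (last step). The sampler loop, which knows `step`
    and `total_steps`, would call this before each denoising step."""
    if total_steps <= 1:
        return end_ratio
    t = step / (total_steps - 1)  # 0.0 at the first step, 1.0 at the last
    return start_ratio + (end_ratio - start_ratio) * t
```

The sampler would then re-apply (or update) the ToMe patch with this ratio before invoking the model at each step.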
Hi, thanks for the response! For nice pictures we need:
Second part, about steps: I will try to find more details about samplers and add minimal prompt examples.
Samplers: example of a patched sampler: AUTOMATIC1111/stable-diffusion-webui#8457. Repo with samplers: https://github.com/crowsonkb/k-diffusion (Katherine Crowson, @crowsonkb, the developer of the most famous diffusion samplers, including DPM++ SDE Karras). Maybe she has some suggestions (I'm not a Python dev; I just want to generate waifus).
And lastly, regarding "Another idea is to not apply the same amount of merging to every layer. For instance, there are 4 layers ToMe is applied to right now. What if we applied more on the first layers and less on the later layers?" Is this about playing with different max_downsample values?
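If the idea is per-layer merging rather than just `max_downsample`, one could imagine a decreasing ratio per ToMe layer. This is purely a sketch under the assumption that the patch could be extended to accept one ratio per layer (as far as I can tell, tomesd currently takes a single `ratio`); `per_layer_ratios` is a hypothetical helper:

```python
def per_layer_ratios(num_layers: int = 4,
                     first: float = 0.75,
                     last: float = 0.25) -> list:
    """Hypothetical helper: assign a higher merge ratio to earlier
    layers and a lower one to later layers, interpolating linearly."""
    if num_layers == 1:
        return [first]
    step = (first - last) / (num_layers - 1)
    return [first - i * step for i in range(num_layers)]
```

With the defaults this yields four decreasing ratios, matching the "more merging on the first layers, less on the later ones" suggestion.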
So: right now ToMe behaves somewhat aggressively, throwing out a significant portion of the tokens not directly related to the main composition.
This leads to the following disadvantages:
But there are advantages as well:
If you reduce the number of steps, we often lose the essence (because we try to generate everything at once). This is where ToMe can help us. In the early stages of generation, using ToMe will reduce the amount of noise and slightly increase speed. I assume this will lead to a more coherent, higher-quality composition overall: fewer tokens means better composition.
Let's say I generate with DPM++ SDE Karras with 30 steps. Right now ToMe is used throughout the whole generation process. My proposal is to apply it exponentially (if possible), for example,
at steps:
This will allow you to focus on the main part of the composition in the early stages, and increase the quality and amount of detail in the later stages.
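The exponential idea above could be sketched roughly as follows (the constants here are illustrative guesses, not tuned values, and `exp_tome_schedule` is a hypothetical name):

```python
import math

def exp_tome_schedule(total_steps: int = 30,
                      start_ratio: float = 0.75,
                      decay: float = 0.15) -> list:
    """Exponentially decaying merge ratio: aggressive merging in the
    early, composition-forming steps, almost none in the late,
    detail-refining steps."""
    return [start_ratio * math.exp(-decay * step) for step in range(total_steps)]
```

A sampler loop would then look up `schedule[step]` before each step and re-apply the patch with that ratio, so merging fades out as generation progresses.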
In my opinion, a cleaner approach would be to add ToMe as a new sampler, ideally DPM++ SDE Karras + ToMe (it is the slowest and highest-quality sampler, to my taste).
What do you think?