Suggestion: add new, less aggressive, sampler with ToMe #28

Open
recoilme opened this issue Apr 27, 2023 · 5 comments

@recoilme

recoilme commented Apr 27, 2023

So. Right now ToMe behaves somewhat aggressively, merging away a significant portion of the tokens that are not directly related to the main composition.
This leads to the following disadvantages:

But there are pluses as well:

  • increased generation speed
  • emphasis on the core of the composition

If we reduce the number of steps, we often lose the essence of the image (because we try to generate everything at once). This is where ToMe can help. In the early stages of generation, ToMe would reduce the amount of noise and slightly increase the speed. I assume this would lead to a more coherent, higher-quality composition overall: fewer tokens early on means better composition.

Let's say I generate with DPM++ SDE Karras at 30 steps. Right now ToMe is applied throughout the whole generation process. I propose decaying it over the steps (exponentially, if possible), for example,
at steps:

  • 1 - 5 - 0.75 compression
  • 6 - 10 - 0.5 compression
  • 11 - 15 - 0.25 compression
  • 16 - 30 - 0 compression

This will allow you to focus on the main part of the composition in the early stages, and increase the quality and amount of detail in the later stages.
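As a rough illustration of the staged schedule above, here is a minimal Python sketch that maps a 1-based sampler step to a compression ratio. The names `SCHEDULE` and `tome_ratio_for_step` are hypothetical and not part of any library; the stages are assumed to end at steps 5, 10, 15, and 30.

```python
from bisect import bisect_left

# Hypothetical schedule: (last step of the stage, compression ratio),
# taken from the example stages proposed in this comment.
SCHEDULE = [(5, 0.75), (10, 0.5), (15, 0.25), (30, 0.0)]


def tome_ratio_for_step(step, schedule=SCHEDULE):
    """Return the compression ratio for a 1-based sampler step."""
    if step < 1:
        raise ValueError("step must be >= 1")
    stage_ends = [end for end, _ in schedule]
    # Index of the first stage whose last step is >= the current step.
    index = bisect_left(stage_ends, step)
    if index == len(schedule):
        # Past the last listed stage: assume no compression.
        return 0.0
    return schedule[index][1]


if __name__ == "__main__":
    for step in (1, 5, 6, 10, 11, 15, 16, 30):
        print(step, tome_ratio_for_step(step))
```

How the ratio would actually be handed to ToMe at each step is left open here; that depends on whether the current step is available where the patch is applied.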

In my opinion, a more correct approach would be to add ToMe as a new sampler, ideally as DPM++ SDE Karras + ToMe (it is the slowest and highest-quality sampler, for my taste).

What do you think?

@recoilme recoilme changed the title Suggestion: Make ToMe optimization less aggressive Suggestion: Make ToMe optimization less aggressive, and add new sampler with ToMe Apr 27, 2023
@recoilme recoilme changed the title Suggestion: Make ToMe optimization less aggressive, and add new sampler with ToMe Suggestion: add new, less aggressive, sampler with ToMe Apr 27, 2023
@dbolya
Owner

dbolya commented Apr 27, 2023

Hi, thanks for the suggestion!

I actually tried something similar in the paper:
[image]

It did improve the quantitative numbers somewhat, but I found the small gain in FID was not worth the headache it is to implement. We'd need to know the current diffusion step and the total number of diffusion steps inside the model, which is not really something we have access to (unless I was missing a simple way to query that information). Thus, I omitted it from this version of the code.
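One possible workaround, sketched here purely as an assumption and not something tested in this thread: the model already receives a timestep on every call, so rough progress could be inferred from it instead of from the sampler's step counter. The function names, the 1000-timestep training range, and the linear decay with a cutoff are all hypothetical choices for illustration.

```python
def progress_from_timestep(timestep, num_train_timesteps=1000):
    """Map a timestep to progress in [0, 1].

    Assumes num_train_timesteps - 1 is the noisiest timestep and 0 the cleanest.
    """
    last = num_train_timesteps - 1
    clamped = min(max(float(timestep), 0.0), float(last))
    return 1.0 - clamped / last


def ratio_from_progress(progress, max_ratio=0.75, cutoff=0.5):
    """Decay the merge ratio linearly from max_ratio to 0 at `cutoff` progress."""
    if progress >= cutoff:
        return 0.0
    return max_ratio * (1.0 - progress / cutoff)


if __name__ == "__main__":
    for t in (999, 750, 500, 250, 0):
        p = progress_from_timestep(t)
        print(t, round(p, 3), round(ratio_from_progress(p), 3))
```

This only approximates the sampler's step index, since samplers space their timesteps unevenly, but it avoids needing the total number of steps inside the model.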

Now, it could be that the results are better when used on big images (I only tested 512x512 images for that experiment). So maybe that's something to test?

Another idea is to not apply the same amount of merging to every layer. For instance, there are 4 layers ToMe is applied to right now. What if we applied more on the first layers and less on the later layers?
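To make the per-layer idea concrete, here is a small sketch that produces a decreasing ratio for each of the layers ToMe is applied to. The function name `layer_ratios` and the 0.6 and 0.2 endpoints are made-up illustration values, and how a per-layer ratio would be wired into the patched model is not shown.

```python
def layer_ratios(num_layers=4, first_ratio=0.6, last_ratio=0.2):
    """Linearly interpolate a merge ratio per layer, from first to last layer."""
    if num_layers < 1:
        raise ValueError("num_layers must be >= 1")
    if num_layers == 1:
        return [first_ratio]
    delta = (last_ratio - first_ratio) / (num_layers - 1)
    return [first_ratio + index * delta for index in range(num_layers)]


if __name__ == "__main__":
    # More merging on the first layers, less on the later ones.
    for index, ratio in enumerate(layer_ratios()):
        print(f"layer {index}: ratio {ratio:.3f}")
```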

@recoilme
Author

Hi, thanks for the response!
OK, first of all we need good samples for testing.

For nice pictures we need:

  • two good models of different types (photorealistic, illustration)
  • the right prompts (with shadows, details and so on)
  • dimensions (SD was trained on 512x512, so we need a little more; 640x640 will be enough)
  • sampler (DPM++ SDE Karras, 20 steps)

Second part, about steps.
I suggest trying to do the ToMe optimization inside the sampler (the sampler knows the current step, I think), for example in DPM++ SDE Karras. Maybe try creating a new sampler with ToMe?

I will try to find more details about samplers and add minimal prompt examples.

@recoilme
Author

recoilme commented Apr 28, 2023

Prompts examples.

Minimal negative prompt is: (low quality:1.4), (worst quality:1.4)

Positive prompts must contain something like this:

highres, masterpiece, perfect lighting, bloom, cinematic lighting, <SOME CONTENT>,(masterpiece:1.3), (best_quality:1.3), (ultra_detailed:1.3), 8k, extremely_clear, realism, (ultrarealistic:1.3)

SOME CONTENT example: lion, whale, seashell, coral, clownfish, octopus

Example (lion):
highres, masterpiece, perfect lighting, bloom, cinematic lighting, lion ,(masterpiece:1.3), (best_quality:1.3), (ultra_detailed:1.3), 8k, extremely_clear, realism
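For repeatable tests, the template above can be filled in programmatically. This is only a convenience sketch: `build_prompt` and the constant names are hypothetical, and the tag strings are copied from the template in this comment.

```python
# Tag strings copied from the positive-prompt template and the minimal negative prompt.
PREFIX = "highres, masterpiece, perfect lighting, bloom, cinematic lighting"
SUFFIX = (
    "(masterpiece:1.3), (best_quality:1.3), (ultra_detailed:1.3), "
    "8k, extremely_clear, realism, (ultrarealistic:1.3)"
)
NEGATIVE_PROMPT = "(low quality:1.4), (worst quality:1.4)"


def build_prompt(content):
    """Insert the subject (e.g. 'lion') between the fixed quality tags."""
    content = content.strip()
    if not content:
        raise ValueError("content must not be empty")
    return f"{PREFIX}, {content}, {SUFFIX}"


if __name__ == "__main__":
    for subject in ("lion", "whale", "seashell", "coral", "clownfish", "octopus"):
        print(build_prompt(subject))
    print("negative:", NEGATIVE_PROMPT)
```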

Animatrix image (illustration model):
[image: 00013-0]

Colorful image (photorealism):
[image: 00014-0]

Steps - around 25

@recoilme
Author

Samplers

Example of patched sampler: AUTOMATIC1111/stable-diffusion-webui#8457

GitHub repo with samplers: https://github.com/crowsonkb/k-diffusion (Katherine Crowson, @crowsonkb, developer of the most famous diffusion samplers, including DPM++ SDE Karras)

Maybe she has some suggestions (I'm not a Python dev, I just want to generate waifus).

@recoilme
Author

And lastly, about "Another idea is to not apply the same amount of merging to every layer. For instance, there are 4 layers ToMe is applied to right now. What if we applied more on the first layers and less on the later layers?"

Is this about experimenting with different max_downsample values?
