Can we get more schedulers for flow based models such as SD3, SD3.5, and flux #9924
Comments
I'm glad this shortcoming has been brought to light. With that in mind, I have been testing a new diffusers scheduler adapted from the current diffusers schedulers library (scheduling_dpmsolver_multistep.py) along with this: https://github.com/crowsonkb/k-diffusion/blob/master/k_diffusion/sampling.py Note: I have patched my local diffusers install with the updated scheduler and import it normally for my testing. Create the pipeline as you normally would, then attach the scheduler with the required parameters: SD3: FLUX: |
Judging from the import statements in the txt file, the classes already live inside the diffusers package for your test runs. Would it be suitable to include the file in the official diffusers repo? |
Some of you can update your diffusers library to incorporate the scheduler I proposed (which is a first step) for independent testing, as I have. The scheduler as written can be incorporated into the diffusers library without change and will work with the current FLUX and SD3 pipelines. |
I have a pending PR which has not been merged yet. Once it merges, I can work on sending another PR related to schedulers. In the meantime, anyone else is welcome to raise a pull request. |
also related #9607 |
@ukaprch why not create a pr to include this in diffusers? anyhow, here are some sample grids (including original euler and heun flowmatch for reference). auraflow: also tried with this model since it's also flow-match based, and it works |
bit more testing: karras and exponential are working fine for flux, but not for sd35.
actual params used:
|
OK, I noticed you are using shift: 1. This should be shift: 3, which is the standard for SD3. I'll look into it some more. After running my test, the only noticeable thing I see is a more pronounced 'bokeh' effect, which I wouldn't necessarily characterize as blur per se. I do think the sigmas / timesteps for karras, exponential and lambdas for SD 3.5 could use some further refinement.
image = pipe(
scheduler config (FrozenDict): OrderedDict([('num_train_timesteps', 1000), ('solver_order', 2), ('thresholding', False), ('dynamic_thresholding_ratio', 0.995), ('sample_max_value', 1.0), ('algorithm_type', 'dpmsolver++2Msde'), ('solver_type', 'midpoint'), ('sigma_schedule', None), ('shift', 3.0), ('midpoint_ratio', 0.5), ('s_noise', 1), ('use_noise_sampler', True), ('use_SD35_sigmas', True), ('use_dynamic_shifting', False), ('base_shift', 0.5), ('max_shift', 1.15), ('base_image_seq_len', 256), ('max_image_seq_len', 4096), ('_use_default_values', ['dynamic_thresholding_ratio', 'base_image_seq_len', 'midpoint_ratio', 'solver_type', 'thresholding', 'base_shift', 'max_image_seq_len', 'sample_max_value'])])
algorithm_type (str): 'dpmsolver++2Msde' |
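For readers following the karras / exponential discussion, here is a minimal plain-Python sketch of those two sigma spacings, modeled on `get_sigmas_karras` and `get_sigmas_exponential` in the k-diffusion file linked earlier in the thread. The function names, the omission of the trailing zero, and the `sigma_min` / `sigma_max` values are assumptions for illustration; this is not the code of the proposed scheduler.

```python
import math

def karras_sigmas(n, sigma_min, sigma_max, rho=7.0):
    """Karras-style spacing: interpolate linearly in sigma**(1/rho). Requires n >= 2."""
    min_inv_rho = sigma_min ** (1.0 / rho)
    max_inv_rho = sigma_max ** (1.0 / rho)
    ramp = [i / (n - 1) for i in range(n)]
    return [(max_inv_rho + r * (min_inv_rho - max_inv_rho)) ** rho for r in ramp]

def exponential_sigmas(n, sigma_min, sigma_max):
    """Exponential spacing: interpolate linearly in log(sigma). Requires n >= 2."""
    log_max, log_min = math.log(sigma_max), math.log(sigma_min)
    return [math.exp(log_max + (log_min - log_max) * i / (n - 1)) for i in range(n)]

# Illustrative range only (assumed): flow-match sigmas live in (0, 1].
for name, fn in (("karras", karras_sigmas), ("exponential", exponential_sigmas)):
    print(name, [round(s, 4) for s in fn(8, 0.001, 1.0)])
```

Both spacings start at `sigma_max` and decrease monotonically to `sigma_min`; they differ in how quickly they leave the high-noise region, which is one place where per-model refinement could matter.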
I found with FLUX that you generally need 25 or more steps to get good images. Remember, we're not using the 3 step method as previously with SDXL. Your images look good BTW. It really comes into its own between 35 and 50 steps. BTW, if you increase the shift factor from 3 to 3.5 you get a bit more contrast in the image, which may or may not be to everyone's liking. |
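To make the effect of the shift factor concrete, here is a small sketch of the static shift that diffusers' `FlowMatchEulerDiscreteScheduler` applies to its sigmas, `sigma' = shift * sigma / (1 + (shift - 1) * sigma)`. The helper name `shift_sigma` and the sample sigma values are made up for illustration.

```python
def shift_sigma(sigma, shift):
    """Static flow-match shift: leaves 0 and 1 fixed, moves everything between."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)

# Hypothetical unshifted sigmas, from pure noise (1.0) downward.
sigmas = [1.0, 0.75, 0.5, 0.25]
for shift in (1.0, 3.0, 3.5):
    print(shift, [round(shift_sigma(s, shift), 3) for s in sigmas])
```

With `shift = 1` the schedule is unchanged; any `shift > 1` pushes the intermediate sigmas toward 1, so more of the step budget is spent at high noise levels.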
actually, the worst-case scenario here is sd35-medium ;) re: shift - i've noticed. i have it as user-configurable item
definitely! from a quick look at the code, 99% is clean. i wish if |
vladmandic: As noted, this scheduler is not designed for flow match derived sampling. I've attached a new FlowMatch version of the original: scheduling_dpmsolver_multistep.py (scheduling_flow_match_dpmsolver_multistep_orig.py)
scheduling_flow_match_dpmsolver_multistep.txt Please do some more testing on both and state any concerns, etc. I think we are very close on this and the community would definitely benefit from both. Example images using same generation data. Left is new proposed version of scheduling_flow_match_dpmsolver_multistep.py and right is current scheduling_flow_match_euler_discrete.py both used 25 steps: |
Let me know if it is OK for me to send out a PR that includes the script you offered. If you would rather do it yourself, I will not send one. |
Please do go ahead and send out the PR. I believe we're ready to go. As I said previously, the folks in charge can go over them and make any adjustments they feel are necessary, but I do think we are ready to proceed. Thanks again. |
@ukaprch i agree with all of your comments, but i feel like something went wrong in the last version? generated using sd35-medium with 50 steps and example config i'm using: {'num_train_timesteps': 1000, 'beta_start': 0.0001, 'beta_end': 0.02, 'beta_schedule': 'linear', 'shift': 3, 'use_dynamic_shifting': False, 'solver_order': 2, 'sigma_schedule': 'lambdas', 'use_beta_sigmas': True, 'algorithm_type': 'dpmsolver2', 'use_noise_sampler': True} |
also, with flux and sigma_method=betas, i'm getting index-out-of-range at
|
sigma_next = self.sigmas[self.step_index + 1]
I was unable to replicate your error (Flux using "beta" sigma_schedule) using this config with scheduling_flow_match_dpmsolver_multistep.py:
config (FrozenDict): OrderedDict([('num_train_timesteps', 1000), ('beta_start', 0.0001), ('beta_end', 0.02), ('beta_schedule', 'linear'), ('trained_betas', None), ('solver_order', 2), ('algorithm_type', 'dpmsolver2'), ('solver_type', 'midpoint'), ('sigma_schedule', 'betas'), ('shift', 3.0), ('midpoint_ratio', 0.5), ('s_noise', 1), ('use_noise_sampler', True), ('use_beta_sigmas', True), ('use_dynamic_shifting', True), ('base_shift', 0.5), ('max_shift', 1.15), ('base_image_seq_len', 256), ('max_image_seq_len', 4096), ('_use_default_values', ['trained_betas', 'solver_type', 'solver_order'])])
Also, just so we're on the same page, there may be some confusion on this: sigma_schedule = "betas" is not the same thing as self.config.sigma_schedule == "betas"
self.config.sigma_schedule == None
What did I miss? |
i know - use_beta_sigmas is to make sd35 happy. sigma_schedule=betas is new sigma method that you didn't have before. |
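The thread does not establish what caused the index-out-of-range report. As a hedged illustration only: one way a lookup like `self.sigmas[self.step_index + 1]` can fail is when a sigma method returns exactly one value per inference step and no terminal sigma is appended. The helpers below are hypothetical and are not taken from the attached scheduler.

```python
def build_sigmas(schedule, append_terminal=True):
    """Return the list a solver indexes with step_index and step_index + 1."""
    sigmas = list(schedule)
    if append_terminal:
        sigmas.append(0.0)  # terminal sigma so the last step has a target
    return sigmas

def run_steps(sigmas, num_steps):
    """Walk the schedule the way the quoted line does; returns (sigma, sigma_next) pairs."""
    pairs = []
    for step_index in range(num_steps):
        sigma = sigmas[step_index]
        sigma_next = sigmas[step_index + 1]  # IndexError on the last step if no terminal entry
        pairs.append((sigma, sigma_next))
    return pairs

schedule = [1.0, 0.5, 0.25]  # hypothetical 3-step schedule
print(run_steps(build_sigmas(schedule), num_steps=3))
```

If every sigma method returns `num_inference_steps + 1` entries, the last step always has a `sigma_next`; checking that length for each `sigma_schedule` option would be a quick way to rule this cause in or out.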
Yes. The degradation problem is most probably due to the way FlowMatch approaches the problem. The sigmas / timesteps are vastly compressed under FlowMatch, which negates their usability. More steps alleviate the problem, at a cost. This cannot be stressed enough for new and existing users of these types of schedulers. I'm very happy with most of the results I've achieved using them. As for SD3.5 Large, it resembles SDXL more than it does FLUX. Not sure why they chose three text encoders. Without the beta sigmas it would be far worse. I've also played around with dynamic shifting with SD3.5 Large. It requires you to add the required functionality to the pipeline to make it work. Something to think about, I guess. |
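For context on what "dynamic shifting" would have to add to a pipeline, here is a sketch modeled on the `calculate_shift` helper in diffusers' Flux pipeline and the `time_shift` method of `FlowMatchEulerDiscreteScheduler`, as best I recall them. The default values are taken from the scheduler configs posted in this thread (`base_shift` 0.5, `max_shift` 1.15, `base_image_seq_len` 256, `max_image_seq_len` 4096); treat the exact signatures as assumptions.

```python
import math

def calculate_shift(image_seq_len, base_seq_len=256, max_seq_len=4096,
                    base_shift=0.5, max_shift=1.15):
    """Linearly interpolate mu between base_shift and max_shift by token count."""
    m = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    b = base_shift - m * base_seq_len
    return image_seq_len * m + b

def time_shift(mu, sigma, t):
    """Resolution-dependent shift applied to a sigma/timestep t in (0, 1]."""
    return math.exp(mu) / (math.exp(mu) + (1.0 / t - 1.0) ** sigma)

mu = calculate_shift(4096)  # e.g. a 1024x1024 image, assuming 16x16 pixels per token
print(round(mu, 3), [round(time_shift(mu, 1.0, t), 3) for t in (1.0, 0.75, 0.5, 0.25)])
```

With `sigma = 1.0`, `time_shift` is algebraically the same as a static shift `s * t / (1 + (s - 1) * t)` with `s = exp(mu)`, so dynamic shifting amounts to choosing the shift factor from the image size instead of fixing it at 3.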
I believe you have studied this extensively. Can we add some recommended usage patterns to the code? They could be comments at the top of those schedulers, for example: """ scheduler = .... (such as with what beta setting, etc) """ |
After the most recommended usage pattern is added, I plan to send out a PR. |
@ukaprch i agree with your thoughts on sd35, especially on their choice of both clip-l and clip-g for no reason. |
Can you provide me all the parameters you used for the above images so I can see? |
sd35-medium, steps=50, nothing out of ordinary with the rest of generate params.
|
I needed to set up my environment for sd35 Medium, which I had never tested. Having done that, using your same parameters I ran both sd35 Medium and Large for 30 and 40 steps respectively and encountered no problems: Did something change in your environment when you updated? Did any code perhaps get misaligned in upgrading, so that it no longer runs the same? As an aside, those using sd35 should be aware that these models will not run well using FLUX aspect ratios and sizes. Case in point, I ran sd35 Large using aspect ratio 2:3 with size 1152 x 1728. As you can see in the image below, both the top and bottom of the image contain artifacts reflecting problems in generating an image of this size. I also ran into the same problem with aspect ratio 1:1 at size 1408 x 1408, which FLUX can easily handle. |
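A rough way to compare the sizes mentioned above is to count the latent tokens the transformer sees. The sketch assumes an 8x VAE downsampling factor and 2x2 latent patches, which I believe holds for both SD3.x and FLUX but is not stated in the thread; the helper name is hypothetical.

```python
def image_seq_len(width, height, vae_scale=8, patch_size=2):
    """Number of latent patches (tokens) for one image under the assumed factors."""
    pixels_per_token = vae_scale * patch_size
    return (width // pixels_per_token) * (height // pixels_per_token)

for w, h in [(1024, 1024), (1152, 1728), (1408, 1408)]:
    print(f"{w}x{h}: {image_seq_len(w, h)} tokens")
```

Under these assumptions 1024 x 1024 gives 4096 tokens, matching the `max_image_seq_len` in the configs posted earlier, while 1152 x 1728 and 1408 x 1408 both land near 7.7k. Whether that gap explains the artifacts is not established here; it is only a convenient number to check when picking sizes.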
It seems advanced schedulers such as DDIM and DPM++ 2M do work with flow-based models such as SD3, SD3.5, and FLUX.
However, I only see 2 flow-based schedulers in the diffusers codebase:
FlowMatchEulerDiscreteScheduler, and
FlowMatchHeunDiscreteScheduler
I tried to use DPMSolverMultistepScheduler, but it does not generate correct images with flow based models. Help?
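As background for the question, here is a toy sketch of the update that the flow-match Euler scheduler performs, assuming the usual flow-matching convention `x_t = (1 - sigma) * x0 + sigma * noise` with the model predicting the velocity `noise - x0`. A scheduler that interprets the model output as epsilon under a beta schedule works from different assumptions, which is a plausible (not confirmed) reason for the incorrect images. The function name and toy values are illustrative.

```python
def flow_match_euler_step(sample, velocity, sigma, sigma_next):
    """One Euler step of the flow ODE: x_next = x + (sigma_next - sigma) * v."""
    return [x + (sigma_next - sigma) * v for x, v in zip(sample, velocity)]

# Toy check with a perfect velocity predictor.
x0 = [0.2, -0.4]                               # "clean" target
noise = [1.0, 0.5]                             # starting point at sigma = 1
velocity = [n - c for n, c in zip(noise, x0)]  # v = noise - x0 (constant along the path)
sample = noise[:]
sigmas = [1.0, 0.75, 0.5, 0.25, 0.0]
for sigma, sigma_next in zip(sigmas, sigmas[1:]):
    sample = flow_match_euler_step(sample, velocity, sigma, sigma_next)
print(sample)  # approximately equal to x0
```

Because the ideal velocity is constant along the straight path from noise to data, Euler steps recover `x0` up to floating-point error in this toy case; higher-order solvers only pay off once the learned velocity varies between steps.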