[Sana] Add Sana, including `SanaPipeline`, `SanaPAGPipeline`, `LinearAttentionProcessor`, Flow-based DPM-solver and so on.
#9982
# Conflicts: # src/diffusers/models/normalization.py
2. make style and make quality;
1. Integrate flow-dpm-solver into diffusers; 2. finally run successfully on both `FlowMatchEulerDiscreteScheduler` and `FlowDPMSolverMultistepScheduler`;
1. add SanaPAGPipeline & several related Sana linear attention operators; 2. `SanaTransformer2DModel` not supports multi-resolution input; 3. fix the multi-scale HW bugs in SanaPipeline and SanaPAGPipeline; 4. fix the flow-dpm-solver set_timestep() init `model_output` and `lower_order_nums` bugs;
# Conflicts: # src/diffusers/models/__init__.py # src/diffusers/models/attention_processor.py
Still need to update the DC-AE related code and checkpoint after the DC-AE PR #9708 is available.
so you're licensing this code to fit into the Diffusers project? because the original Sana codebase is non-commercial. why is that NC but this is being opened as Apache 2.0?
@@ -0,0 +1,69 @@
import torch
is this file meant to be included?
"DCAE",
"DCAE_HF",
are these the final class names? i would have thought they'd be a bit longer and more descriptive like AutoencoderDC to keep in convention with the AutoencoderKL naming
I will be taking over on this PR to make the relevant changes for full integration soon, so will address this then :) LMK if there's anything particular that you'd like to see
What does this PR do?
This PR adds the official Sana (SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer) to the diffusers library. First, Sana makes text-to-image generation available in a 32x compressed latent space, powered by DC-AE (https://arxiv.org/abs/2410.10733v1), without performance degradation. Sana also includes several popular efficiency-related techniques, such as a DiT with a linear attention processor, and it uses a decoder-only LLM (Gemma-2B-IT) for low GPU requirements and fast speed.
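To give a feel for what a 32x compressed latent space means for the transformer, here is a small arithmetic sketch. The helper `latent_tokens` is hypothetical and not part of this PR. The comparison point (an 8x autoencoder with 2x2 patches) and the patch size of 1 for the 32x case are assumptions chosen for illustration.

```python
def latent_tokens(height: int, width: int, compression: int, patch_size: int = 1) -> int:
    """Count transformer tokens for an image after autoencoder compression.

    Hypothetical helper for illustration only; not an API from diffusers.
    """
    stride = compression * patch_size
    if height % stride or width % stride:
        raise ValueError("image size must be divisible by compression * patch_size")
    latent_h = height // compression
    latent_w = width // compression
    return (latent_h // patch_size) * (latent_w // patch_size)


# Assumed setup: 32x compression with patch size 1 gives a 32 x 32 token grid.
tokens_32x = latent_tokens(1024, 1024, compression=32, patch_size=1)

# Assumed comparison: 8x compression with 2x2 patches gives a 64 x 64 token grid.
tokens_8x = latent_tokens(1024, 1024, compression=8, patch_size=2)

print(tokens_32x, tokens_8x, tokens_8x // tokens_32x)
```

Under these assumptions, a 1024x1024 image yields 4 times fewer tokens with the 32x autoencoder, which is where much of the efficiency gain would come from.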
Paper: https://arxiv.org/abs/2410.10629
Original code repo: https://github.com/NVlabs/Sana
Project: https://nvlabs.github.io/Sana
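The linear attention mentioned in the description can be sketched in plain Python as follows. This is a minimal illustration, assuming a ReLU feature map; it is not the `LinearAttentionProcessor` added by this PR, and the function names are made up for the example. The key point is that the `d x d_v` summary `kv` is computed once, so the cost grows linearly with the number of tokens instead of quadratically.

```python
def relu(x: float) -> float:
    return x if x > 0.0 else 0.0


def linear_attention(q, k, v, eps=1e-6):
    """ReLU linear attention over lists of vectors (illustrative sketch only)."""
    n, d, dv = len(q), len(q[0]), len(v[0])
    phi_q = [[relu(x) for x in row] for row in q]
    phi_k = [[relu(x) for x in row] for row in k]
    # kv[a][b] = sum_j phi_k[j][a] * v[j][b]; its size does not depend on n.
    kv = [[sum(phi_k[j][a] * v[j][b] for j in range(n)) for b in range(dv)]
          for a in range(d)]
    # k_sum[a] = sum_j phi_k[j][a], used for the normalizer.
    k_sum = [sum(phi_k[j][a] for j in range(n)) for a in range(d)]
    out = []
    for i in range(n):
        denom = sum(phi_q[i][a] * k_sum[a] for a in range(d)) + eps
        out.append([sum(phi_q[i][a] * kv[a][b] for a in range(d)) / denom
                    for b in range(dv)])
    return out


def quadratic_reference(q, k, v, eps=1e-6):
    """Same math, but materializes all n x n similarity weights."""
    n, d, dv = len(q), len(q[0]), len(v[0])
    out = []
    for i in range(n):
        weights = [sum(relu(q[i][a]) * relu(k[j][a]) for a in range(d))
                   for j in range(n)]
        denom = sum(weights) + eps
        out.append([sum(weights[j] * v[j][b] for j in range(n)) / denom
                    for b in range(dv)])
    return out


q = [[1.0, -0.5], [0.2, 0.3], [0.0, 1.0]]
k = [[0.5, 0.1], [-1.0, 2.0], [0.3, 0.3]]
v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
fast = linear_attention(q, k, v)
slow = quadratic_reference(q, k, v)
```

Both functions return the same values up to floating-point rounding; only the order of the multiplications differs.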
Core contributor of DC-AE:
work with @[email protected]
Core library:
We want to collaborate on this PR together with friends from HF. Feel free to contact me here. Cc: @sayakpaul, @yiyixuxu
Images are generated by `SanaPAGPipeline` with `FlowDPMSolverMultistepScheduler`.
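For readers unfamiliar with flow-based samplers, the sketch below shows the basic first-order (Euler) update that a flow-matching scheduler performs. It is a toy under stated assumptions, not the code of `FlowMatchEulerDiscreteScheduler` or `FlowDPMSolverMultistepScheduler`: the function `euler_flow_sample`, the uniform sigma schedule, and the `toy_velocity` model are all hypothetical.

```python
def euler_flow_sample(noise, velocity_fn, num_steps=10):
    """Integrate dx/dsigma = v(x, sigma) from sigma=1 (noise) to sigma=0 (data)."""
    # Assumed uniform schedule from 1.0 down to 0.0.
    sigmas = [1.0 - i / num_steps for i in range(num_steps + 1)]
    x = list(noise)
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        v = velocity_fn(x, sigma)
        # Euler step: move along the predicted velocity by the sigma increment.
        x = [xi + (sigma_next - sigma) * vi for xi, vi in zip(x, v)]
    return x


target = [0.5, -1.0, 2.0]
noise = [1.5, 0.3, -0.7]


def toy_velocity(x, sigma):
    # On the straight path x_sigma = (1 - sigma) * target + sigma * noise,
    # the velocity is the constant noise - target. A real model would predict it.
    return [n - t for n, t in zip(noise, target)]


sample = euler_flow_sample(noise, toy_velocity, num_steps=4)
```

With the exact constant velocity, the Euler steps land on `target` for any number of steps. A multistep solver instead reuses earlier model outputs to reduce error when the predicted velocity changes between steps.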