[Sana] Add Sana, including SanaPipeline, SanaPAGPipeline, LinearAttentionProcessor, Flow-based DPM-solver and so on. #9982

Open
wants to merge 34 commits into
base: main

lawrence-cj
Contributor

What does this PR do?

This PR adds the official Sana (SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer) to the diffusers library. Sana is the first to make text-to-image generation available on a 32x compressed latent space, powered by DC-AE (https://arxiv.org/abs/2410.10733v1), without performance degradation. Sana also includes several popular efficiency-related techniques, such as a DiT with a linear attention processor, and it uses a decoder-only LLM (Gemma-2B-IT) for low GPU requirements and fast speed.
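To make the "32x compressed latent space" claim concrete, here is a rough sketch of how the downsampling factor changes the number of tokens the transformer has to attend over. The helper names are hypothetical, and the 8x autoencoder with patch size 2 used for comparison is an assumption (a common latent-diffusion setup), not something stated in this PR.

```python
def latent_grid(height, width, downsample):
    """Latent (height, width) for an autoencoder with the given spatial downsampling factor."""
    if height % downsample or width % downsample:
        raise ValueError("image size must be divisible by the downsampling factor")
    return height // downsample, width // downsample


def token_count(height, width, downsample, patch_size=1):
    """Number of transformer tokens after patchifying the latent grid."""
    latent_h, latent_w = latent_grid(height, width, downsample)
    if latent_h % patch_size or latent_w % patch_size:
        raise ValueError("latent size must be divisible by the patch size")
    return (latent_h // patch_size) * (latent_w // patch_size)


# 1024x1024 image, 32x compression, no extra patching (the setting described above).
tokens_f32 = token_count(1024, 1024, downsample=32)

# Assumed baseline for comparison: 8x compression followed by 2x2 patches.
tokens_f8 = token_count(1024, 1024, downsample=8, patch_size=2)

print(tokens_f32, tokens_f8, tokens_f8 // tokens_f32)
```

Under these assumptions the 32x latent yields a quarter of the tokens of the baseline, which is where much of the efficiency gain would come from.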

Paper: https://arxiv.org/abs/2410.10629
Original code repo: https://github.com/NVlabs/Sana
Project: https://nvlabs.github.io/Sana

Core contributor of DC-AE:
work with @[email protected]

We want to collaborate on this PR together with friends from HF. Feel free to contact me here. Cc: @sayakpaul, @yiyixuxu


Images are generated by SanaPAGPipeline with FlowDPMSolverMultistepScheduler

(image attachment: 5361732169697_ pic_hd)

lawrence-cj and others added 30 commits October 18, 2024 17:40
# Conflicts:
#	src/diffusers/models/normalization.py
2. make style and make quality;
1. Integrate flow-dpm-solver into diffusers;
2. finally runs successfully on both `FlowMatchEulerDiscreteScheduler` and `FlowDPMSolverMultistepScheduler`;
1. add SanaPAGPipeline & several related Sana linear attention operators;
2. `SanaTransformer2DModel` now supports multi-resolution input;
2. fix the multi-scale HW bugs in SanaPipeline and SanaPAGPipeline;
3. fix the flow-dpm-solver set_timestep() init `model_output` and `lower_order_nums` bugs;
# Conflicts:
#	src/diffusers/models/__init__.py
#	src/diffusers/models/attention_processor.py
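For readers unfamiliar with the linear attention operators mentioned in the commits above, the following is a minimal pure-Python sketch of the general idea, not the `LinearAttentionProcessor` added in this PR. It assumes a ReLU feature map (one common choice) and shows that summarising keys and values once gives the same result as building every pairwise weight. All function names here are hypothetical.

```python
def relu(vec):
    return [max(0.0, x) for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quadratic_attention(queries, keys, values, eps=1e-6):
    """Reference O(N^2) form: compute a weight for every query/key pair."""
    dim_v = len(values[0])
    outputs = []
    for q in queries:
        fq = relu(q)
        weights = [dot(fq, relu(k)) for k in keys]
        norm = sum(weights) + eps
        outputs.append(
            [sum(w * v[c] for w, v in zip(weights, values)) / norm for c in range(dim_v)]
        )
    return outputs


def linear_attention(queries, keys, values, eps=1e-6):
    """O(N) form: accumulate key/value statistics once, reuse them per query."""
    dim_k, dim_v = len(keys[0]), len(values[0])
    kv = [[0.0] * dim_v for _ in range(dim_k)]  # sum_j relu(k_j)^T v_j
    k_sum = [0.0] * dim_k                       # sum_j relu(k_j)
    for k, v in zip(keys, values):
        fk = relu(k)
        for r in range(dim_k):
            k_sum[r] += fk[r]
            for c in range(dim_v):
                kv[r][c] += fk[r] * v[c]
    outputs = []
    for q in queries:
        fq = relu(q)
        norm = dot(fq, k_sum) + eps
        outputs.append(
            [sum(fq[r] * kv[r][c] for r in range(dim_k)) / norm for c in range(dim_v)]
        )
    return outputs


# Toy data, chosen only for illustration.
queries = [[1.0, -0.5], [0.2, 0.3]]
keys = [[0.5, 1.0], [-1.0, 2.0], [0.3, 0.1]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

slow = quadratic_attention(queries, keys, values)
fast = linear_attention(queries, keys, values)
print(all(abs(a - b) < 1e-9 for ra, rb in zip(slow, fast) for a, b in zip(ra, rb)))
```

The second form never materialises the N-by-N weight matrix, so its cost grows linearly with the number of tokens rather than quadratically.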
@lawrence-cj
Contributor Author

The DC-AE related code and checkpoint still need to be updated once the DC-AE PR #9708 is available.

@bghira
Contributor

bghira commented Nov 21, 2024

so you're licensing this code to fit into the Diffusers project? because the original Sana codebase is non-commercial. why is that NC but this is being opened as Apache 2.0*?

@@ -0,0 +1,69 @@
import torch
Contributor

is this file meant to be included?

Comment on lines +132 to +133
"DCAE",
"DCAE_HF",
Contributor

@bghira bghira Nov 23, 2024

are these the final class names? i would have thought they'd be a bit longer and more descriptive like AutoencoderDC to keep in convention with the AutoencoderKL naming

Member

I will be taking over on this PR to make the relevant changes for full integration soon, so will address this then :) LMK if there's anything particular that you'd like to see

4 participants