[Sana] Add Sana, including `SanaPipeline`, `SanaPAGPipeline`, `LinearAttentionProcessor`, `Flow-based DPM-solver` and so on. #9982

lawrence-cj · 2024-11-21T06:16:57Z

What does this PR do?

This PR adds the official Sana (SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer) to the diffusers library. Sana is the first to make text-to-image generation available on a 32x compressed latent space, powered by DC-AE (https://arxiv.org/abs/2410.10733v1), without performance degradation. Sana also includes several popular efficiency-related techniques, such as a DiT with a linear attention processor, and it uses a decoder-only LLM (Gemma-2B-IT) for low GPU requirements and fast speed.
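To make the 32x figure concrete, here is a small arithmetic sketch of how the compression factor sets the latent grid and the number of tokens the transformer sees. The helper names, the 8x baseline, and the patch sizes are assumptions chosen for illustration; they are not taken from this PR's code.

```python
def latent_grid(height: int, width: int, compression: int) -> tuple[int, int]:
    """Spatial size of the latent for an autoencoder with the given downsampling factor."""
    if height % compression or width % compression:
        raise ValueError("image size must be divisible by the compression factor")
    return height // compression, width // compression


def token_count(height: int, width: int, compression: int, patch_size: int = 1) -> int:
    """Number of transformer tokens after patchifying the latent grid."""
    h, w = latent_grid(height, width, compression)
    return (h // patch_size) * (w // patch_size)


# Illustrative comparison for a 1024x1024 image (patch sizes are assumptions):
tokens_32x = token_count(1024, 1024, compression=32, patch_size=1)  # 32 * 32 latent grid
tokens_8x = token_count(1024, 1024, compression=8, patch_size=2)    # 128 * 128 grid, 2x2 patches
print(tokens_32x, tokens_8x)
```

Under these assumptions the 32x latent yields a quarter of the tokens of the 8x setup, which is where much of the speed and memory saving would come from.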

Paper: https://arxiv.org/abs/2410.10629
Original code repo: https://github.com/NVlabs/Sana
Project: https://nvlabs.github.io/Sana
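The linear attention mentioned in the description can be illustrated with a single-head, pure-Python sketch. It assumes a ReLU feature map and a simple sum normalization; the function names are hypothetical and the processor added in this PR may differ in details such as multi-head layout, scaling, and precision handling. The point is only that the key/value summaries are computed once, so cost grows linearly with the number of tokens instead of quadratically.

```python
def relu_rows(matrix):
    """Apply ReLU element-wise to a list-of-lists matrix."""
    return [[max(x, 0.0) for x in row] for row in matrix]


def linear_attention(q, k, v, eps=1e-6):
    """O(N) ReLU linear attention for one head. q, k: N x d; v: N x dv."""
    phi_q, phi_k = relu_rows(q), relu_rows(k)
    d, dv = len(k[0]), len(v[0])
    # Summaries over all tokens, computed once: a d x dv matrix and a length-d vector.
    kv = [[sum(kr[i] * vr[j] for kr, vr in zip(phi_k, v)) for j in range(dv)] for i in range(d)]
    k_sum = [sum(kr[i] for kr in phi_k) for i in range(d)]
    out = []
    for qr in phi_q:
        denom = sum(qr[i] * k_sum[i] for i in range(d)) + eps
        out.append([sum(qr[i] * kv[i][j] for i in range(d)) / denom for j in range(dv)])
    return out


def quadratic_reference(q, k, v, eps=1e-6):
    """Same computation written the O(N^2) way, building every pairwise score."""
    phi_q, phi_k = relu_rows(q), relu_rows(k)
    dv = len(v[0])
    out = []
    for qr in phi_q:
        scores = [sum(a * b for a, b in zip(qr, kr)) for kr in phi_k]
        denom = sum(scores) + eps
        out.append([sum(s * vr[j] for s, vr in zip(scores, v)) / denom for j in range(dv)])
    return out


# Tiny made-up inputs: 3 tokens, key/query dim 2, value dim 3.
q = [[0.5, -1.0], [1.0, 2.0], [0.0, 0.3]]
k = [[1.0, 0.5], [-0.2, 1.5], [0.7, 0.1]]
v = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [3.0, 1.0, 0.5]]
fast = linear_attention(q, k, v)
slow = quadratic_reference(q, k, v)
```

Both functions should agree up to floating-point rounding; only the order of the multiplications changes.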

Core contributor of DC-AE:
work with @[email protected]

We want to collaborate on this PR together with friends from HF. Feel free to contact me here. Cc: @sayakpaul, @yiyixuxu


Images are generated by `SanaPAGPipeline` with `FlowDPMSolverMultistepScheduler`

1. add SanaPAGPipeline & several related Sana linear attention operators; 2. `SanaTransformer2DModel` not supports multi-resolution input; 3. fix the multi-scale HW bugs in SanaPipeline and SanaPAGPipeline; 4. fix the flow-dpm-solver set_timestep() init `model_output` and `lower_order_nums` bugs;
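The last item concerns per-run state in a multistep solver. The toy class below is a hypothetical stand-in, not the scheduler from this PR; it only sketches why the stored model outputs and the `lower_order_nums` warm-up counter presumably need to be re-initialized whenever the timesteps are set, so that a second generation does not start from stale history.

```python
class ToyMultistepScheduler:
    """Toy stand-in for a multistep solver; all names here are illustrative."""

    def __init__(self, solver_order: int = 2):
        self.solver_order = solver_order
        self.timesteps = []
        self.model_outputs = [None] * solver_order
        self.lower_order_nums = 0

    def set_timesteps(self, num_inference_steps: int) -> None:
        # Evenly spaced toy timesteps from 1.0 down towards 0.0.
        self.timesteps = [1.0 - i / num_inference_steps for i in range(num_inference_steps)]
        # Reset per-run state; without this, a second run would reuse
        # outputs and the warm-up counter from the previous run.
        self.model_outputs = [None] * self.solver_order
        self.lower_order_nums = 0

    def step(self, model_output: float) -> int:
        """Record the newest output and return the solver order used for this step."""
        self.model_outputs = self.model_outputs[1:] + [model_output]
        order = min(self.solver_order, self.lower_order_nums + 1)
        if self.lower_order_nums < self.solver_order:
            self.lower_order_nums += 1
        return order


scheduler = ToyMultistepScheduler(solver_order=2)
scheduler.set_timesteps(3)
first_run_orders = [scheduler.step(0.1 * i) for i in range(3)]

scheduler.set_timesteps(3)  # start a second generation
second_run_first_order = scheduler.step(0.5)
```

In this sketch each run begins with a first-order step because no history exists yet, and only later steps use the full order.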


lawrence-cj · 2024-11-21T06:19:02Z

Still need to update the DC-AE related code and checkpoint after DC-AE pr #9708 is available.

bghira · 2024-11-21T13:40:29Z

so you're licensing this code to fit into the Diffusers project? because the original Sana codebase is non-commercial. why is that NC but this is being opened as Apache 2.0*?

bghira · 2024-11-23T15:04:40Z

sana.py

@@ -0,0 +1,69 @@
+import torch


is this file meant to be included?

bghira · 2024-11-23T15:05:39Z

src/diffusers/__init__.py

+            "DCAE",
+            "DCAE_HF",


are these the final class names? i would have thought they'd be a bit longer and more descriptive like AutoencoderDC to keep in convention with the AutoencoderKL naming

a-r-r-o-w · 2024-11-23

I will be taking over this PR to make the relevant changes for full integration soon, so will address this then :) LMK if there's anything particular that you'd like to see

lawrence-cj and others added 30 commits October 18, 2024 17:40

first add a script for DC-AE;

6e616a9

Merge remote-tracking branch 'upstream/main' into DC-AE

d2e187a

DC-AE init

90e8939

replace triton with custom implementation

825c975

1. rename file and remove un-used codes;

3a44fa4

no longer rely on omegaconf and dataclass

55b2615

merge

6fb7fdb

Merge remote-tracking branch 'upstream/main' into DC-AE

c323e76

replace custom activation with diffusers activation

da7caa5

remove dc_ae attention in attention_processor.py

fb6d92a

inherit from ModelMixin

5e63a1a

inherit from ConfigMixin

72cce2b

dc-ae reduce to one file

8f9b4e4

Merge remote-tracking branch 'upstream/main' into DC-AE

b7f68f9

Merge branch 'huggingface:main' into DC-AE

6d96b95

Merge remote-tracking branch 'refs/remotes/origin/main' into DC-AE

3c3cc51

# Conflicts: # src/diffusers/models/normalization.py

1. add DCAE into diffusers;

3b18ef4

2. make style and make quality;

add DCAE_HF into diffusers;

a62bd75

bug fixed;

09c6c00

add SanaPipeline, SanaTransformer2D into diffusers;

b9741af

add sanaLinearAttnProcessor2_0;

4df1722

first update for SanaTransformer;

8a7b24d

first update for SanaPipeline;

d1b4834

first success run SanaPipeline;

2416c77

model output finally matches the original model with the same input;

ce24e41

code update;

e7193b4

code update;

e78bdb5

code update;

a1ef876

add a flow dpm-solver scripts

c93de40

🎉[important update]

7d8a0e8

1. Integrate flow-dpm-solver into diffusers; 2. finally run successfully on both `FlowMatchEulerDiscreteScheduler` and `FlowDPMSolverMultistepScheduler`;

lawrence-cj added 3 commits November 15, 2024 04:06

remove prints;

38af130

Merge branch 'refs/heads/main' into DC-AE-Sana

5ef535e

# Conflicts: # src/diffusers/models/__init__.py # src/diffusers/models/attention_processor.py

add convert sana official checkpoint to diffusers format Safetensor.

d00f1cd

lawrence-cj mentioned this pull request Nov 22, 2024

Safetensors? NVlabs/Sana#28

Open
