Add StableDiffusion3PAGImg2Img Pipeline + Fix SD3 Unconditional PAG #9932
base: main
Conversation
Hi! I found this work interesting while reading it and noticed what seemed to be a typo, so I removed it.
cc @rootonchair if you want to give a review!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
I have just finished looking through the pipeline code. Just one small thing: it seems like the style check is failing. Could you run `make style` and `make quality` to fix it?
Good work. I will proceed with reviewing the unit tests later.
Co-authored-by: Vinh H. Pham <[email protected]>
Thank you @rootonchair, all set on the style/quality fix!
```diff
@@ -1171,6 +1171,7 @@ def __call__(
     attn: Attention,
     hidden_states: torch.FloatTensor,
     encoder_hidden_states: torch.FloatTensor = None,
+    attention_mask: Optional[torch.FloatTensor] = None,
```
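For context, a sketch of what the processor signature looks like with this change applied (abbreviated: the parameter list comes from the diff above, the body is elided, and the real implementation lives in `src/diffusers/models/attention_processor.py`):

```python
from typing import Optional

import torch


class PAGJointAttnProcessor2_0:
    # Sketch only; see the actual diff for the full implementation.
    def __call__(
        self,
        attn,  # the diffusers `Attention` module invoking this processor
        hidden_states: torch.FloatTensor,
        encoder_hidden_states: torch.FloatTensor = None,
        attention_mask: Optional[torch.FloatTensor] = None,  # newly accepted; intentionally unused
    ) -> torch.FloatTensor:
        ...
```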
why do we add this here? it is not used, no?
That's the second part of what I wrote above: when using SD3+PAG and forgoing CFG (e.g. calling a PAG pipeline with `guidance_scale=0`), `PAGJointAttnProcessor2_0` is used instead of `PAGCFGJointAttnProcessor2_0`, and the following error is produced:
```
StableDiffusion3PAGImg2ImgPipelineIntegrationTests.test_pag_uncond
__________________________________________________
self = <tests.pipelines.pag.test_pag_sd3_img2img.StableDiffusion3PAGImg2ImgPipelineIntegrationTests testMethod=test_pag_uncond>
def test_pag_uncond(self):
pipeline = AutoPipelineForImage2Image.from_pretrained(
self.repo_id, enable_pag=True, torch_dtype=torch.float16, pag_applied_layers=["blocks.(4|17)"]
)
pipeline.enable_model_cpu_offload()
pipeline.set_progress_bar_config(disable=None)
inputs = self.get_inputs(torch_device, guidance_scale=0.0, pag_scale=1.8)
> image = pipeline(**inputs).images
tests/pipelines/pag/test_pag_sd3_img2img.py:261:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../miniconda3/envs/taproot/lib/python3.10/site-packages/torch/utils/_contextlib.py:116: in decorate_context
return func(*args, **kwargs)
src/diffusers/pipelines/pag/pipeline_pag_sd_3_img2img.py:975: in __call__
noise_pred = self.transformer(
../miniconda3/envs/taproot/lib/python3.10/site-packages/torch/nn/modules/module.py:1553: in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
../miniconda3/envs/taproot/lib/python3.10/site-packages/torch/nn/modules/module.py:1562: in _call_impl
return forward_call(*args, **kwargs)
../miniconda3/envs/taproot/lib/python3.10/site-packages/accelerate/hooks.py:170: in new_forward
output = module._old_forward(*args, **kwargs)
src/diffusers/models/transformers/transformer_sd3.py:346: in forward
encoder_hidden_states, hidden_states = block(
../miniconda3/envs/taproot/lib/python3.10/site-packages/torch/nn/modules/module.py:1553: in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
../miniconda3/envs/taproot/lib/python3.10/site-packages/torch/nn/modules/module.py:1562: in _call_impl
return forward_call(*args, **kwargs)
src/diffusers/models/attention.py:208: in forward
attn_output, context_attn_output = self.attn(
../miniconda3/envs/taproot/lib/python3.10/site-packages/torch/nn/modules/module.py:1553: in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
../miniconda3/envs/taproot/lib/python3.10/site-packages/torch/nn/modules/module.py:1562: in _call_impl
return forward_call(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = Attention(
(to_q): Linear(in_features=1536, out_features=1536, bias=True)
(to_k): Linear(in_features=1536, out_fea...ue)
(1): Dropout(p=0.0, inplace=False)
)
(to_add_out): Linear(in_features=1536, out_features=1536, bias=True)
)
hidden_states = tensor([[[-0.0430, -3.7031, 0.2078, ..., 0.3115, 0.0703, 0.0383],
[ 0.0179, -2.4727, 0.1594, ..., -0.0...,
[-0.0490, -0.3691, 0.2568, ..., -1.0303, -0.0298, 0.5527]]],
device='cuda:0', dtype=torch.float16)
encoder_hidden_states = tensor([[[-0.0503, 0.0515, -0.0623, ..., -0.0044, -0.0186, -0.0752],
[ 0.1860, -0.2595, 0.0835, ..., 0.1...,
[ 0.6958, -0.4875, -0.1246, ..., 0.2664, -0.1700, 0.0030]]],
device='cuda:0', dtype=torch.float16)
attention_mask = None, cross_attention_kwargs = {}, unused_kwargs = []
def forward(
self,
hidden_states: torch.Tensor,
encoder_hidden_states: Optional[torch.Tensor] = None,
attention_mask: Optional[torch.Tensor] = None,
**cross_attention_kwargs,
) -> torch.Tensor:
r"""
The forward method of the `Attention` class.
Args:
hidden_states (`torch.Tensor`):
The hidden states of the query.
encoder_hidden_states (`torch.Tensor`, *optional*):
The hidden states of the encoder.
attention_mask (`torch.Tensor`, *optional*):
The attention mask to use. If `None`, no mask is applied.
**cross_attention_kwargs:
Additional keyword arguments to pass along to the cross attention.
Returns:
`torch.Tensor`: The output of the attention layer.
"""
# The `Attention` class can call different attention processors / attention functions
# here we simply pass along all tensors to the selected processor class
# For standard processors that are defined here, `**cross_attention_kwargs` is empty
attn_parameters = set(inspect.signature(self.processor.__call__).parameters.keys())
quiet_attn_parameters = {"ip_adapter_masks"}
unused_kwargs = [
k for k, _ in cross_attention_kwargs.items() if k not in attn_parameters and k not in quiet_attn_parameters
]
if len(unused_kwargs) > 0:
logger.warning(
f"cross_attention_kwargs {unused_kwargs} are not expected by {self.processor.__class__.__name__} and will be ignored."
)
cross_attention_kwargs = {k: w for k, w in cross_attention_kwargs.items() if k in attn_parameters}
> return self.processor(
self,
hidden_states,
encoder_hidden_states=encoder_hidden_states,
attention_mask=attention_mask,
**cross_attention_kwargs,
)
E TypeError: PAGJointAttnProcessor2_0.__call__() got an unexpected keyword argument 'attention_mask'
src/diffusers/models/attention_processor.py:530: TypeError
======================================================================= short test summary info =======================================================================
FAILED tests/pipelines/pag/test_pag_sd3_img2img.py::StableDiffusion3PAGImg2ImgPipelineIntegrationTests::test_pag_uncond - TypeError: PAGJointAttnProcessor2_0.__call__() got an unexpected keyword argument 'attention_mask'
```
An alternative to adding this particular keyword argument would be to catch all other keyword arguments with `**kwargs`, for which there is precedent in other attention processors, but I generally default to being more restrictive rather than less. For whatever it's worth, `PAGCFGJointAttnProcessor2_0` does both of those things: it accepts `attention_mask` and does nothing with it, and also has `*args` and `**kwargs`. Both options are sketched below.
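To make the trade-off concrete, here is a sketch of the two signature styles (the class names are illustrative, not the actual processors):

```python
from typing import Optional

import torch


class RestrictiveProcessor:
    # The approach taken in this PR: name `attention_mask` explicitly and
    # ignore it, so a genuinely unexpected keyword still raises a TypeError.
    def __call__(
        self,
        attn,
        hidden_states: torch.FloatTensor,
        encoder_hidden_states: torch.FloatTensor = None,
        attention_mask: Optional[torch.FloatTensor] = None,
    ) -> torch.FloatTensor:
        ...


class PermissiveProcessor:
    # The alternative with precedent elsewhere (and in
    # PAGCFGJointAttnProcessor2_0): silently swallow any extra arguments.
    def __call__(
        self,
        attn,
        hidden_states: torch.FloatTensor,
        encoder_hidden_states: torch.FloatTensor = None,
        attention_mask: Optional[torch.FloatTensor] = None,
        *args,
        **kwargs,
    ) -> torch.FloatTensor:
        ...
```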
If there is any particular way that you think is the most in-line with the rest of the codebase, I'll be happy to adjust.
@painebenjamin could you update the expected values?
All set! Something was off in my environment, but restarting with a fresh conda env got my machine reproducing the same values as the failed run. I also re-checked the slow tests; the CFG one was off in the fresh environment as well, so I fixed that one too and re-ran the tests.
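For reference, the "expected values" being updated are pinned image slices; the slow tests assert against them with a pattern roughly like the sketch below (the helper name and tolerance are illustrative, not the actual test code):

```python
import numpy as np


def check_expected_slice(image: np.ndarray, expected_slice: np.ndarray, tol: float = 1e-3) -> None:
    # `image` is the pipeline output as an array of shape (batch, height, width, channels);
    # compare a flattened 3x3 corner patch of the last channel against the pinned values.
    image_slice = image[0, -3:, -3:, -1].flatten()
    max_diff = np.abs(image_slice - expected_slice).max()
    assert max_diff < tol, f"max diff {max_diff} exceeds tolerance {tol}"
```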
What does this PR do?
This PR does two things:

1. Adds `StableDiffusion3PAGImg2ImgPipeline`, tests, and documentation (see the usage sketch below).
2. Fixes SD3 unconditional PAG (`PAGJointAttnProcessor2_0` was failing because it received `attention_mask` as a keyword argument).

Before submitting

- This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.

Who can review?
Anyone who is interested, but particularly @yiyixuxu and @asomoza
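For anyone wanting to try it, a minimal usage sketch of the new pipeline, mirroring the integration test above (the repo id and input image URL are illustrative assumptions):

```python
import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

# Assumed repo id for SD3-Medium-Diffusers; the PAG options mirror the test above.
pipe = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    enable_pag=True,
    pag_applied_layers=["blocks.(4|17)"],
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()

init_image = load_image("https://example.com/input.png")  # placeholder URL

# guidance_scale=0.0 forgoes CFG, exercising the PAGJointAttnProcessor2_0
# path this PR fixes; pag_scale controls perturbed-attention guidance.
image = pipe(
    prompt="a photo of an astronaut riding a horse on mars",
    image=init_image,
    guidance_scale=0.0,
    pag_scale=1.8,
).images[0]
```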
Images
All outputs below were generated with SD3-Medium-Diffusers.
Test output - PAG+CFG
Test output - PAG only
Docstring example output (PAG+CFG):