
Merge changes #155

Merged
merged 53 commits into Skquark:main on Apr 21, 2024

Conversation

@Skquark Skquark (Owner) commented Apr 21, 2024

No description provided.

bghira and others added 30 commits April 2, 2024 20:15
* 7529 do not disable autocast for cuda devices

* Remove typecasting error check for non-mps platforms, as a correct autocast implementation makes it a non-issue

* add autocast fix to other training examples

* disable native_amp for dreambooth (sdxl)

* disable native_amp for pix2pix (sdxl)

* remove tests from remaining files

* disable native_amp on huggingface accelerator for every training example that uses it

* convert more usages of autocast to nullcontext, make style fixes

* make style fixes

* style.

* Empty-Commit

---------

Co-authored-by: bghira <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>
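
A minimal sketch of the pattern from the commits above (the helper name is ours, not the training scripts'): enter `torch.autocast` only when mixed precision is actually wanted, and an inert `nullcontext` otherwise, e.g. when the accelerator already manages precision:

```python
import contextlib

import torch

def maybe_autocast(device_type: str, enabled: bool):
    """Return an autocast context when enabled, otherwise a no-op context."""
    if enabled:
        return torch.autocast(device_type=device_type)
    return contextlib.nullcontext()

# Validation pass with native_amp disabled on the accelerator:
with maybe_autocast("cuda", enabled=False):
    pass  # inference would run here, at the model's own dtype
```
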
* add: utility to format our docs too 📜

* debugging saga

* fix: message

* checking

* should be fixed.

* revert pipeline_fixture

* remove empty line

* make style

* fix: setup.py

* style.
* UniPC UTs iterate solvers on FP16

It wasn't catching errors on order==3. Might be excessive?

* UniPC Multistep fix tensor dtype/device on order=3

* UniPC UTs Add v_pred to fp16 test iter

For completeness' sake. Probably overkill?
* UniPC Multistep add `rescale_betas_zero_snr`

Same patch as DPM and Euler with the patched final alpha cumprod

BF16 doesn't seem to break down, I think because UniPC already upcasts during some
phases. We could still force an upcast, since it only costs ≈ 0.005 it/s for me,
but the difference in output is very small. A better endeavor might be upcasting
in step() and removing all the other upcasts elsewhere?

* UniPC ZSNR UT

* Re-add `rescale_betas_zsnr` doc oops
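
The ZSNR patch above follows the same rescaling already used by the DPM and Euler schedulers; a sketch of that helper (per "Common Diffusion Noise Schedules and Sample Steps are Flawed"), shifting sqrt(alpha_bar) so the terminal SNR is exactly zero:

```python
import torch

def rescale_zero_terminal_snr(betas: torch.Tensor) -> torch.Tensor:
    alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)
    alphas_bar_sqrt = alphas_cumprod.sqrt()

    # Shift the final value to 0, rescaling so the first value is preserved.
    a_0 = alphas_bar_sqrt[0].clone()
    a_T = alphas_bar_sqrt[-1].clone()
    alphas_bar_sqrt = (alphas_bar_sqrt - a_T) * a_0 / (a_0 - a_T)

    # Convert sqrt(alpha_bar) back to betas.
    alphas_bar = alphas_bar_sqrt**2
    alphas = torch.cat([alphas_bar[0:1], alphas_bar[1:] / alphas_bar[:-1]])
    return 1 - alphas
```
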
* refactor transformers 2d into multiple legacy variants.

* fix: init.

* fix recursive init.

* add inits.

* make transformer block creation more modular.

* complete refactor.

* remove forward

* debug

* remove legacy blocks and refactor within the module itself.

* remove print

* guard caption projection

* remove fetcher.

* reduce the number of args.

* fix: norm_type

* group variables that are shared.

* remove _get_transformer_blocks

* harmonize the init function signatures.

* transformer_blocks to common

* repeat .
* increase number of workers for the tests.

* move to beefier runner.

* improve the fast push tests too.

* use a beefy machine for pytorch pipeline tests

* up the number of workers further.
* Update pipeline_animatediff_video2video.py

* commit with test for whether latent input can be passed into animatediffvid2vid
* Skip `test_freeu_enabled` on MPS

* Small fixes

- import skip_mps correctly
- disable all instances of test_freeu_enabled

* Empty commit to trigger tests

* Empty commit to trigger CI
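
For reference, the skip uses the decorator from diffusers' test utilities; a short sketch:

```python
import unittest

from diffusers.utils.testing_utils import skip_mps

class FreeUTests(unittest.TestCase):
    @skip_mps  # FreeU is not supported on MPS, so skip the test there
    def test_freeu_enabled(self):
        ...
```
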
* reduce block sizes for unet1d.

* reduce blocks for unet_2d.

* reduce block size for unet_motion

* increase channels.

* correctly increase channels.

* reduce number of layers in unet2dconditionmodel tests.

* reduce block sizes for unet2dconditionmodel tests

* reduce block sizes for unet3dconditionmodel.

* fix: test_feed_forward_chunking

* fix: test_forward_with_norm_groups

* skip spatiotemporal tests on MPS.

* reduce block size in AutoencoderKL.

* reduce block sizes for vqmodel.

* further reduce block size.

* make style.

* Empty-Commit

* reduce sizes for ConsistencyDecoderVAETests

* further reduction.

* further block reductions in AutoencoderKL and AsymmetricAutoencoderKL.

* massively reduce the block size in unet2dconditionmodel.

* reduce sizes for unet3d

* fix tests in unet3d.

* reduce blocks further in motion unet.

* fix: output shape

* add attention_head_dim to the test configuration.

* remove unexpected keyword arg

* up a bit.

* groups.

* up again

* fix
add set_begin_index for all pipelines
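
A hedged sketch of what `set_begin_index` gives pipelines: the scheduler is told up front which timestep index denoising starts from, instead of inferring it from the first `step()` call (useful for img2img-style pipelines that skip early steps based on `strength`):

```python
from diffusers import DPMSolverMultistepScheduler

scheduler = DPMSolverMultistepScheduler()
scheduler.set_timesteps(num_inference_steps=50)
scheduler.set_begin_index(20)  # e.g. strength=0.6 skips the first 20 indices
```
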
* add audioldm2 tts

* change gpt2 max new tokens

* remove unnecessary pipeline and class

* add TTS to AudioLDM2Pipeline

* add TTS docs

* delete unnecessary file

* remove unnecessary import

* add audioldm2 slow testcase

* fix code quality

* remove AudioLDMLearnablePositionalEmbedding

* add variable check vits encoder

* add use_learned_position_embedding

---------

Co-authored-by: Dhruv Nair <[email protected]>
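
A hedged sketch of the TTS path added above; the `transcription` argument follows the commits, and the model id is illustrative:

```python
from diffusers import AudioLDM2Pipeline

pipe = AudioLDM2Pipeline.from_pretrained("anhnct/audioldm2_gigaspeech")  # assumed TTS checkpoint
audio = pipe(
    prompt="A female reporter speaking clearly",
    transcription="Diffusers now supports text to speech.",  # the text to be spoken
    num_inference_steps=200,
).audios[0]
```
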
Allow safety and feature extractor arguments to be passed to convert_from_ckpt

Allows management of safety checker and feature extractor
from outside of the convert ckpt class.

Co-authored-by: Sayak Paul <[email protected]>
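
A sketch of how this might be called, with argument names inferred from the description above (treat them as assumptions, not the exact signature):

```python
from diffusers.pipelines.stable_diffusion.convert_from_ckpt import (
    download_from_original_stable_diffusion_ckpt,
)

pipe = download_from_original_stable_diffusion_ckpt(
    "v1-5-pruned-emaonly.ckpt",              # example checkpoint path
    load_safety_checker=False,               # don't build the default checker
    safety_checker=my_safety_checker,        # assumed: caller-managed checker (or None)
    feature_extractor=my_feature_extractor,  # assumed: caller-managed CLIPImageProcessor
)
```
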
* Restore unet params back to normal from EMA when validation call is finished

* empty commit

---------

Co-authored-by: Sayak Paul <[email protected]>
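
A minimal sketch of the store/validate/restore pattern above, using `EMAModel` from `diffusers.training_utils`:

```python
from diffusers import UNet2DConditionModel
from diffusers.training_utils import EMAModel

unet = UNet2DConditionModel()           # stand-in for the model being trained
ema_unet = EMAModel(unet.parameters())

ema_unet.store(unet.parameters())       # stash the live training weights
ema_unet.copy_to(unet.parameters())     # swap in the EMA weights
# ... run validation / log sample images here ...
ema_unet.restore(unet.parameters())     # put the training weights back afterwards
```
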
* disable test

* update

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
* Support multiimage masking

---------

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: YiYi Xu <[email protected]>
* add utilities for updating diffusers pipeline metadata.

* style

* remove first empty line
…ions. (#7489)

* refactor transformer_2d forward logic into meaningful conditions.

* Empty-Commit

* fix: _operate_on_patched_inputs

* fix: _operate_on_patched_inputs

* check

* fix: patch output computation block.

* fix: _operate_on_patched_inputs.

* remove print.

* move operations to blocks.

* more readability improvements.

* empty commit

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <[email protected]>

* Revert "Apply suggestions from code review"

This reverts commit 12178b1.

---------

Co-authored-by: Dhruv Nair <[email protected]>
…m workflows (#7543)

* remove libsndfile1-dev and libgl1 from workflows and ensure they're present in the respective dockerfiles.

* change to self-hosted runner; let's see 🤞

* add libsndfile1-dev libgl1 for now

* use self-hosted runners for building and push too.
* get device <-> component mapping when using multiple gpus.

* condition the device_map bits.

* relax condition

* device_map progress.

* device_map enhancement

* some cleaning up and debugging

* Apply suggestions from code review

Co-authored-by: Marc Sun <[email protected]>

* incorporate suggestions from PR.

* remove multi-gpu condition for now.

* guard check the component -> device mapping

* fix: device_memory variable

* dispatching transformers model to have force_hooks=True

* better guarding for transformers device_map

* introduce support for balanced_low_memory and balanced_ultra_low_memory.

* remove device_map patch.

* fix: intermediate variable scoping.

* fix: condition in cpu offload.

* fix: flax class restrictions.

* remove modifications from cpu_offload and model_offload

* incorporate changes.

* add a simple forward pass test

* add: torch_device in get_inputs()

* add: tests

* remove print

* safe-guard to(), model offloading and cpu offloading when balanced is used as a device_map.

* style

* remove .

* safeguard device_map with more checks and remove invalid device_mapping strategies.

* make it a class attribute and adjust tests accordingly.

* fix device_map check

* fix test

* adjust comment

* fix: device_map attribute

* fix: dispatching.

* max_memory test for pipeline

* version guard the tests

* fix guard.

* address review feedback.

* reset_device_map method.

* add: test for reset_hf_device_map

* fix a couple things.

* add reset_device_map() in the error message.

* add tests for checking reset_device_map doesn't have unintended consequences.

* fix reset_device_map and offloading tests.

* create _get_final_device_map utility.

* hf_device_map -> _hf_device_map

* add documentation

* add notes suggested by Marc.

* styling.

* Apply suggestions from code review

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>

* move updates within gpu condition.

* other docs related things

* note on ignoring a device not specified in the device map.

* provide a suggestion if device mapping errors out.

* fix: typo.

* _hf_device_map -> hf_device_map

* Empty-Commit

* add: example hf_device_map.

---------

Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
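
Putting the device_map work above together, a hedged usage sketch (model id and memory caps are illustrative): "balanced" spreads pipeline components across the available GPUs, `hf_device_map` exposes the final component-to-device mapping, and `reset_device_map()` removes it again so `to()` and offloading become usable:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    device_map="balanced",
    max_memory={0: "8GB", 1: "8GB"},  # optional per-GPU cap
)
print(pipe.hf_device_map)  # e.g. {'unet': 0, 'vae': 1, ...}

pipe.reset_device_map()    # required before pipe.to(...) or CPU offloading
```
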
remove duplicate tip block.
…in examples (#7603)

* Modularize instruct_pix2pix code

* quality check

* quality check

---------

Co-authored-by: Sayak Paul <[email protected]>
* give it a shot.

* print.

* correct assertion.

* gather results from the rest of the tests.

* change the assertion values where needed.

* remove print statements.
* prompt enhance

* edits

* align titles

* feedback

* feedback

* feedback

* link to style
* refactor t2i

* add code snippets
* fix

* up

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
* playground vae encoding should use std and mean of the vae.

* style.

* fix-copies.
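
A sketch of the encoding fix (`vae` and `image` assumed in scope; the config keys follow the Playground v2.5 VAE): latents are normalized with the VAE's own statistics rather than the plain scaling factor alone:

```python
import torch

latents = vae.encode(image).latent_dist.sample()
latents_mean = torch.tensor(vae.config.latents_mean).view(1, -1, 1, 1)
latents_std = torch.tensor(vae.config.latents_std).view(1, -1, 1, 1)
latents = (latents - latents_mean) * vae.config.scaling_factor / latents_std
```
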
* Skip scaling if scale is identity

* move check for weight one to scale and unscale lora

* fix code style/quality

* Empty-Commit

---------

Co-authored-by: Steven Munn <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Steven Munn <[email protected]>
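
A hedged sketch of the early exit above (the helper name and per-layer hook are illustrative, not diffusers' exact internals): scaling by 1.0 is a no-op, so skip the per-layer work entirely:

```python
def scale_lora_layers_sketch(model, weight: float = 1.0) -> None:
    if weight == 1.0:
        return  # identity scale: nothing to do
    for module in model.modules():
        if hasattr(module, "scale_layer"):  # hypothetical LoRA-layer hook
            module.scale_layer(weight)
```
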
* Initialize target_unet from unet rather than teacher_unet so that we correctly add time_embedding.cond_proj if necessary.

* Use UNet2DConditionModel.from_config to initialize target_unet from unet's config.

---------

Co-authored-by: Sayak Paul <[email protected]>
YiqinZhao and others added 23 commits April 11, 2024 09:08
Replaced deprecated logger.warn with logger.warning.
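
For reference, `Logger.warn` has been deprecated in the standard library since Python 3.3; the fix is a one-line rename:

```python
import logging

logger = logging.getLogger(__name__)
logger.warning("use Logger.warning; Logger.warn is deprecated")
```
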
Fix a bug that causes the call to set_lora_device to ignore the DoRA parameters.
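
A hedged sketch of the fix (attribute names follow peft's DoRA layers; the real change lives in diffusers' `set_lora_device`): when moving an adapter, the DoRA magnitude vector has to move along with the A/B matrices:

```python
def move_adapter_to_device(module, adapter_name: str, device) -> None:
    module.lora_A[adapter_name].to(device)
    module.lora_B[adapter_name].to(device)
    # Previously skipped: DoRA's magnitude parameter stayed on the old device.
    if getattr(module, "lora_magnitude_vector", None) is not None:
        module.lora_magnitude_vector[adapter_name] = module.lora_magnitude_vector[
            adapter_name
        ].to(device)
```
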
…ts (#7527)

* add scheduled pseudo-huber loss training scripts

See #7488

* add reduction modes to huber loss

* [DB Lora] *2 multiplier to huber loss because of the 1/2·a² convention.

pairing of kohya-ss/sd-scripts@c6495de

* [DB Lora] add option for smooth l1 (huber / delta)

Pairing of kohya-ss/sd-scripts@dd22958

* [DB Lora] unify huber scheduling

Pairing of kohya-ss/sd-scripts@19a834c

* [DB Lora] add snr huber scheduler

Pairing of kohya-ss/sd-scripts@47fb1a6

* fixup examples link

* use snr schedule by default in DB

* update all huber scripts with snr

* code quality

* huber: make style && make quality

---------

Co-authored-by: Sayak Paul <[email protected]>
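
A hedged sketch of the loss from the commits above. The 2c multiplier makes 2c·(sqrt(a² + c²) − c) → a² as a → 0, matching MSE rather than the ½·a² Huber convention; the schedule shown is one illustrative decay, not the scripts' exact set of modes:

```python
import math

import torch

def pseudo_huber_loss(pred: torch.Tensor, target: torch.Tensor, huber_c: float) -> torch.Tensor:
    return 2 * huber_c * (torch.sqrt((pred - target) ** 2 + huber_c**2) - huber_c)

def scheduled_huber_c(timestep: int, num_train_timesteps: int, base_c: float = 0.1) -> float:
    # Larger c (closer to MSE) early in the schedule, smaller c later.
    return base_c * math.exp(-4.0 * timestep / num_train_timesteps)
```
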
* CheckIn - created DownSubBlocks

* Added extra channels, implemented subblock fwd

* Fixed connection sizes

* checkin

* Removed iter, next in forward

* Models for SD21 & SDXL run through

* Added back pipelines, cleared up connections

* Cleaned up connection creation

* added debug logs

* updated logs

* logs: added input loading

* Update umer_debug_logger.py

* log: Loading hint

* Update umer_debug_logger.py

* added logs

* Changed debug logging

* debug: added more logs

* Fixed num_norm_groups

* Debug: Logging all of SDXL input

* Update umer_debug_logger.py

* debug: updated logs

* checkin

* Readded tests

* Removed debug logs

* Fixed Slow Tests

* Added value checks | Updated model_cpu_offload_seq

* accelerate-offloading works ; fast tests work

* Made unet & addon explicit in controlnet

* Updated slow tests

* Added dtype/device to ControlNetXS

* Filled in test model paths

* Added image_encoder/feature_extractor to XL pipe

* Fixed fast tests

* Added comments and docstrings

* Fixed copies

* Added docs ; Updates slow tests

* Moved changes to UNetMidBlock2DCrossAttn

* tiny cleanups

* Removed stray prints

* Removed ip adapters + freeU

- Removed ip adapters + freeU as they don't make sense for ControlNet-XS
- Fixed imports of UNet components

* Fixed test_save_load_float16

* Make style, quality, fix-copies

* Changed loading/saving API for ControlNetXS

- Changed loading/saving API for ControlNetXS
- other small fixes

* Removed ControlNet-XS from research examples

* Make style, quality, fix-copies

* Small fixes

- deleted ControlNetXSModel.init_original
- added time_embedding_mix to StableDiffusionControlNetXSPipeline .from_pretrained / StableDiffusionXLControlNetXSPipeline.from_pretrained
- fixed copy hints

* checkin May 11 '23

* CheckIn Mar 12 '24

* Fixed tests for SD

* Added tests for UNetControlNetXSModel

* Fixed SDXL tests

* cleanup

* Delete Pipfile

* CheckIn Mar 20

Started replacing sub blocks by `ControlNetXSCrossAttnDownBlock2D` and `ControlNetXSCrossAttnUpBlock2D`

* check-in Mar 23

* checkin 24 Mar

* Created init for UNetCnxs and CnxsAddon

* CheckIn

* Made from_modules, from_unet and no_control work

* make style,quality,fix-copies & small changes

* Fixed freezing

* Added gradient ckpt'ing; fixed tests

* Fix slow tests(+compile) ; clear naming confusion

* Don't create UNet in init ; removed class_emb

* Incorporated review feedback

- Deleted get_base_pipeline /  get_controlnet_addon for pipes
- Pipes inherit from StableDiffusionXLPipeline
- Made module dicts for cnxs-addon's down/mid/up classes
- Added support for qkv fusion and freeU

* Make style, quality, fix-copies

* Implemented review feedback

* Removed compatibility check for vae/ctrl embedding

* make style, quality, fix-copies

* Delete Pipfile

* Integrated review feedback

- Importing ControlNetConditioningEmbedding now
- get_down/mid/up_block_addon now outside class
- renamed `do_control` to `apply_control`

* Reduced size of test tensors

For this, added `norm_num_groups` as parameter everywhere

* Renamed cnxs-`Addon` to cnxs-`Adapter`

- `ControlNetXSAddon` -> `ControlNetXSAdapter`
- `ControlNetXSAddonDownBlockComponents` -> `DownBlockControlNetXSAdapter`, and similarly for mid/up
- `get_mid_block_addon` -> `get_mid_block_adapter`, and similarly for mid/up

* Fixed save_pretrained/from_pretrained bug

* Removed redundant code

---------

Co-authored-by: Dhruv Nair <[email protected]>
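
A hedged sketch of the final ControlNet-XS API after the renames above (model ids illustrative; `canny_image` assumed prepared by the caller):

```python
import torch
from diffusers import ControlNetXSAdapter, StableDiffusionXLControlNetXSPipeline

controlnet = ControlNetXSAdapter.from_pretrained(
    "UmerHA/Testing-ConrolNetXS-SDXL-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetXSPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("an astronaut riding a horse", image=canny_image).images[0]
```
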
* is_cosxl_edit arg in SDXL ip2p.

* Empty-Commit

Co-authored-by: Yiyi Xu <[email protected]>

* doc

* remove redundant logic.

* reflect drhuv's comments.

---------

Co-authored-by: Yiyi Xu <[email protected]>
Co-authored-by: Dhruv Nair <[email protected]>
* Create tgate.md

* Update _toctree.yml

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <[email protected]>

* Update tgate.md

* Update tgate.md

---------

Co-authored-by: Steven Liu <[email protected]>
…arning (#7637)

Updated ruff configuration to avoid deprecated config.

Co-authored-by: Sayak Paul <[email protected]>
…ts (#7662)

remove installation of redundant modules from flax PR tests
* pipelines

* schedulers and models

* community pipelines

* feedback
* Switch to peft and multi proj layers

* Move Face ID loading and inference to core

---------

Co-authored-by: Sayak Paul <[email protected]>
* style

* Fix device map nits (#7705)

---------

Co-authored-by: Sayak Paul <[email protected]>
* update

* update
* Fixed type annotations for compatibility with python 3.8

* Add required imports.
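
A short sketch of the compatibility change: built-in generics such as `list[int]` require Python 3.9+, so 3.8-compatible annotations come from `typing` instead (the signature below is illustrative):

```python
from typing import List, Optional, Tuple

def get_timesteps(timesteps: List[int], strength: float) -> Tuple[List[int], Optional[int]]:
    ...
```
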
add tailscale key in case of failure
* fixed encode_image function signature in controlnet animatediff

* copied encode_image from stable diffusion pipeline

---------

Co-authored-by: YiYi Xu <[email protected]>
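
A hedged sketch of the aligned signature, following the `StableDiffusionPipeline` version it was copied from (trimmed; the real method also handles pre-computed tensor inputs and the unconditional branch):

```python
def encode_image(self, image, device, num_images_per_prompt, output_hidden_states=None):
    image = self.feature_extractor(image, return_tensors="pt").pixel_values
    image = image.to(device=device, dtype=self.image_encoder.dtype)
    if output_hidden_states:
        image_embeds = self.image_encoder(image, output_hidden_states=True).hidden_states[-2]
    else:
        image_embeds = self.image_encoder(image).image_embeds
    return image_embeds.repeat_interleave(num_images_per_prompt, dim=0)
```
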
@Skquark Skquark merged commit 7d9be08 into Skquark:main Apr 21, 2024
5 of 8 checks passed