example: Train Instruct pix2pix with lora implementation (huggingface#6469)

* base template file - train_instruct_pix2pix.py
* additional import and parser argument required for lora
* finetune only instructpix2pix model -- no need to include these layers
* inject lora layers
* freeze unet model -- only lora layers are trained
* training modifications to train only lora parameters
* store only lora parameters
* move train script to research project
* run quality and style code checks
* move train script to a new folder
* add README
* update README
* update references in README

Co-authored-by: Rahul Raman <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>

1 parent 3be7c96 · commit 2d1f218 · Showing 2 changed files with 1,125 additions and 0 deletions.

# InstructPix2Pix text-to-edit-image fine-tuning

This extended LoRA training script was authored by [Aiden-Frost](https://github.com/Aiden-Frost).
This is an experimental LoRA extension of [this example](https://github.com/huggingface/diffusers/blob/main/examples/instruct_pix2pix/train_instruct_pix2pix.py). The script adds support for injecting LoRA layers into the UNet model.
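
As a rough illustration, the sketch below shows how LoRA attention processors can be injected into the UNet, following the pattern used by the older (non-PEFT) diffusers LoRA training examples; the exact code in `finetune_instruct_pix2pix.py` may differ.

```python
from diffusers import StableDiffusionInstructPix2PixPipeline
from diffusers.models.attention_processor import LoRAAttnProcessor

# load the base InstructPix2Pix pipeline and grab its UNet
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained("timbrooks/instruct-pix2pix")
unet = pipe.unet

# build one LoRA processor per attention layer, mirroring each layer's shape
# (this follows the older diffusers LoRAAttnProcessor API)
lora_attn_procs = {}
for name in unet.attn_processors.keys():
    # self-attention layers (attn1) take no cross-attention conditioning
    cross_attention_dim = None if name.endswith("attn1.processor") else unet.config.cross_attention_dim
    if name.startswith("mid_block"):
        hidden_size = unet.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(unet.config.block_out_channels))[block_id]
    else:  # down_blocks
        block_id = int(name[len("down_blocks.")])
        hidden_size = unet.config.block_out_channels[block_id]
    lora_attn_procs[name] = LoRAAttnProcessor(
        hidden_size=hidden_size,
        cross_attention_dim=cross_attention_dim,
        rank=4,  # corresponds to the --rank flag used below
    )

# swap the default attention processors for the LoRA-augmented ones
unet.set_attn_processor(lora_attn_procs)
```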

## Training script example

```bash
export MODEL_ID="timbrooks/instruct-pix2pix"
export DATASET_ID="instruction-tuning-sd/cartoonization"
export OUTPUT_DIR="instructPix2Pix-cartoonization"

accelerate launch finetune_instruct_pix2pix.py \
  --pretrained_model_name_or_path=$MODEL_ID \
  --dataset_name=$DATASET_ID \
  --enable_xformers_memory_efficient_attention \
  --resolution=256 --random_flip \
  --train_batch_size=2 --gradient_accumulation_steps=4 --gradient_checkpointing \
  --max_train_steps=15000 \
  --checkpointing_steps=5000 --checkpoints_total_limit=1 \
  --learning_rate=5e-05 --lr_warmup_steps=0 \
  --val_image_url="https://hf.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png" \
  --validation_prompt="Generate a cartoonized version of the natural image" \
  --seed=42 \
  --rank=4 \
  --output_dir=$OUTPUT_DIR \
  --report_to=wandb \
  --push_to_hub
```
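
During training, the base UNet stays frozen so that only the injected LoRA parameters receive gradients, and only those parameters are saved. The sketch below illustrates that logic, continuing from the injection snippet above; variable names are illustrative and may differ from the script's actual code.

```python
import torch

# freeze the whole UNet, then re-enable gradients only on the LoRA processors
# (assumes every processor is a LoRAAttnProcessor, as set up above)
unet.requires_grad_(False)
for processor in unet.attn_processors.values():
    processor.requires_grad_(True)

# optimize just the low-rank matrices
lora_params = [p for p in unet.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(lora_params, lr=5e-5)  # matches --learning_rate=5e-05

# after training, only the LoRA attention weights are written to the output
# directory, which is what the inference example below loads
unet.save_attn_procs("instructPix2Pix-cartoonization")
```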

## Inference

After training, the LoRA weights of the model are stored in `$OUTPUT_DIR` and can be loaded on top of the base pipeline as shown below.

```python
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

# load the base model pipeline
pipe_lora = StableDiffusionInstructPix2PixPipeline.from_pretrained("timbrooks/instruct-pix2pix")

# load LoRA weights from the provided path
output_dir = "path/to/lora_weight_directory"
pipe_lora.unet.load_attn_procs(output_dir)

# run an edit prompt on an input image
input_image_path = "/path/to/input_image"
input_image = Image.open(input_image_path)
edit_prompt = "Generate a cartoonized version of the natural image"  # example edit prompt
edited_images = pipe_lora(num_images_per_prompt=1, prompt=edit_prompt, image=input_image, num_inference_steps=1000).images
edited_images[0].show()
```
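
If a CUDA GPU is available, the pipeline can optionally be moved to it with `pipe_lora.to("cuda")` before running inference, and the result can be written to disk with `edited_images[0].save("edited_image.png")` instead of being displayed.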

## Results

Here is an example of using the script to train an InstructPix2Pix model on a Google Colab T4 GPU with the following settings:

```bash
MODEL_ID="timbrooks/instruct-pix2pix"
DATASET_ID="instruction-tuning-sd/cartoonization"
TRAIN_EPOCHS=100
```

Below are a few examples showing the input image, the edit prompt, and the edited image produced by the model:

<p align="center">
<img src="https://github.com/Aiden-Frost/Efficiently-teaching-counting-and-cartoonization-to-InstructPix2Pix.-/blob/main/diffusers_result_assets/edited_image_results.png?raw=true" alt="instructpix2pix-inputs" width=600/>
</p>

Here are some rough statistics from training the model with this script:

<p align="center">
<img src="https://github.com/Aiden-Frost/Efficiently-teaching-counting-and-cartoonization-to-InstructPix2Pix.-/blob/main/diffusers_result_assets/results.png?raw=true" alt="instructpix2pix-inputs" width=600/>
</p>

## References

* InstructPix2Pix - https://github.com/timothybrooks/instruct-pix2pix
* Dataset and example training script - https://huggingface.co/blog/instruction-tuning-sd
* For more information about the project - https://github.com/Aiden-Frost/Efficiently-teaching-counting-and-cartoonization-to-InstructPix2Pix.-