VIP-Versatile-Image-Outpainting-Empowered-by-Multimodal-Large-Language-Model

This repository is the official implementation of "VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model".

📜 News

🚀 [2024/9/26] Our paper has been accepted by ACCV 2024!

🚀 [2024/7/22] The training and inference code are released!

🚀 [2024/6/3] The paper is released!

🛠️ Usage

Requirements

- torch==1.13.1
- torchvision==0.14.1
- transformers==4.39.3

Note that our method modifies UNet2DConditionModel in diffusers, so please do not install the official diffusers package.
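
To confirm that an environment matches the pinned versions before training, a small check along these lines may help. This is only a sketch and not part of the repository: the `check_pins` and `base_version` helpers are hypothetical names, and the script assumes that a local build suffix such as `+cu117` should be ignored when comparing versions. It deliberately does not check diffusers, since the modified copy is not expected to match any official release.

```python
from importlib import metadata

# Pinned versions copied from the Requirements list above.
PINNED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "transformers": "4.39.3",
}


def base_version(version):
    """Drop a local build suffix such as '+cu117' (assumed to be irrelevant here)."""
    return version.split("+", 1)[0]


def check_pins(pinned=PINNED, get_version=metadata.version):
    """Return {package: problem description} for every pin that is not satisfied."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if base_version(found) != wanted:
            problems[name] = f"found {found}, expected {wanted}"
    return problems


if __name__ == "__main__":
    issues = check_pins()
    for name, message in issues.items():
        print(f"{name}: {message}")
    if not issues:
        print("All pinned packages match.")
```

Running the script prints one line per mismatched or missing package, or a single confirmation line when all three pins are satisfied.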

For training

cd examples/VIP_ours/
bash train_on_enhanced_prompt.sh

For inference

cd examples/VIP_ours/
python3 inference_*.py