[ACCV 2024 Poster] official code for "VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model"

ucasyjz/VIP

VIP-Versatile-Image-Outpainting-Empowered-by-Multimodal-Large-Language-Model

This repository is the official implementation of "VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model".

📜 News

🚀 [2024/9/26] Our paper has been accepted by ACCV 2024!

🚀 [2024/7/22] The training and inference code has been released!

🚀 [2024/6/3] The paper is released!

🛠️ Usage

Requirements

- torch==1.13.1
- torchvision==0.14.1
- transformers==4.39.3

Note that our method makes some changes to UNet2DConditionModel in diffusers, so please do not install the official diffusers package.
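
The pinned versions above can be checked before training. The following is a minimal sketch, not part of this repository: the helper names (find_mismatches, installed_versions, base_version) are hypothetical, and it assumes that a local build suffix such as "+cu117" on the installed version should still count as matching the pin.

```python
from importlib import metadata

# Pins copied from the Requirements list above.
REQUIRED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "transformers": "4.39.3",
}


def base_version(version):
    """Drop a local build suffix such as '+cu117' from a version string."""
    return version.split("+", 1)[0]


def find_mismatches(required, installed):
    """Return {package: (wanted, found)} for every pin that is not met.

    `installed` maps package names to version strings; a missing package
    is reported with found=None.
    """
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        if found is None or base_version(found) != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Look up installed versions; packages that are not installed are skipped."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return versions


if __name__ == "__main__":
    problems = find_mismatches(REQUIRED, installed_versions(REQUIRED))
    for name, (wanted, found) in problems.items():
        print(f"{name}: need {wanted}, found {found or 'not installed'}")
    if not problems:
        print("All pinned requirements are satisfied.")
```

The script only reads package metadata, so it does not import torch or touch the GPU. It deliberately leaves diffusers out of the check, since the note above asks for the modified version rather than a released one.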

For training

cd examples/VIP_ours/
bash train_on_enhanced_prompt.sh

For inference

cd examples/VIP_ours/
python3 inference_*.py
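
The wildcard in the command above stands for one of the inference scripts in that directory. If a shell expands it to more than one file, the extra file names are passed as arguments to the first script, so it is safer to run one script by name. As a small sketch (the helper list_inference_scripts is hypothetical and not part of this repository), the matching scripts can be listed like this:

```python
import glob
import os


def list_inference_scripts(directory="."):
    """Return the sorted file names in `directory` that match inference_*.py."""
    pattern = os.path.join(directory, "inference_*.py")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


if __name__ == "__main__":
    # Run from examples/VIP_ours/ to see which scripts the wildcard covers.
    for name in list_inference_scripts():
        print(name)
```

Each printed name can then be passed to python3 on its own.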
