Accepted by MICCAI 2024.
This code repository is based on the official Chinese-CLIP (LINK).
| Paper |
- 2024.7.16 Fix several bugs in the code.
- 2024.7.10 Release the pretrained model using ViT-b-16 as the vision backbone.
- More updates coming soon...
To get started with this project, make sure that your environment meets the following requirements:
- python >= 3.6.4
- pytorch >= 1.8.0 (with torchvision >= 0.9.0)
- CUDA Version >= 10.2
Run the following command to install the required packages:
pip install -r requirements.txt
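After installation, the version requirements listed above can be checked with a short script. The following is only a minimal sketch and not part of this repository: the helper names `parse_version`, `meets`, and `check_environment` are hypothetical, and it assumes that `torch.version.cuda` reports the CUDA version PyTorch was built against (it is `None` on CPU-only builds).

```python
import re
import sys


def parse_version(text):
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1)."""
    parts = []
    # Drop any local build tag after '+', then read the leading digits of each piece.
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets(found, required):
    """Return True if version string `found` is at least `required`."""
    a, b = parse_version(found), parse_version(required)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.8' compares equal to '1.8.0'.
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    if not meets(python_version, "3.6.4"):
        problems.append(f"python {python_version} < 3.6.4")
    try:
        import torch
        import torchvision
    except ImportError as exc:
        problems.append(f"missing package: {exc.name}")
        return problems
    if not meets(torch.__version__, "1.8.0"):
        problems.append(f"pytorch {torch.__version__} < 1.8.0")
    if not meets(torchvision.__version__, "0.9.0"):
        problems.append(f"torchvision {torchvision.__version__} < 0.9.0")
    cuda = torch.version.cuda  # assumption: None means a CPU-only build
    if cuda is None:
        problems.append("PyTorch was built without CUDA support")
    elif not meets(cuda, "10.2"):
        problems.append(f"CUDA {cuda} < 10.2")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("Environment OK" if not issues else "\n".join(issues))
```

Saving this as, for example, `check_env.py` and running `python check_env.py` prints either `Environment OK` or one line per unmet requirement.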
If you encounter any issue while downloading or using the pretrained model, please feel free to contact us.
Vision Backbone | Text Backbone | Download |
---|---|---|
ViT-b-16 | RoBERTa-wwm-ext-base-chinese | LINK |