# 2nd Place Solution for FGVC9 with PyTorch Implementation

## Device

- ViT-large on 8 × Titan XP (12 GB)
- ViT-huge on 8 × 3090 Ti (24 GB)

## Requirements

- Python 3.7
- PyTorch 1.7.1+cu101
- torchvision 0.8.2
- timm 0.3.2

## Preparation

1. Get the SnakeCLEF 2022 dataset and arrange it as follows:

   ```
   root/
    ├─ SnakeCLEF2022-ISOxSpeciesMapping.csv
    ├─ train/
    │  ├─ SnakeCLEF2022-TrainMetadata.csv
    │  ├─ SnakeCLEF2022-small_size/
    │  ├─ SnakeCLEF2022-medium_size/
    │  └─ SnakeCLEF2022-large_size/
    └─ test/
        ├─ SnakeCLEF2022-TestMetadata.csv
        └─ SnakeCLEF2022-large_size/
   ```

2. Get the MAE pretrained models following `README_MAE.md`

3. Calculate samples per class

   ```bash
   python preprocess_sample_per_class.py
   ```

   Output: `./preprocessing/sample_per_class.json`
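The per-class counts produced in step 3 feed the class prior used by the logit-adjusted loss during training. A minimal sketch of what `preprocess_sample_per_class.py` likely computes, assuming the train metadata CSV carries a `class_id` column (the column name is an assumption, not confirmed by the repo):

```python
import csv
import json
from collections import Counter

def count_samples_per_class(metadata_csv, out_json):
    """Count training images per class and dump the mapping to JSON."""
    counts = Counter()
    with open(metadata_csv, newline="") as f:
        for row in csv.DictReader(f):
            counts[row["class_id"]] += 1  # column name is an assumption
    with open(out_json, "w") as f:
        json.dump(dict(counts), f)
    return counts
```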

4. Preprocess metadata

   ```bash
   python preprocess_endemic_metadata.py
   ```

   Output: `./preprocessing/endemic_label.json`

   ```bash
   python preprocess_code_metadata.py
   ```

   Outputs: `./preprocessing/code_label_train.json`, `./preprocessing/code_label_test.json`
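These scripts turn the location metadata into lookup tables for the location prior. A rough sketch of the country-code mapping they might build, assuming each metadata row carries a `code` (country) field; the real column names and JSON layout may differ:

```python
import csv
import json

def build_code_labels(metadata_csv, out_json):
    """Map each country code to a contiguous integer id and dump it to JSON."""
    codes = set()
    with open(metadata_csv, newline="") as f:
        for row in csv.DictReader(f):
            codes.add(row["code"])  # field name is an assumption
    mapping = {c: i for i, c in enumerate(sorted(codes))}
    with open(out_json, "w") as f:
        json.dump(mapping, f)
    return mapping
```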


## ViT-large

### Train

```bash
python -m torch.distributed.launch --nproc_per_node=8 main_finetune.py \
  --accum_iter 4 \
  --batch_size 2 \
  --input_size 432 \
  --model vit_large_patch16 \
  --epochs 50 \
  --blr 1e-3 \
  --layer_decay 0.75 \
  --weight_decay 0.05 --drop_path 0.2 --reprob 0.25 --mixup 0.8 --cutmix 1.0 \
  --root root/to/your/data \
  --data snakeclef2022 \
  --nb_classes 1572 \
  --log_dir ./log_dir/vit_large_patch16_432_50e \
  --output_dir ./output_dir/vit_large_patch16_432_50e \
  --finetune ./pretrained_model/mae_pretrain_vit_large.pth \
  --use_prior --loss LogitAdjustment
```
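`--use_prior --loss LogitAdjustment` trains with logit adjustment to counter the long-tailed class distribution: each logit is shifted by `tau * log(prior)` before the cross-entropy. A minimal NumPy sketch of the adjustment (the repository's implementation and its choice of `tau` may differ):

```python
import numpy as np

def logit_adjust(logits, class_counts, tau=1.0):
    """Shift logits by tau * log(class prior), as in logit-adjusted training.

    Head classes receive a larger additive offset, so the model must produce
    a genuinely larger raw logit for a tail class to win at inference.
    """
    prior = class_counts / class_counts.sum()
    return logits + tau * np.log(prior)
```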

### Test and generate submission file

```bash
python main_finetune.py \
  --accum_iter 4 \
  --batch_size 64 \
  --input_size 432 \
  --model vit_large_patch16 \
  --epochs 50 \
  --blr 1e-3 \
  --layer_decay 0.75 \
  --weight_decay 0.05 --drop_path 0.2 --reprob 0.25 --mixup 0.8 --cutmix 1.0 \
  --root root/to/your/data \
  --data snakeclef2022 \
  --nb_classes 1572 \
  --log_dir ./log_dir/vit_large_patch16_432_50e \
  --output_dir ./output_dir/vit_large_patch16_432_50e \
  --resume ./output_dir/vit_large_patch16_432_50e/checkpoint-xx.pth \
  --use_prior --loss LogitAdjustment \
  --eval --test \
  --tencrop --crop_pct 0.875
```
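`--tencrop --crop_pct 0.875` enables ten-crop test-time augmentation: four corner crops, a centre crop, and the horizontal flips of all five, whose predictions are averaged. A hedged NumPy sketch of the crop generation (the repo itself presumably uses torchvision transforms rather than this helper):

```python
import numpy as np

def ten_crop(img, size):
    """img: (H, W, C) array. Return 10 crops: 4 corners + centre, plus h-flips."""
    h, w = img.shape[:2]
    tl = img[:size, :size]
    tr = img[:size, w - size:]
    bl = img[h - size:, :size]
    br = img[h - size:, w - size:]
    ct = img[(h - size) // 2:(h - size) // 2 + size,
             (w - size) // 2:(w - size) // 2 + size]
    crops = [tl, tr, bl, br, ct]
    crops += [c[:, ::-1] for c in crops]  # horizontal flips of the five crops
    return np.stack(crops)
```

The model's logits (or probabilities) for the ten crops are then averaged per image before the argmax.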

## ViT-huge

### Train

```bash
python -m torch.distributed.launch --nproc_per_node=8 main_finetune.py \
  --accum_iter 4 \
  --batch_size 2 \
  --input_size 392 \
  --model vit_huge_patch14 \
  --epochs 45 \
  --blr 1e-3 \
  --layer_decay 0.8 \
  --weight_decay 0.05 --drop_path 0.2 --reprob 0.25 --mixup 0.8 --cutmix 1.0 \
  --root root/to/your/data \
  --data snakeclef2022 \
  --nb_classes 1572 \
  --log_dir ./log_dir/vit_huge_patch14_392_40e \
  --output_dir ./output_dir/vit_huge_patch14_392_40e \
  --finetune ./pretrained_model/mae_pretrain_vit_huge.pth \
  --use_prior --loss LogitAdjustment
```

### Test and generate submission file

```bash
python main_finetune.py \
  --accum_iter 4 \
  --batch_size 64 \
  --input_size 392 \
  --model vit_huge_patch14 \
  --epochs 45 \
  --blr 1e-3 \
  --layer_decay 0.8 \
  --weight_decay 0.05 --drop_path 0.2 --reprob 0.25 --mixup 0.8 --cutmix 1.0 \
  --root root/to/your/data \
  --data snakeclef2022 \
  --nb_classes 1572 \
  --log_dir ./log_dir/vit_huge_patch14_392_40e \
  --output_dir ./output_dir/vit_huge_patch14_392_40e \
  --resume /data/mae/output_dir/vit_huge_patch14_392_40e/checkpoint-xx.pth \
  --use_prior --loss LogitAdjustment \
  --eval --test \
  --tencrop --crop_pct 0.875
```

## Ensemble

```bash
python ensemble.py
```
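A plausible sketch of what `ensemble.py` does: average the softmax probabilities from the three models' test logits and take the argmax. The weighting and input format here are assumptions, not the script's confirmed behaviour:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_predict(model_logits):
    """model_logits: list of (N, C) arrays -> (N,) predicted class ids."""
    probs = np.mean([softmax(l) for l in model_logits], axis=0)
    return probs.argmax(axis=1)
```

Averaging probabilities rather than raw logits keeps models with different logit scales from dominating the vote.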

## Results

| model | resolution | public | private | checkpoint |
| --- | --- | --- | --- | --- |
| ViT-large | 384 | 0.87996 | 0.81997 | [Google] |
| ViT-large | 432 | 0.89173 | 0.83063 | [Google] |
| ViT-huge | 392 | 0.89449 | 0.84057 | [Google] |
| Ensemble | -- | 0.89822 | 0.84565 | -- |