Official implementation of the paper Pruning from Scratch.
- pytorch == 1.1.0
- torchvision == 0.2.2
- apex @ commit: 574fe24
- learning channel importance gates from randomly initialized weights
python script/ -a ARCH --gpu GPU_ID --seed SEED -s SPARSITY -e EXPANSION
where ARCH is the network architecture type, SPARSITY is the sparsity ratio, and EXPANSION is the expansion channel number of the initial conv layer.
- pruning based on channel gates
python script/ -a ARCH --gpu GPU_ID --seed SEED -s SPARSITY -e EXPANSION -p RATIO
where RATIO is the MACs reduction ratio of the pruned model; a larger ratio indicates a more compact model.
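To make the MACs reduction ratio concrete, here is a small worked example in plain Python. It is only a sketch: the two-layer network, the helper `conv_macs`, and the reading of RATIO as "fraction of MACs removed" are assumptions for illustration, and the pruning script may define the ratio differently.

```python
def conv_macs(c_in, c_out, kernel, out_h, out_w):
    """Multiply-accumulates of one standard 2D convolution (groups=1, no bias)."""
    return c_in * c_out * kernel * kernel * out_h * out_w


# Hypothetical two-layer network operating on 32x32 feature maps.
full_macs = conv_macs(3, 64, 3, 32, 32) + conv_macs(64, 64, 3, 32, 32)

# Pruning the first layer from 64 to 32 output channels also halves
# the input channels seen by the second layer.
pruned_macs = conv_macs(3, 32, 3, 32, 32) + conv_macs(32, 64, 3, 32, 32)

# Assumed definition: fraction of the full model's MACs that was removed.
reduction = 1 - pruned_macs / full_macs
print(f"full={full_macs} pruned={pruned_macs} reduction={reduction:.2f}")
```

Under this assumed definition, removing half of one layer's channels reduces the MACs of both the layer itself and its successor, which is why channel pruning compounds across layers.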
- training pruned model from scratch
python script/ -a ARCH --gpu GPU_ID --seed SEED -s SPARSITY -e EXPANSION -p RATIO --budget_train
where --budget_train activates the budget training scheme (Scratch-B) proposed in Rethinking the Value of Network Pruning, which trains the pruned model with the same computation budget as the full model. Empirically, this training scheme is crucial for improving the pruned model's performance.
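The idea of training under an equal computation budget can be sketched as follows. This is a simplified illustration, assuming the epoch count is scaled in proportion to the MACs saving; the function name `scratch_b_epochs` is hypothetical and the training script may apply a different rule (for example, capping the scaling factor).

```python
def scratch_b_epochs(full_epochs, full_macs, pruned_macs):
    """Epochs that give the pruned model roughly the same total compute
    as training the full model for `full_epochs` epochs (assumed rule)."""
    if pruned_macs <= 0 or full_macs <= 0:
        raise ValueError("MACs must be positive")
    return round(full_epochs * full_macs / pruned_macs)


# A pruned model with half the MACs gets twice as many epochs.
epochs = scratch_b_epochs(full_epochs=160, full_macs=2.0e8, pruned_macs=1.0e8)
print(epochs)
```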
- prepare the ImageNet dataset following the instructions in link, which results in an imagenet folder with train and val sub-folders.
- generate image index by
python script/ --data_dir IMAGENET_DATA_DIR/train --dump_path data/train_images_list.pkl
python script/ --data_dir IMAGENET_DATA_DIR/val --dump_path data/val_images_list.pkl
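For readers curious what such an image index might contain, the sketch below walks a class-per-subfolder directory and pickles a list of (path, label) pairs. The functions `build_image_index` and `dump_index`, the pair format, and the demo folder names are all assumptions for illustration; the repository's script may store a different structure.

```python
import os
import pickle
import tempfile


def build_image_index(data_dir):
    """Return (image_path, class_index) pairs, one class per sub-folder."""
    classes = sorted(
        d for d in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, d))
    )
    index = []
    for label, cls in enumerate(classes):
        cls_dir = os.path.join(data_dir, cls)
        for name in sorted(os.listdir(cls_dir)):
            index.append((os.path.join(cls_dir, name), label))
    return index


def dump_index(data_dir, dump_path):
    """Pickle the index to dump_path and return the number of images."""
    index = build_image_index(data_dir)
    with open(dump_path, "wb") as f:
        pickle.dump(index, f)
    return len(index)


# Tiny self-contained demo on a throwaway directory (hypothetical class names).
with tempfile.TemporaryDirectory() as root:
    for cls in ("class_a", "class_b"):
        os.makedirs(os.path.join(root, "train", cls))
        open(os.path.join(root, "train", cls, "img0.JPEG"), "wb").close()
    dump_path = os.path.join(root, "train_images_list.pkl")
    count = dump_index(os.path.join(root, "train"), dump_path)
    with open(dump_path, "rb") as f:
        loaded = pickle.load(f)
```

Building the index once avoids re-scanning the directory tree every time training starts.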
- learning channel importance gates from randomly initialized weights
python script/ -a ARCH --gpu GPU_ID -s SPARSITY -e EXPANSION -m MULTIPLIER
where EXPANSION and MULTIPLIER set the channel widths: one is used to control the expansion of the channel number on the backbone outputs, the other to enlarge the intermediate channel numbers in InvertedResidual and Bottleneck blocks.
- pruning based on channel gates
python script/ -a ARCH --gpu GPU_ID -s SPARSITY -e EXPANSION -m MULTIPLIER -p RATIO
- training pruned model from scratch (single node multiple gpus)
python -m torch.distributed.launch --nproc_per_node=NUM_GPU script/ \
-b TRAIN_BATCH_SIZE --lr LR --wd WD --lr_scheduler SCHEDULER \
--budget_train --label_smooth
where SCHEDULER is the learning rate scheduler type: 'multistep' for ResNet50, 'cos' for MobileNets.
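The two scheduler types differ in how the learning rate decays over training. The sketch below shows the usual form of each in plain Python; the function names, the milestones (30, 60, 90) and the decay factor 0.1 are illustrative assumptions, not values taken from the training script.

```python
import math


def multistep_lr(base_lr, epoch, milestones=(30, 60, 90), gamma=0.1):
    """Step decay: multiply by gamma each time a milestone epoch is reached."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


def cosine_lr(base_lr, epoch, total_epochs):
    """Cosine annealing from base_lr at epoch 0 down to 0 at total_epochs."""
    return 0.5 * base_lr * (1 + math.cos(math.pi * epoch / total_epochs))


for epoch in (0, 30, 50, 90):
    print(epoch, multistep_lr(0.1, epoch), cosine_lr(0.1, epoch, 100))
```

Step decay keeps the rate constant between milestones, while cosine annealing lowers it smoothly at every epoch.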
title={Pruning from Scratch},
author={Wang, Yulong and Zhang, Xiaolu and Xie, Lingxi and Zhou, Jun and Su, Hang and Zhang, Bo and Hu, Xiaolin},
booktitle={Proceedings of the 29th International Joint Conference on Artificial Intelligence},
publisher={AAAI Press},
address={New York, USA}