Keras implementation for the CVPR 2017 workshop paper Self-Supervised Neural Aggregation Networks for Human Parsing
This code implements three kinds of models for the LIP human parsing dataset.
Currently, only the re-implementation of the original SS-NAN method is available.
Pixel Accuracy | Mean Accuracy | Mean IoU |
---|---|---|
85.8% | 58.1% | 47.90% |
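For readers unfamiliar with the three columns, here is a minimal sketch of how pixel accuracy, mean accuracy, and mean IoU are commonly computed from a confusion matrix. It follows the standard definitions and is not necessarily how LIP.py evaluates. The function names `confusion_matrix` and `parsing_metrics` are hypothetical, and the sketch assumes integer label maps with predictions already in the range `[0, num_classes)`.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes):
    """Count pixels per (ground-truth, predicted) label pair.

    Rows index the ground-truth class, columns the predicted class.
    Pixels whose ground-truth label is outside [0, num_classes) are skipped.
    """
    gt = np.asarray(gt).ravel()
    pred = np.asarray(pred).ravel()
    valid = (gt >= 0) & (gt < num_classes)
    # Encode each (gt, pred) pair as a single index, then histogram it.
    idx = num_classes * gt[valid].astype(np.int64) + pred[valid].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def parsing_metrics(cm):
    """Return (pixel accuracy, mean accuracy, mean IoU) for a confusion matrix."""
    cm = cm.astype(np.float64)
    tp = np.diag(cm)              # correctly labelled pixels per class
    gt_total = cm.sum(axis=1)     # ground-truth pixels per class
    pred_total = cm.sum(axis=0)   # predicted pixels per class
    pixel_acc = tp.sum() / cm.sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        class_acc = tp / gt_total
        iou = tp / (gt_total + pred_total - tp)
    # Classes absent from both ground truth and prediction give NaN
    # and are left out of the means.
    return pixel_acc, np.nanmean(class_acc), np.nanmean(iou)
```

To score a whole validation set, sum the per-image confusion matrices first and call `parsing_metrics` once on the total.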
- keras 2.0.9
- tensorflow 1.3.0
- python 3.5.4
- Anaconda=5 (not necessary, just for convenience)
Please download the LIP dataset
python LIP.py evaluate --model path_to_model.h5 --dataset dataset_path/Single_Person --evalnum 0
Setting evalnum to 0 evaluates on the whole validation set. A positive evalnum gives the number of images to use for evaluation.
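The evalnum convention can be illustrated with a small sketch. The helper `select_eval_images` is hypothetical and not part of LIP.py. The description above does not say whether a positive evalnum takes the first images or a random sample, so this sketch assumes the first ones.

```python
def select_eval_images(image_ids, evalnum=0):
    """Pick the images to evaluate on.

    evalnum == 0 keeps every image; a positive evalnum keeps that many.
    Assumption: the first `evalnum` images are used (not a random sample).
    """
    if evalnum < 0:
        raise ValueError("evalnum must be 0 or a positive integer")
    ids = list(image_ids)
    if evalnum == 0:
        return ids
    return ids[:evalnum]
```

A value larger than the validation set simply returns every image.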
Run demo.py to test on specific images (the main procedure is a call to model.detect()).
python LIP.py train --model path_to_model.h5 --dataset dataset_path/Single_Person trainmode pretrain
Three trainmodes are available: pretrain, finetune, and finetune_ssloss_withdeep. They correspond to the three steps introduced in the paper Self-Supervised Neural Aggregation Networks for Human Parsing.
Step1: download pspnet_pretrainweights, then set the parameters of model.train()
epochs=40,layers='all'
run the pretraining
python LIP.py train --model pspnet --dataset dataset_path/Single_Person trainmode pretrain
Step2: set the parameters of model.train()
epochs=30,layers='head'
train the Neural Aggregation Networks, starting from the best model generated in step 1 (called pretrain.h5 here)
python LIP.py train --model pretrain.h5 --dataset dataset_path/Single_Person trainmode finetune
Step3: set the parameters of model.train()
epochs=30,layers='psp5+'
train with the Self-Supervised Loss, starting from the best model generated in step 2 (called finetune.h5 here)
python LIP.py train --model finetune.h5 --dataset dataset_path/Single_Person trainmode finetune_ssloss_withdeep
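The three steps above can be summarized in one place. The following is only a sketch that collects the settings listed in this README and prints the matching command lines. It does not call LIP.py or model.train(). The `Stage` tuple and `stage_command` helper are hypothetical, and pretrain.h5 and finetune.h5 are placeholder names for the best checkpoint of the previous step.

```python
from collections import namedtuple

# One entry per training step: the trainmode, the model.train() settings,
# and the weights the step starts from.
Stage = namedtuple("Stage", ["trainmode", "epochs", "layers", "init_model"])

STAGES = [
    Stage("pretrain", 40, "all", "pspnet"),
    Stage("finetune", 30, "head", "pretrain.h5"),
    Stage("finetune_ssloss_withdeep", 30, "psp5+", "finetune.h5"),
]


def stage_command(stage, dataset="dataset_path/Single_Person"):
    """Build the command line for one step, in the form used in this README."""
    return "python LIP.py train --model {} --dataset {} trainmode {}".format(
        stage.init_model, dataset, stage.trainmode
    )


if __name__ == "__main__":
    for number, stage in enumerate(STAGES, start=1):
        print("Step{}: epochs={}, layers='{}'".format(
            number, stage.epochs, stage.layers))
        print("  " + stage_command(stage))
```

Note that epochs and layers are still set by editing the model.train() call before each step, as described above. The sketch only records which values go with which trainmode.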
The final Pretrain_model can be downloaded here
Some code is borrowed from the Mask R-CNN implementation.
@inproceedings{Gong2017Look,
title={Look into Person: Self-Supervised Structure-Sensitive Learning and a New Benchmark for Human Parsing},
author={Gong, Ke and Liang, Xiaodan and Zhang, Dongyu and Shen, Xiaohui and Lin, Liang},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
pages={6757-6765},
year={2017},
}
@inproceedings{Zhao2017Self,
title={Self-Supervised Neural Aggregation Networks for Human Parsing},
author={Zhao, Jian and Li, Jianshu and Nie, Xuecheng and Zhao, Fang and Chen, Yunpeng and Wang, Zhecan and Feng, Jiashi and Yan, Shuicheng},
booktitle={Computer Vision and Pattern Recognition Workshops},
pages={1595-1603},
year={2017},
}