By Xinghao Chen, Guijin Wang, Hengkai Guo, Cairong Zhang, Tsinghua University.
[ScienceDirect] [arXiv] [Project Page]
* Demos above are realtime results from Intel Realsense SR300 using models trained on Hands17 dataset.
* See more demos using pre-trained models on ICVL, NYU and MSRA in src/demo.
This repository contains the demo code for Pose-REN, an accurate and fast method for depth-based 3D hand pose estimation.
If you find our work useful in your research, please consider citing:
@article{chen2018pose,
title={Pose Guided Structured Region Ensemble Network for Cascaded Hand Pose Estimation},
author={Chen, Xinghao and Wang, Guijin and Guo, Hengkai and Zhang, Cairong},
journal={Neurocomputing},
doi={https://doi.org/10.1016/j.neucom.2018.06.097},
year={2018}
}
- caffe-pose
- OpenCV (with python interface)
- Optional: librealsense (for live demo only)
Clone caffe-pose:
git clone https://github.com/xinghaochen/caffe-pose.git
Install caffe:
cd caffe-pose
cp Makefile.config.example Makefile.config
# uncomment WITH_PYTHON_LAYER := 1
# change other settings accordingly
make -j16
make pycaffe -j16
Add path/to/caffe-pose/python to PYTHONPATH.
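As a quick sanity check after the build, the snippet below is a minimal sketch of how one might confirm that pycaffe is visible to Python. The helper name `add_to_path_and_check` is hypothetical and not part of this repository; `path/to/caffe-pose/python` is the same placeholder used above and must be replaced with your own path.

```python
import importlib.util
import os
import sys


def add_to_path_and_check(module_name, extra_dir=None):
    """Optionally prepend extra_dir to sys.path, then report whether
    module_name can be located (hypothetical helper, not from this repo)."""
    if extra_dir is not None:
        extra_dir = os.path.abspath(extra_dir)
        if extra_dir not in sys.path:
            sys.path.insert(0, extra_dir)
    # find_spec returns None when a top-level module cannot be found
    return importlib.util.find_spec(module_name) is not None


# Placeholder path from the text above; replace with your caffe-pose checkout.
print("pycaffe importable:", add_to_path_and_check("caffe", "path/to/caffe-pose/python"))
```

Setting PYTHONPATH in your shell remains the persistent option; modifying `sys.path` as above only affects the current Python process.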
We use a new layer called GenerateROILayer in Pose-REN. Its Python and C++ implementations are located in src/libs.
If you prefer the Python layer, add path/to/src/libs to PYTHONPATH. Otherwise, copy generate_roi_layer.hpp/cpp to caffe-pose, update caffe.proto with the provided patch caffe.patch.proto, and build caffe again.
The table below lists the predicted labels and pre-trained models for the ICVL, NYU and MSRA datasets, plus pre-trained models for HANDS17. All labels are in the format (u, v, d), where u and v are pixel coordinates and d is the depth.
Dataset | Predicted Labels | Models |
---|---|---|
ICVL | Download | [Google Drive] or [Baidu Cloud] |
NYU | Download | [Google Drive] or [Baidu Cloud] |
MSRA | Download | [Google Drive] or [Baidu Cloud] |
HANDS17 | - | [Google Drive] or [Baidu Cloud] |
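To make the (u, v, d) label format concrete, here is a minimal parsing sketch. It assumes each line of a label file holds one frame as whitespace-separated numbers, flattened as u v d for each joint in turn. That layout is an assumption, so check it against the downloaded files. The function names `parse_label_line` and `load_labels` are hypothetical.

```python
import numpy as np


def parse_label_line(line):
    """Turn one whitespace-separated line into a (J, 3) array of (u, v, d).

    Assumes the line is flattened as u1 v1 d1 u2 v2 d2 ... (an assumption
    about the file layout, not confirmed by this README).
    """
    values = [float(v) for v in line.split()]
    if len(values) == 0 or len(values) % 3 != 0:
        raise ValueError("expected a non-empty multiple of 3 values, got %d" % len(values))
    return np.array(values, dtype=np.float64).reshape(-1, 3)


def load_labels(path):
    """Read every non-blank line of a label file into a list of (J, 3) arrays."""
    with open(path) as f:
        return [parse_label_line(line) for line in f if line.strip()]


# Toy line with two joints; the numbers are made up for illustration.
joints = parse_label_line("160.0 120.0 400.0 170.5 110.0 395.0")
print(joints.shape)
```

Each row of the returned array is one joint, so `joints[:, :2]` gives the pixel coordinates and `joints[:, 2]` the depth values.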
Please use the Python script src/show_result.py to visualize the predicted results:
$ python src/show_result.py icvl your/path/to/ICVL/test/Depth --in_file=results/NEUCOM18_ICVL_Pose_REN.txt
You can see all the testing results on the images. Press 'q' to exit.
First copy and modify the example config.py for your setup. Please change data_dir and anno_dir accordingly.
$ cp config.py.example config.py
Use the Python script src/testing/predict.py for prediction with predefined centers in the labels directory:
$ python src/testing/predict.py icvl your/path/to/output/file.txt
The script depends on pycaffe.
Please see here for how to evaluate performance of hand pose estimation.
We provide a realtime hand pose estimation demo using an Intel Realsense device. Note that we use only a naive depth thresholding method to detect the hand, so the hand must be within the range [0, 650mm] for this demo to work. We tested this realtime demo with an Intel Realsense SR300.
Please use your right hand for this demo, and try to avoid a cluttered foreground and keep as little of the arm as possible in view around the hand.
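The depth thresholding idea can be sketched as follows. This is an illustration under stated assumptions, not the demo's actual code: the function name `segment_hand` is hypothetical, zero depth is treated as a missing measurement, and the center-of-mass step is one plausible way to obtain a hand center.

```python
import numpy as np


def segment_hand(depth_mm, near=0.0, far=650.0):
    """Keep pixels whose depth lies in (near, far] millimetres.

    Returns a boolean mask and the mask's center of mass as (u, v, d),
    or None for the center when no pixel falls inside the range.
    Zero depth is treated as "no measurement" (an assumption).
    """
    depth_mm = np.asarray(depth_mm, dtype=np.float32)
    mask = (depth_mm > near) & (depth_mm <= far)
    if not mask.any():
        return mask, None
    vs, us = np.nonzero(mask)  # row indices are v, column indices are u
    center = np.array([us.mean(), vs.mean(), depth_mm[mask].mean()])
    return mask, center


# Toy 4x4 depth map: background at 1000 mm, a 2x2 "hand" at 400 mm.
depth = np.full((4, 4), 1000.0)
depth[1:3, 1:3] = 400.0
mask, center = segment_hand(depth)
print(int(mask.sum()), center)
```

Because everything closer than 650mm is kept, any object or stretch of arm in that range ends up in the mask, which is why the demo asks for an uncluttered foreground.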
Python demo with librealsense [recommended]
First compile and install librealsense and its Python wrapper. Once everything is working properly, run the following Python script for the demo:
python src/demo/realsense_realtime_demo_librealsense2.py
By default this script uses weights pre-trained on the ICVL dataset. You can select a different pre-trained model by specifying one of the datasets:
python src/demo/realsense_realtime_demo_librealsense2.py nyu/msra/icvl/hands17
Note: The speed of this Python demo is not optimal; it runs slightly slower than the C++ demo.
First compile and build:
cd src/demo/pose-ren-demo-cpp
mkdir build
cd build
cmake ..
make -j16
Run the demo by:
cd .. # go back to src/demo/pose-ren-demo-cpp
./build/src/PoseREN # run
By default it uses weights pre-trained on the Hands17 dataset. You can select a different pre-trained model by specifying one of the datasets:
./build/src/PoseREN nyu/msra/icvl/hands17
Note: This C++ demo is not fully developed, and you may have to resolve some dependency problems to make it work. It serves as a preliminary project to demonstrate how to use Pose-REN in C++.
Our code and pre-trained models are available for non-commercial research purposes only.
chenxinghaothu at gmail.com