This contains the fine-tuning real-world demonstrations collected via the PerAct demonstration collection interface.
This contains the real-world demonstrations for all nine tasks described in the paper, used to evaluate baseline performance on real-world collected data.
This creates the RGB video data used to train visuomotor policies, starting from the RGB data extracted from the AR2-D2 interface.
AR2-D2 uses my Detectron2DeepSortPlus fork to track the hand in the RGB video data for segmentation.
Follow INSTALL.md for installation instructions and download the trained models listed in MODEL_ZOO.md.
cd <install_dir>
git clone -b peract https://github.com/jiafei1224/Detectron2DeepSortPlus.git
cd Detectron2DeepSortPlus
python yl2ds.py --input <PATH_TO_VIDEO> --tracker deepsort --weights <PATH_TO_WEIGHTS> --out_vid <PATH_TO_OUTPUT_VIDEO> --device 'cuda:0'
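The tracked hand boxes feed the segmentation step. As a minimal sketch, assuming the tracker writes MOT-style text lines (`frame,id,x,y,w,h,score,...`; the actual output format and path of yl2ds.py may differ), the boxes can be grouped by frame for downstream mask prompting:

```python
# Sketch only: assumes MOT-style tracking output; adapt to yl2ds.py's real format.
from collections import defaultdict

def load_hand_boxes(track_file):
    """Group tracked hand bounding boxes by frame index as (x0, y0, x1, y1)."""
    boxes = defaultdict(list)
    with open(track_file) as f:
        for line in f:
            frame, _, x, y, w, h = (float(v) for v in line.strip().split(",")[:6])
            boxes[int(frame)].append((x, y, x + w, y + h))
    return boxes

hand_boxes = load_hand_boxes("tracks.txt")  # hypothetical output path
```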
AR2-D2 uses my Segment Anything fork to generate the segmentation masks for the human hand.
Follow the instructions in the SAM repo to install.
cd <install_dir>
git clone -b peract https://github.com/jiafei1224/segment-anything.git
cd segment-anything
python find_mask.py
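For reference, here is a minimal sketch of what this step amounts to using the stock Segment Anything API: prompting the predictor with a tracked hand box to get a binary mask. The checkpoint, model type, file names, and example box are placeholders, not the actual contents of find_mask.py.

```python
# Sketch only: one frame, one tracked hand box -> one binary hand mask.
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth").to("cuda:0")
predictor = SamPredictor(sam)

frame = cv2.cvtColor(cv2.imread("frame_0000.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(frame)

hand_box = np.array([320, 180, 480, 360])  # example (x0, y0, x1, y1) from the tracking step
masks, scores, _ = predictor.predict(box=hand_box, multimask_output=False)
cv2.imwrite("mask_0000.png", masks[0].astype(np.uint8) * 255)
```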
AR2-D2 uses my E2FGVI fork to inpaint the hand out of the RGB video data.
Follow their instructions to install and set up. Alternatively, you can run the notebook E2FGVI_Verison.ipynb.
cd <install_dir>
git clone https://github.com/jiafei1224/E2FGVI.git
conda env create -f environment.yml
conda activate e2fgvi
python test.py --model e2fgvi (or e2fgvi_hq) --video <PATH_TO_RGB_VIDEO> --mask <PATH_TO_GENERATED_MASK> --ckpt release_model/E2FGVI-CVPR22.pth (or release_model/E2FGVI-HQ-CVPR22.pth)
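E2FGVI expects one mask image per video frame in the --mask folder. As an optional preprocessing sketch (folder names and kernel size are illustrative assumptions, not taken from the repo), slightly dilating each hand mask helps the inpainting cover blur at the hand boundary:

```python
# Sketch only: dilate per-frame hand masks before passing them to E2FGVI.
import glob
import os
import cv2
import numpy as np

mask_dir, out_dir = "masks_raw", "masks_dilated"   # hypothetical folder names
os.makedirs(out_dir, exist_ok=True)
kernel = np.ones((15, 15), np.uint8)               # dilation size is a guess; tune it

for path in sorted(glob.glob(os.path.join(mask_dir, "*.png"))):
    mask = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    cv2.imwrite(os.path.join(out_dir, os.path.basename(path)),
                cv2.dilate(mask, kernel, iterations=1))
```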
This creates the 3D voxelized input used to train behavior-cloning (BC) agents such as PerAct, starting from the RGB-D data extracted from the AR2-D2 interface.
cd Process_3DData
python process.py  # Outputs two folders: front_rgb and front_depth
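A minimal sketch of consuming the two output folders, assuming process.py writes paired PNG frames with matching filenames and 16-bit depth in millimetres (both conventions are assumptions for illustration):

```python
# Sketch only: load paired RGB / depth frames produced by process.py.
import glob
import os
import cv2

for rgb_path in sorted(glob.glob("front_rgb/*.png")):
    depth_path = os.path.join("front_depth", os.path.basename(rgb_path))
    rgb = cv2.imread(rgb_path)                            # H x W x 3, uint8
    depth = cv2.imread(depth_path, cv2.IMREAD_UNCHANGED)  # assumed uint16, millimetres
    depth_m = depth.astype("float32") / 1000.0            # metres for voxelization
    # rgb + depth_m are the per-frame inputs a voxel-based BC agent (e.g. PerAct) consumes.
```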