Author: CAO RUI
An implementation of the reach-target task using Twin Delayed DDPG (TD3) with Hindsight Experience Replay (HER), built on PaddlePaddle.
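As a rough illustration of the HER idea named above, the sketch below relabels a finished episode with the "final" strategy: every transition is stored twice, once with the original goal and once with the goal replaced by the position actually reached at the end of the episode. This is not taken from the training script in this repository. The transition layout, the function names `sparse_reward` and `her_relabel_final`, the 0/-1 reward, and the 0.05 tolerance are all assumptions made for the example.

```python
import math


def sparse_reward(achieved, goal, tol=0.05):
    """Assumed sparse reward: 0.0 within `tol` of the goal, otherwise -1.0."""
    dist = math.sqrt(sum((a - g) ** 2 for a, g in zip(achieved, goal)))
    return 0.0 if dist <= tol else -1.0


def her_relabel_final(episode):
    """Return the original transitions plus one relabeled copy of each.

    `episode` is a list of dicts with the (hypothetical) keys
    "obs", "action", "achieved", "goal" and "reward".
    The relabeled copy uses the last achieved position as its goal.
    """
    final_goal = episode[-1]["achieved"]
    buffer = []
    for step in episode:
        buffer.append(step)  # original transition, original goal
        relabeled = dict(step)  # shallow copy so the original is untouched
        relabeled["goal"] = final_goal
        relabeled["reward"] = sparse_reward(step["achieved"], final_goal)
        buffer.append(relabeled)
    return buffer


# A two-step toy episode that never reaches its real goal.
episode = [
    {"obs": (0.0, 0.0, 0.0), "action": (0.1, 0.0, 0.0),
     "achieved": (0.1, 0.0, 0.0), "goal": (1.0, 1.0, 1.0), "reward": -1.0},
    {"obs": (0.1, 0.0, 0.0), "action": (0.1, 0.0, 0.0),
     "achieved": (0.2, 0.0, 0.0), "goal": (1.0, 1.0, 1.0), "reward": -1.0},
]
buffer = her_relabel_final(episode)
```

In this toy episode the agent never reaches the real goal, yet the relabeled copy of the last transition carries a success reward, which is the extra learning signal HER gives an off-policy learner such as TD3.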
Python: Python 3.6+
PaddlePaddle: Deep learning framework
PARL : Reinforcement learning toolbox based on PaddlePaddle
gym : Universal environment builder for RL tasks
RLBench: RL task extension for robotics research
First, create a virtual environment with virtualenv, then install PaddlePaddle, gym, and PARL in it with:
pip install -r requirements.txt
Then install RLBench by following the installation instructions in the RLBench repository.
python rlbench_reach_td3_train.py
python rlbench_reach_td3_eval.py
Models from four training stages (the initial model, the models after 40000 and 80000 training episodes, and the final model) are uploaded, and the corresponding rendered results are recorded in the folder records.
The success rate is shown in the following figure, where each epoch corresponds to 200 training episodes.
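To locate the uploaded checkpoints on the figure's epoch axis, the episode counts can be converted with the 200-episodes-per-epoch ratio stated above. The helper name `episodes_to_epoch` is made up for this example.

```python
EPISODES_PER_EPOCH = 200  # ratio stated in the text above


def episodes_to_epoch(episodes):
    """Convert a training-episode count to the epoch index used in the figure."""
    return episodes // EPISODES_PER_EPOCH


for episodes in (40000, 80000):
    print(episodes, "episodes -> epoch", episodes_to_epoch(episodes))
# 40000 episodes -> epoch 200
# 80000 episodes -> epoch 400
```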