This repository contains a PyTorch implementation of Reward Prediction Error Prioritisation Experience Replay (RPE-PER), integrated into two off-policy RL algorithms: TD3 and SAC. The algorithms are evaluated on the MuJoCo continuous control suite.
| Library | Version |
|---|---|
| pydantic | 1.10.10 |
| MuJoCo | 2.3.3 |
To train RPE-PER with TD3 or SAC, run one of the following commands:

```
python3 training_loop_SAC.py
# or
python3 training_loop_TD3.py
```
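For intuition, below is a minimal sketch of how reward-prediction-error-based prioritisation could work inside a replay buffer. This is an illustrative assumption, not the repository's actual implementation: the class name, the `alpha` exponent, and the use of a reward-model error as the priority signal are all hypothetical stand-ins for whatever the paper specifies.

```python
# Hypothetical sketch: a replay buffer that prioritises transitions by
# reward prediction error (RPE), i.e. |predicted reward - observed reward|.
# Names and hyperparameters are illustrative, not the repository's API.
import numpy as np


class RPEPrioritisedBuffer:
    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha          # how strongly RPE shapes sampling
        self.eps = eps              # keeps every priority non-zero
        self.storage = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition, predicted_reward, actual_reward):
        # Priority is derived from the reward prediction error.
        rpe = abs(predicted_reward - actual_reward) + self.eps
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            self.storage[self.pos] = transition
        self.priorities[self.pos] = rpe ** self.alpha
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, rng=None):
        rng = rng or np.random.default_rng()
        p = self.priorities[: len(self.storage)]
        probs = p / p.sum()
        idx = rng.choice(len(self.storage), size=batch_size, p=probs)
        return [self.storage[i] for i in idx], idx
```

Transitions whose rewards the model predicts poorly receive higher sampling probability, so the critic trains more often on experiences the reward model finds surprising.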
Please cite the paper and this GitHub repository if you use this code.