Implementation of HIRO: Data-Efficient Hierarchical Reinforcement Learning

This repository implements the HIRO algorithm for hierarchical reinforcement learning on the original AntMaze environment, as presented by Nachum et al. in "Data-Efficient Hierarchical Reinforcement Learning" (2018).

Dependencies

gym==0.16.0
mujoco-py==1.50.1.68
tensorflow==2.0
wandb==0.8.29
omegaconf==1.4.1
numpy==1.18.1

Usage

$ python3 main.py ant_config

This loads the settings in experiments/ant_config.yaml, which trains the agent for 1.5 million steps. Every 20,000 timesteps, 10 evaluation episodes are played with exploration noise turned off. The agent's performance is recorded and the model parameters are saved. To load the saved model and render the environment, run:

$ python3 main.py ant_render

I use OmegaConf to load the different configurations. The default settings are kept in configs/ant_default, while configs for specific experiments are saved in experiments/. I use wandb to log and analyse data from different runs.
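As a rough illustration of the layered configuration described above, here is a minimal sketch in which an experiment file overrides only the keys it mentions and everything else falls back to the defaults. It uses plain dictionaries and a hand-written recursive merge as a stand-in for OmegaConf, so it is not the code in this repository. The key names (main, max_timesteps, eval_freq, eval_episodes) and the default value of 1,000,000 steps are hypothetical and chosen only for the example. It also assumes evaluation runs at every multiple of the evaluation frequency, including the final step.

```python
import copy


def merge_configs(default, override):
    """Recursively merge `override` into a copy of `default`; override wins."""
    merged = copy.deepcopy(default)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge them key by key.
            merged[key] = merge_configs(merged[key], value)
        else:
            # Leaf value (or type mismatch): the override replaces the default.
            merged[key] = copy.deepcopy(value)
    return merged


def evaluation_steps(max_timesteps, eval_freq):
    """Timesteps at which an evaluation phase would start (assumed schedule)."""
    return list(range(eval_freq, max_timesteps + 1, eval_freq))


# Hypothetical stand-in for the default settings (key names are assumptions).
default_cfg = {
    "main": {
        "max_timesteps": 1_000_000,
        "eval_freq": 20_000,
        "eval_episodes": 10,
    }
}

# Hypothetical stand-in for an experiment file: only the overridden key appears.
experiment_cfg = {"main": {"max_timesteps": 1_500_000}}

cfg = merge_configs(default_cfg, experiment_cfg)
steps = evaluation_steps(cfg["main"]["max_timesteps"], cfg["main"]["eval_freq"])
total_eval_episodes = len(steps) * cfg["main"]["eval_episodes"]

print(cfg["main"])
print(len(steps), "evaluation phases,", total_eval_episodes, "evaluation episodes")
```

With the numbers from the Usage section (1.5 million steps, evaluation every 20,000 timesteps, 10 episodes each), this schedule gives 75 evaluation phases and 750 evaluation episodes in total. Keeping experiment files down to the overridden keys makes it easy to see what differs between runs.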

P-Schumacher/ant_repo
