Commit

Initial commit
Pascal Klink committed Oct 18, 2019
0 parents commit 354c6e9
Showing 111 changed files with 5,425 additions and 0 deletions.
9 changes: 9 additions & 0 deletions .gitignore
@@ -0,0 +1,9 @@
.idea
.DS_Store
logs
plot_logs
plots
venv
*/__pycache__
*/*/__pycache__
*/*/*/__pycache__
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2019 Pascal Klink

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
63 changes: 63 additions & 0 deletions README.md
@@ -0,0 +1,63 @@
# Self Paced Contextual Reinforcement Learning
Implementation of the Self Paced Contextual Reinforcement Learning Experiments

## Installation

It is easiest to set up a virtual environment in order to install the required site-packages without modifying your global Python installation. We are using Python 3 (to be precise, 3.5.2 on Ubuntu 16.04.5 LTS). Assuming the code from this repository is in [DIR], the following commands set up the virtualenv and install the required packages:

```bash
cd [DIR]
python3 -m venv env
source env/bin/activate
pip3 install -r requirements.txt
```

If you want to run the experiments from the "reacher" or "reacher-obstacle" environment, you will need MuJoCo. If you have MuJoCo installed, make sure that you placed the corresponding binary and license key in the `~/.mujoco/` directory as described [here](https://github.com/openai/mujoco-py) (you may need to create the directory). This is necessary because the mujoco-py package (which allows using MuJoCo from Python) relies on MuJoCo being located in this specific directory. Once everything is set up, run the following command (we assume that you still have the virtualenv activated):

```bash
pip3 install -r requirements_ext.txt
```

This will install OpenAI Gym, mujoco-py, stable-baselines and TensorFlow in the required versions.
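Before installing mujoco-py, it can help to check that the MuJoCo directory is in place. The following is a minimal sketch, not part of this repository: the helper name `missing_mujoco_entries` is hypothetical, and the entry names `mujoco200` and `mjkey.txt` are assumptions based on the layout that the mujoco-py README describes for the 2.0 series, so adjust them to your installation.

```python
from pathlib import Path


def missing_mujoco_entries(base=None, expected=("mujoco200", "mjkey.txt")):
    """Return the expected entries that are absent from the MuJoCo directory.

    `expected` is an assumption about the directory layout; change it if your
    MuJoCo version uses different names.
    """
    base = Path(base) if base is not None else Path.home() / ".mujoco"
    return [name for name in expected if not (base / name).exists()]


if __name__ == "__main__":
    missing = missing_mujoco_entries()
    if missing:
        print("Missing in ~/.mujoco:", ", ".join(missing))
    else:
        print("MuJoCo directory looks complete.")
```

An empty list means that all expected entries were found; it does not verify that the license key itself is valid.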

## Usage

The experiments in the Gate Environment can be run with the following commands:

```bash
python3 run_experiment.py --n_cores 10 --environment gate --setting precision --n_experiments 40 --algorithm sprl
python3 run_experiment.py --n_cores 10 --environment gate --setting precision --n_experiments 40 --algorithm creps
python3 run_experiment.py --n_cores 10 --environment gate --setting precision --n_experiments 40 --algorithm cmaes
python3 run_experiment.py --n_cores 10 --environment gate --setting precision --n_experiments 40 --algorithm goalgan
python3 run_experiment.py --n_cores 10 --environment gate --setting precision --n_experiments 40 --algorithm saggriac
python3 run_experiment.py --n_cores 10 --environment gate --setting global --n_experiments 40 --algorithm sprl
python3 run_experiment.py --n_cores 10 --environment gate --setting global --n_experiments 40 --algorithm creps
python3 run_experiment.py --n_cores 10 --environment gate --setting global --n_experiments 40 --algorithm goalgan
python3 run_experiment.py --n_cores 10 --environment gate --setting global --n_experiments 40 --algorithm saggriac
python3 visualize_results.py --n_cores 10 --environment gate --setting precision --add_cmaes --add_goalgan --add_saggriac
python3 visualize_results.py --n_cores 10 --environment gate --setting global --add_goalgan --add_saggriac
```

The first five commands create the experimental data in the "precision" setting, one command per algorithm. The next four commands do the same for the "global" setting (no `cmaes` run is listed for this setting). Finally, the last two commands visualize the results in both settings. Note that the visualization recomputes the performance, so it takes a bit of time when it is first run (though not as much as the experiments). The computed data is stored to disk, so subsequent executions of the `visualize_results.py` script will render the data right away.
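Since the `run_experiment.py` invocations for the Gate Environment follow a regular pattern, they can also be generated from a small driver script. This is only a sketch: the function `build_gate_commands` and the table `GATE_ALGORITHMS` are hypothetical helpers that are not part of this repository, and they use only the flags and values shown in the commands above.

```python
import subprocess
import sys

# Algorithms run per setting, copied from the command list in this README.
# A list of tuples keeps the order stable on older Python versions.
GATE_ALGORITHMS = [
    ("precision", ["sprl", "creps", "cmaes", "goalgan", "saggriac"]),
    ("global", ["sprl", "creps", "goalgan", "saggriac"]),
]


def build_gate_commands(n_cores=10, n_experiments=40):
    """Build the argument lists for all Gate Environment experiment runs."""
    commands = []
    for setting, algorithms in GATE_ALGORITHMS:
        for algorithm in algorithms:
            commands.append([
                "python3", "run_experiment.py",
                "--n_cores", str(n_cores),
                "--environment", "gate",
                "--setting", setting,
                "--n_experiments", str(n_experiments),
                "--algorithm", algorithm,
            ])
    return commands


if __name__ == "__main__":
    for command in build_gate_commands():
        print(" ".join(command))
        # Only launch the (long-running) experiments when asked explicitly.
        if "--run" in sys.argv:
            subprocess.run(command, check=True)
```

Without arguments the script only prints the commands; passing the hypothetical `--run` switch executes them one after another from the repository directory.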

Although we create 10 subprocesses, our machine only had a quad-core processor (Core i7-7700), so it is not necessary to have 10 physical cores to run the script without problems. You may nonetheless change the number of cores as desired. Note, however, that this also changes the seeds in the created subprocesses and hence will slightly alter the results.
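To illustrate why the number of subprocesses can influence the random numbers an experiment sees, here is a toy sketch. The seeding scheme in it is purely an assumption made for illustration (each worker seeds its generator with its own index and processes every `n_cores`-th experiment); it is not taken from `run_experiment.py`, whose actual scheme may differ.

```python
import random


def experiment_draws(n_experiments, n_cores):
    """Map each experiment index to the first random number it would receive.

    Hypothetical scheme: worker `rank` seeds its generator with `rank` and
    runs experiments rank, rank + n_cores, rank + 2 * n_cores, ... in order.
    """
    draws = {}
    for rank in range(n_cores):
        rng = random.Random(rank)
        for experiment in range(rank, n_experiments, n_cores):
            draws[experiment] = rng.random()
    return draws


if __name__ == "__main__":
    ten = experiment_draws(40, 10)
    four = experiment_draws(40, 4)
    changed = [i for i in range(40) if ten[i] != four[i]]
    print("Experiments with different draws:", len(changed), "of 40")
```

Under this assumed scheme, repeating a run with the same number of cores reproduces the same draws, while changing the number of cores reassigns experiments to differently seeded workers.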

To run the experiments in the other environments, you can use "--environment reacher-obstacle" or "--environment ball-in-a-cup". In this case, you do not need to set a "--setting" option. For example, to run the experiments in the modified reacher environment, you would run:

```bash
python3 run_experiment.py --n_cores 10 --environment reacher-obstacle --n_experiments 40 --algorithm sprl
python3 run_experiment.py --n_cores 10 --environment reacher-obstacle --n_experiments 40 --algorithm creps
python3 run_experiment.py --n_cores 10 --environment reacher-obstacle --n_experiments 40 --algorithm cmaes
python3 run_experiment.py --n_cores 10 --environment reacher-obstacle --n_experiments 40 --algorithm goalgan
python3 run_experiment.py --n_cores 10 --environment reacher-obstacle --n_experiments 40 --algorithm saggriac
python3 visualize_results.py --n_cores 10 --environment reacher-obstacle --add_cmaes --add_goalgan --add_saggriac
```

Please note that this took multiple days in our lab setting, since the corresponding MuJoCo simulation is somewhat expensive.

You can see all the additional flags of the `run_experiment.py` and `visualize_results.py` scripts via:

```bash
python3 run_experiment.py -h
python3 visualize_results.py -h
```
Binary file added reacher-obstacle-background.png
7 changes: 7 additions & 0 deletions requirements.txt
@@ -0,0 +1,7 @@
matplotlib==3.0.2
matplotlib2tikz==0.6.18
numpy==1.16.4
scikit-learn==0.20.2
scipy==1.1.0
cma==2.7.0
mpi4py
4 changes: 4 additions & 0 deletions requirements_ext.txt
@@ -0,0 +1,4 @@
gym==0.10.8
mujoco-py==2.0.2.2
stable-baselines==2.5.1
tensorflow==1.13.1