Two deep reinforcement learning agents learning to play tennis in a Unity ML-Agents environment.
This project is my solution to the Collaboration and Competition project of Udacity's Deep Reinforcement Learning Nanodegree.
Two agents each control a tennis racket, and their goal is to bounce a ball over the net separating them. This project uses the Tennis environment from Unity ML-Agents.
Reward function: each agent is rewarded as follows:
- +0.1 if the agent hits the ball over the net
- -0.01 if the agent lets a ball hit the ground or hits the ball out of bounds
Observation space: the observation space is continuous and consists of 8 variables corresponding to the position and velocity of the ball and racket. Each agent receives its own local observation.
Action space: two continuous actions are available, corresponding to movement toward (or away from) the net, and jumping.
The task is considered solved when the two agents achieve an average collective score of at least +0.5 over 100 consecutive episodes. The collective score is calculated as follows:
- After each episode, we add up the rewards that each agent received (without discounting), to get a score for each agent. This yields 2 (potentially different) scores. We then take the maximum of these 2 scores.
- This yields a single score for each episode.
The task is episodic.
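The scoring rule above can be sketched in a few lines of Python. This is only an illustration, not the code used in this repository: the function names `episode_score` and `is_solved` are hypothetical, and the sketch assumes that the 100-episode criterion refers to the average collective score over the most recent 100 episodes.

```python
def episode_score(rewards_per_step):
    """Collective score of one episode.

    rewards_per_step is a list of (reward_agent_0, reward_agent_1) tuples,
    one tuple per time step (hypothetical input format).
    """
    totals = [0.0, 0.0]
    for reward_0, reward_1 in rewards_per_step:
        # Undiscounted sum of the rewards collected by each agent.
        totals[0] += reward_0
        totals[1] += reward_1
    # The episode score is the maximum of the two per-agent scores.
    return max(totals)


def is_solved(episode_scores, window=100, target=0.5):
    """True if the mean of the last `window` episode scores reaches `target`."""
    if len(episode_scores) < window:
        return False
    recent = episode_scores[-window:]
    return sum(recent) / window >= target


# Agent 0 hits the ball over the net twice, agent 1 lets it drop once.
example_episode = [(0.1, 0.0), (0.0, -0.01), (0.1, 0.0)]
print(episode_score(example_episode))
```

Taking the maximum rather than the sum means that one strong agent is enough to produce a high episode score, so the criterion mostly measures how long the rally lasts.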
The code is organized mainly in the following files:
- this high-level file includes the top-level functions of this project. In addition to gluing everything together, it includes the implementation of the multi-agent training algorithm.
- this file includes the implementation of a DDPG agent.
- this file simply includes the definition of a configuration file for a training job.
- this file includes the implementation of the neural networks used for the actor and the critic.
- Report.ipynb: this file includes:
- an introduction to the environment and task
- the code that trains the agents
- the demonstration of the trained agents playing tennis
A working Python 3 environment is required. You can easily set one up by installing Anaconda.
Installation can be performed either directly in the OS or via Docker. Please note that the Docker image does not include a Jupyter server installation; therefore it only executes the training job.
Make sure that your system has Docker installed, then build the Docker image from the folder that contains the Dockerfile:
docker build --tag=deep-rl-tennis .
Run the container as follows:
docker run -v <configuration_file_in_host>:/deep-rl-tennis/config.yml:ro \
[-v <host_checkpoints_directory>:/deep-rl-tennis/checkpoints/:rw] \
[-v <host_sessions_directory>:/deep-rl-tennis/sessions/:rw] -it deep-rl-tennis
This command runs the container and mounts the configuration file that you provide in <configuration_file_in_host> as a read-only file inside it.
Additionally, if <host_checkpoints_directory> is mounted, the output of the training job is serialized in a folder that the container creates inside it.
The <host_sessions_directory> folder, if mounted, stores the temporary best result obtained during the training session.
For more information on the content of the folder, have a look at the function save_state in the code.
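If you launch training jobs from a script, the optional mounts can be assembled programmatically. The following is a minimal sketch, not part of this repository: the helper name `build_docker_run` and the host paths in the example are hypothetical, while the container paths and the image tag are the ones used in the command above.

```python
def build_docker_run(config_file, checkpoints_dir=None, sessions_dir=None,
                     image="deep-rl-tennis"):
    """Return the `docker run` argument list for a training job.

    config_file is mandatory; the two directories are optional mounts.
    """
    args = ["docker", "run",
            "-v", f"{config_file}:/deep-rl-tennis/config.yml:ro"]
    if checkpoints_dir is not None:
        # Output of the training job is written here.
        args += ["-v", f"{checkpoints_dir}:/deep-rl-tennis/checkpoints/:rw"]
    if sessions_dir is not None:
        # Temporary best result of the training session is written here.
        args += ["-v", f"{sessions_dir}:/deep-rl-tennis/sessions/:rw"]
    args += ["-it", image]
    return args


# Hypothetical host paths, shown only to illustrate the resulting command.
command = build_docker_run("/home/me/config.yml",
                           checkpoints_dir="/home/me/checkpoints")
print(" ".join(command))
# To actually start the container, pass the list to
# subprocess.run(command, check=True).
```

Returning a list instead of a single string lets `subprocess.run` pass each argument unchanged, so host paths containing spaces need no extra quoting.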
If you are using Anaconda, it is suggested to create a new environment as follows:
conda create --name tennis python=3.6
Activate the environment:
source activate tennis
Start the Jupyter server:
jupyter-notebook --no-browser --ip <ip_address> --port 8888 --port-retries=0
Download the environment from one of the links below. You need only select the environment that matches your operating system:
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): click here
(For Windows users) Check out this link if you need help with determining if your computer is running a 32-bit version or 64-bit version of the Windows operating system.
(For AWS) If you'd like to train the agent on AWS (and have not enabled a virtual screen), then please use this link to obtain the "headless" version of the environment. You will not be able to watch the agent without enabling a virtual screen, but you will be able to train the agent. (To watch the agent, you should follow the instructions to enable a virtual screen, and then download the environment for the Linux operating system above.)
Decompress the archive at your preferred location (e.g. in this repository working copy).
Open the Report.ipynb notebook.
Write the path to the pre-compiled Unity environment executable as indicated in the notebook.
Follow the instructions in
to get an environment introduction and to see my proposed solution to the task.