In this project, the agent learns how to navigate a world full of blue and yellow bananas. Its goal is to collect all the yellow bananas while avoiding the blue ones. The agent was trained with a Deep Q-Network (DQN).
- Banana World Backstory
- The agent and the environment
- Dependencies
- Getting Started
- Demo
- Running the application
The night is dark and full of bananas.
In the Banana World, there was a severe lack of food, and the only fruit that was easy to obtain was the banana. The inhabitants of this world were scientists, and as such, they ran a number of experiments on the bananas. These experiments resulted in a banana storm, flooding the world with bananas of two kinds: blue and yellow. The blue ones are not edible, while the yellow ones are.
Now, the scientists have built an agent to collect the yellow bananas, so they can eat them and use them to plant new ones. The following gif shows a sample of the Banana World.
Demo of random agent going through the environment
The agent receives a reward of +1 for collecting yellow bananas and -1 for collecting blue bananas. This is in line with the goal previously mentioned.
The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. The agent must learn how to best select its actions. Four discrete actions are available, corresponding to:
- `0` - move forward
- `1` - move backward
- `2` - move left
- `3` - move right
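Given these four discrete actions, a DQN agent typically picks an action epsilon-greedily from its Q-value estimates. The sketch below is illustrative only: `q_values` stands in for the output of the agent's Q-network, and the names are not taken from this repository.

```python
import random

# The four discrete actions listed above.
ACTIONS = [0, 1, 2, 3]  # forward, backward, left, right

def select_action(q_values, epsilon):
    """Epsilon-greedy choice: explore with probability epsilon,
    otherwise take the action with the highest estimated Q-value."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_values[a])
```

During training, epsilon usually starts near 1.0 and is annealed toward a small value so the agent shifts from exploration to exploitation.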
The task is episodic, and it is considered solved once the agent reaches an average score of +13 over 100 consecutive episodes.
This project is a requirement from the Udacity Deep Reinforcement Learning Nanodegree. The environment is provided by Udacity. It depends on the following packages:
- Python 3.6
- Numpy
- PyTorch
- Unity ML-Agents Beta v0.4
- Install Python 3.6 (newer versions are not compatible with the Unity ML-Agents version required by this environment)
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.6-full
- (Optional) Create a virtual environment for this project
cd <parent folder of venv>
python3.6 -m venv <name of the env>
source <path to venv>/bin/activate
- Install the python dependencies
python3 -m pip install numpy torch
- Download the Unity ML-Agents release file for version Beta v0.4, then unzip it in a folder of your choosing
- Build Unity ML-Agents
cd <path ml-agents>/python
python3 -m pip install .
- Clone this repository, then download the environment created by Udacity and unzip it into the world folder
git clone https://github.com/jhonasiv/banana-navigation.git
cd banana-navigation
mkdir world
wget https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Linux.zip
unzip Banana_Linux.zip -d world
This is how it looks after the agent was trained until it reached an average score of 15.0.
For more details on the implementation and the results, check out the Report.md file.
Agent trained with the average score of 15.0
- Execute the main.py file
python3 src/main.py
- For more information on the available command line arguments, use:
python3 src/main.py --help
- Some notable cli arguments:
  - `--eval`: runs the application in evaluation mode, skipping the training step; `--model_path` must be set
  - `--model_path`: path of the model to be loaded for the agent's local network, relative to the src folder
  - `--env_path`: path to the loaded Unity Environment
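For reference, flags like these are commonly wired up with `argparse`. The snippet below is only a hedged sketch of how the arguments listed above might be declared; the actual definitions in `src/main.py` may differ.

```python
import argparse

# Hypothetical reconstruction of the CLI described above.
parser = argparse.ArgumentParser(description="Banana navigation agent")
parser.add_argument("--eval", action="store_true",
                    help="run in evaluation mode, skipping training")
parser.add_argument("--model_path",
                    help="path of the model to load, relative to src")
parser.add_argument("--env_path",
                    help="path to the Unity environment executable")

# Example invocation: evaluate a previously trained model.
args = parser.parse_args(["--eval", "--model_path", "model.pt"])
```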