JunhongXu/pytorch-a2c-minecraft
PyTorch version of Parallel Actor Critic

An attempt to solve the MinecraftBasic-v0 task using A2C (Advantage Actor-Critic). The Gym environment code is copied and modified from https://github.com/openai/baselines.
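To illustrate the core of the A2C update (this is a minimal sketch, not the repository's actual code), the n-step discounted returns and advantages used to scale the policy gradient can be computed like this:

```python
def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """Compute n-step discounted returns R_t = r_t + gamma * R_{t+1},
    bootstrapping from the critic's value estimate of the final state."""
    returns = []
    R = bootstrap_value
    for r in reversed(rewards):
        R = r + gamma * R
        returns.append(R)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage A_t = R_t - V(s_t), used to weight the policy gradient."""
    return [R - v for R, v in zip(returns, values)]
```

In the real training loop these quantities are tensors batched over all parallel worker processes; the policy loss is `-log pi(a_t|s_t) * A_t` and the value loss is the squared error `(R_t - V(s_t))^2`.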

Requirements

  1. gym: https://github.com/openai/gym
  2. minecraft-py: https://github.com/tambetm/minecraft-py
  3. gym_minecraft: https://github.com/tambetm/gym-minecraft
  4. pytorch: https://github.com/pytorch/pytorch

Python 2.7 is preferred when using minecraft-py.

Run

  1. Minecraft environment: python run_minecraft.py

  2. LunarLander environment: python run_lunarlander.py

Parameters such as the number of processes, the discount factor gamma, and the learning rate can be adjusted.
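A typical way to expose such parameters is via command-line flags. The flag names below are hypothetical (check `run_minecraft.py` for the real ones); this is only a sketch of the pattern:

```python
import argparse

# Hypothetical hyperparameter flags; the actual scripts may use different names.
parser = argparse.ArgumentParser(description="A2C training options (sketch)")
parser.add_argument("--num-processes", type=int, default=16,
                    help="number of parallel environment processes")
parser.add_argument("--gamma", type=float, default=0.99,
                    help="discount factor for n-step returns")
parser.add_argument("--lr", type=float, default=7e-4,
                    help="optimizer learning rate")

# Example: override two of the defaults.
args = parser.parse_args(["--num-processes", "8", "--gamma", "0.95"])
print(args.num_processes, args.gamma, args.lr)
```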

Results

  1. Minecraft:


Recorded results from episodes 0 (failed), 400, 800, 1200, 1600, 2000, 2400, 2800, 3200, and 3600. The agent from the last episode navigates to the goal quickly, but the result is still not ideal: the agent needs more than 20 steps to reach the goal. After the second iteration, the agent quickly settled on a sub-optimal policy.

Averaged rewards: (plot)

  2. LunarLander: in progress.

Issues:

Unlike Atari and most other OpenAI Gym environments, Minecraft does not pause and wait for the agent to execute an action. As a result, the agent misses many observations when too many processes are opened, which degrades performance, as discussed in tambetm/gym-minecraft#3.

As a workaround, the tick duration can be set to a higher value in the task definition, slowing the environment down.
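For example, assuming the task is defined via a standard Malmo mission XML file, the tick length can be raised in the `ModSettings` element (a sketch; the exact file and element for gym-minecraft tasks may differ):

```xml
<ModSettings>
    <!-- Malmo's default is 50 ms per tick; a larger value slows the world
         down, giving the agent more time to receive observations and act. -->
    <MsPerTick>150</MsPerTick>
</ModSettings>
```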
