MuZero

A TensorFlow implementation of DeepMind's MuZero algorithm, which learns to play games without any knowledge of their rules. The implementation follows the original paper and its pseudocode. It supports prioritized replay and is parallelized with Ray. The repo structure is based on muzero-pytorch.
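To illustrate what prioritized replay means here, the sketch below samples replay-buffer indices in proportion to their priorities and computes the matching importance-sampling weights. It is a minimal standard-library example and not this repo's code: the function names, the `alpha` and `beta` parameters, and the max-weight normalization are assumptions chosen for illustration.

```python
import random

def sampling_probabilities(priorities, alpha=1.0):
    """Return P(i) = p_i^alpha / sum_k p_k^alpha for strictly positive priorities."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]

def importance_weights(probabilities, beta=1.0):
    """Return w_i = (1 / (N * P(i)))^beta, rescaled so the largest weight is 1.

    Rescaling by the maximum is a common stabilization choice and is an
    assumption here, not something taken from this repo.
    """
    n = len(probabilities)
    weights = [(1.0 / (n * p)) ** beta for p in probabilities]
    max_weight = max(weights)
    return [w / max_weight for w in weights]

def sample_batch(priorities, batch_size, alpha=1.0, beta=1.0, rng=random):
    """Sample indices with replacement, proportionally to priority.

    Returns the sampled indices and the importance weight of each sample,
    which would multiply the per-sample loss to correct the sampling bias.
    """
    probs = sampling_probabilities(priorities, alpha)
    weights = importance_weights(probs, beta)
    indices = rng.choices(range(len(priorities)), weights=probs, k=batch_size)
    return indices, [weights[i] for i in indices]

if __name__ == "__main__":
    # Hypothetical priorities, e.g. |search value - observed return| per position.
    example_priorities = [0.1, 2.0, 0.5, 1.4]
    batch_indices, batch_weights = sample_batch(example_priorities, batch_size=3)
    print(batch_indices, batch_weights)
```

Setting `alpha=0` recovers uniform sampling, and `beta=0` turns the importance correction off, so the two parameters can be tuned independently.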

Train: python main.py --mode train --env CartPole-v1 --force

Test: python main.py --mode test --env CartPole-v1 --force

TensorBoard: tensorboard --logdir=result_dir

At the moment, the code has only been tested on simple OpenAI Gym environments such as CartPole. Results are fairly sensitive to the choice of hyperparameters.