MuZero

A TensorFlow implementation of DeepMind's MuZero algorithm, which learns to play games without any prior knowledge of their rules. The implementation follows the original paper and its pseudocode. It supports prioritized replay and is parallelized with Ray. The repo structure is based on muzero-pytorch.
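To illustrate what prioritized replay means here, the following is a minimal NumPy sketch of priority-proportional sampling with importance-sampling weights. It is not taken from this repository: the function name `sample_prioritized`, the `alpha` and `beta` defaults, and the normalization of weights by their maximum are all assumptions chosen for illustration, and priorities are assumed to be strictly positive.

```python
import numpy as np


def sample_prioritized(priorities, batch_size, alpha=1.0, beta=1.0, rng=None):
    """Sample replay indices in proportion to priority ** alpha.

    Hypothetical helper for illustration only; not this repo's API.
    Assumes every priority is strictly positive.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Turn raw priorities into a sampling distribution P(i).
    scaled = np.asarray(priorities, dtype=np.float64) ** alpha
    probs = scaled / scaled.sum()

    # Draw indices with replacement according to P(i).
    indices = rng.choice(len(probs), size=batch_size, p=probs)

    # Importance-sampling weights (N * P(i)) ** -beta correct the bias
    # introduced by non-uniform sampling.
    weights = (len(probs) * probs[indices]) ** (-beta)

    # Assumed convention: rescale so the largest weight in the batch is 1.
    weights /= weights.max()
    return indices, weights


# Example: positions with larger priority are drawn more often.
indices, weights = sample_prioritized([0.1, 0.5, 2.0, 1.0], batch_size=4)
```

The returned weights would typically multiply the per-sample loss so that frequently sampled positions do not dominate the gradient.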

Train: python main.py --mode train --env CartPole-v1 --force

Test: python main.py --mode test --env CartPole-v1 --force

TensorBoard: tensorboard --logdir=result_dir

At the moment, the code has only been tested on simple OpenAI Gym environments such as CartPole. Results are fairly sensitive to hyperparameter choices.
