Bandit algorithms.

For a Reinforcement Learning class, I implemented several popular algorithms for the multi-armed bandit problem, including:

Epsilon-greedy bandit
BESA
Softmax
UCB1
Thompson sampling
KL-UCB
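
To give an idea of what one of these agents does, here is a minimal sketch of UCB1 on a Bernoulli bandit. It is an illustration only, not the code in agent.py: the function name ucb1_run, the arm means, and the seed are all assumptions made for this example. Each round, the agent pulls the arm with the highest empirical mean plus an exploration bonus of sqrt(2 ln t / n_i), where n_i is the number of times arm i has been pulled so far.

```python
import math
import random


def ucb1_run(arm_means, n_rounds, seed=0):
    """Play a Bernoulli bandit with UCB1 and return the pull count per arm.

    Hypothetical helper written for this README; it does not mirror agent.py.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of times each arm was pulled
    sums = [0.0] * n_arms   # total reward collected from each arm

    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            # Pull every arm once so that each count is non-zero.
            arm = t - 1
        else:
            # Empirical mean plus the UCB1 exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # Bernoulli reward: 1 with probability arm_means[arm], else 0.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


# Assumed example values: three arms, 1000 rounds.
counts = ucb1_run([0.2, 0.5, 0.8], n_rounds=1000)
print(counts)
```

The returned list shows how often each arm was chosen; over enough rounds the pulls should concentrate on the arm with the highest mean.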

The bandit agents are implemented in agent.py.

How to use?

For the class, each agent was tested on one specific configuration, 1000 rounds with 2000 agents run in parallel:

    python main.py --niter 1000 --batch 2000

Run python main.py -h to see the available options.
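
The sketch below illustrates what such a configuration measures: many independent agents play the same bandit, and their rewards are averaged round by round. It is an assumption-laden example rather than what main.py does. The function names, the epsilon value, and the arm means are hypothetical, the agents run one after another instead of in parallel, and the sizes are kept small so it finishes quickly.

```python
import random


def epsilon_greedy_rewards(arm_means, n_rounds, epsilon, rng):
    """Run one epsilon-greedy agent and return its reward at every round."""
    n_arms = len(arm_means)
    counts = [0] * n_arms
    values = [0.0] * n_arms   # running mean reward per arm
    rewards = []
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=lambda i: values[i])  # exploit
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]    # incremental mean
        rewards.append(reward)
    return rewards


def average_over_batch(arm_means, niter, batch, epsilon=0.1, seed=0):
    """Average the per-round reward over `batch` independent agents.

    Hypothetical helper: niter and batch only echo the CLI option names.
    """
    rng = random.Random(seed)
    totals = [0.0] * niter
    for _ in range(batch):
        run = epsilon_greedy_rewards(arm_means, niter, epsilon, rng)
        for t, reward in enumerate(run):
            totals[t] += reward
    return [total / batch for total in totals]


# Assumed example values, much smaller than the class configuration.
curve = average_over_batch([0.2, 0.5, 0.8], niter=200, batch=50)
print(len(curve))  # one averaged reward per round
```

Averaging over a batch smooths out the randomness of a single run, which makes learning curves of different agents easier to compare.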
