Bandit algorithms.

For a Reinforcement Learning class, I implemented several popular algorithms for the multi-armed bandit problem:

Epsilon-greedy bandit
BESA
Softmax
UCB1
Thompson sampling
KL-UCB
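
To give an idea of what such agents look like, here is a minimal sketch of three of them (epsilon-greedy, UCB1 and Thompson sampling) on a Bernoulli bandit. It is not the code from agent.py: the class names, the act/update interface, the run helper and the arm probabilities are assumptions made for illustration only.

```python
import math
import random


class EpsilonGreedy:
    """Hypothetical agent: explores a random arm with probability epsilon."""

    def __init__(self, n_arms, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = rng or random.Random()

    def act(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # Greedy choice; ties go to the lowest arm index.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


class UCB1:
    """Hypothetical agent: empirical mean plus sqrt(2 ln t / n) bonus."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms
        self.t = 0  # total number of pulls so far

    def act(self):
        # Pull every arm once before using the confidence bound.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        log_t = math.log(self.t)
        return max(
            range(len(self.counts)),
            key=lambda a: self.values[a] + math.sqrt(2.0 * log_t / self.counts[a]),
        )

    def update(self, arm, reward):
        self.t += 1
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


class ThompsonBernoulli:
    """Hypothetical agent: Beta(1, 1) prior on each arm's success rate."""

    def __init__(self, n_arms, rng=None):
        self.successes = [0] * n_arms
        self.failures = [0] * n_arms
        self.rng = rng or random.Random()

    def act(self):
        # Sample one plausible success rate per arm, play the best sample.
        samples = [
            self.rng.betavariate(s + 1, f + 1)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        if reward > 0:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


def run(agent, probs, n_rounds, rng):
    """Play n_rounds on a Bernoulli bandit; return the total reward."""
    total = 0.0
    for _ in range(n_rounds):
        arm = agent.act()
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        agent.update(arm, reward)
        total += reward
    return total
```

For example, run(UCB1(2), [0.1, 0.9], 2000, random.Random(0)) plays 2000 rounds on a two-armed bandit. All three agents share the same act/update interface, so they can be swapped without changing the loop.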

The bandit agents are implemented in agent.py.

How to use

For the class, each agent was tested on a specific configuration, 1000 rounds for 2000 agents in parallel:

python main.py --niter 1000 --batch 2000

Run python main.py -h to see the available options.
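
To illustrate what "1000 rounds for 2000 agents in parallel" could mean in practice, here is a rough sketch that parses the two flags and averages the reward of many independent agents at each round. It is not the actual main.py or runner.py: the function names, the default values, the epsilon-greedy agent and the arm probabilities are all assumptions, and only the flag names --niter and --batch come from the command above.

```python
import argparse
import random


def parse_args(argv=None):
    """Hypothetical parser mirroring the two flags shown in the README."""
    parser = argparse.ArgumentParser(description="Bandit experiment (sketch)")
    parser.add_argument("--niter", type=int, default=1000,
                        help="number of rounds played by each agent")
    parser.add_argument("--batch", type=int, default=2000,
                        help="number of independent agents")
    return parser.parse_args(argv)


def average_reward_per_round(niter, batch, probs, epsilon=0.1, seed=0):
    """Run `batch` independent epsilon-greedy agents on a Bernoulli bandit.

    Returns a list of length `niter` holding, for each round, the reward
    averaged over all agents.
    """
    rng = random.Random(seed)
    n_arms = len(probs)
    counts = [[0] * n_arms for _ in range(batch)]
    values = [[0.0] * n_arms for _ in range(batch)]  # running means
    curve = []
    for _ in range(niter):
        total = 0.0
        for b in range(batch):
            if rng.random() < epsilon:
                arm = rng.randrange(n_arms)  # explore
            else:
                arm = max(range(n_arms), key=values[b].__getitem__)  # exploit
            reward = 1.0 if rng.random() < probs[arm] else 0.0
            counts[b][arm] += 1
            values[b][arm] += (reward - values[b][arm]) / counts[b][arm]
            total += reward
        curve.append(total / batch)
    return curve


def main(argv=None):
    args = parse_args(argv)
    # The arm probabilities below are made up for the example.
    curve = average_reward_per_round(args.niter, args.batch, [0.2, 0.5, 0.8])
    print(f"mean reward, first round: {curve[0]:.3f}, last round: {curve[-1]:.3f}")
```

Calling main(["--niter", "1000", "--batch", "2000"]) would reproduce the configuration above with this toy setup. Averaging over many agents smooths out the randomness of a single run, which is presumably why the class used a large batch.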

wesbz/BanditAgents
