alexandrulita91/multi-armed-bandit


Multi-armed bandit

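The multi-armed bandit problem is a classical reinforcement learning problem that captures the tension between exploration and exploitation: an agent repeatedly chooses among several actions (arms) with unknown reward distributions and must decide whether to try arms it knows little about or to keep pulling the arm that has paid best so far.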
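To make the setting concrete, here is a minimal sketch of a bandit environment. It is not taken from this repository: the class `BernoulliBandit`, the function `run_uniform_random`, and the arm probabilities are all hypothetical names and values chosen for illustration, and the sketch assumes each arm pays a reward of 1 with a fixed, hidden probability and 0 otherwise.

```python
import random


class BernoulliBandit:
    """Hypothetical K-armed bandit: each arm pays 1 with a fixed, hidden probability."""

    def __init__(self, probs, seed=0):
        self.probs = list(probs)         # true success rates (unknown to the agent)
        self.rng = random.Random(seed)   # private RNG so runs are reproducible

    @property
    def n_arms(self):
        return len(self.probs)

    def pull(self, arm):
        """Return a reward of 1 or 0 for the chosen arm."""
        return 1 if self.rng.random() < self.probs[arm] else 0


def run_uniform_random(bandit, n_rounds, seed=1):
    """Pure exploration baseline: pick an arm uniformly at random every round."""
    rng = random.Random(seed)
    return sum(bandit.pull(rng.randrange(bandit.n_arms)) for _ in range(n_rounds))


if __name__ == "__main__":
    bandit = BernoulliBandit([0.2, 0.5, 0.8])  # illustrative probabilities
    print("reward from random play:", run_uniform_random(bandit, 1000))
```

A policy that only explores, like the one above, earns on average the mean of the arm probabilities per round, while always pulling the best arm would earn the maximum. The gap between the two is what a good bandit algorithm tries to close.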

Thompson sampling

Thompson sampling is an algorithm for online decision problems in which actions are taken sequentially. Each action must balance exploiting what is already known, to maximize immediate performance, against gathering new information that may improve future performance.
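The idea can be sketched for the common Beta-Bernoulli case. This is an illustrative sketch, not the code in `ts_bandits.py`: the function name `thompson_sampling`, its parameters, and the arm probabilities are assumptions, and rewards are assumed to be 0 or 1. Each arm keeps a Beta posterior over its success rate; every round the agent draws one sample per arm and pulls the arm with the highest sample.

```python
import random


def thompson_sampling(true_probs, n_rounds=2000, seed=0):
    """Beta-Bernoulli Thompson sampling on simulated arms (illustrative sketch).

    true_probs are hypothetical success rates, hidden from the agent.
    Returns (pull counts per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) prior, i.e. uniform, on each arm's success rate.
    alpha = [1] * n_arms  # 1 + observed successes
    beta = [1] * n_arms   # 1 + observed failures
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled rates.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Simulated Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Posterior update: a success increments alpha, a failure increments beta.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling([0.2, 0.5, 0.8])  # illustrative probabilities
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

Arms with few observations have wide posteriors, so they occasionally produce the highest sample and get explored. As evidence accumulates, the posteriors narrow and the agent concentrates its pulls on the arm that appears best.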

More details can be found in this paper.

Demo video

https://www.youtube.com/watch?v=I0XmHQJPaVM

Requirements

How to install the packages

You can install the required Python packages using the following command:

  • pipenv sync

How to train the agent

You can train the agent using the following command:

  • pipenv run python ts_bandits.py
