This repository aims to implement a broad range of Deep Reinforcement Learning concepts, covering most of the well-known resources, from textbooks to lectures. For each concept, concise notes explain the idea, and the associated algorithms are implemented together with their environments and peripheral modules. Key Reinforcement Learning papers and other worthwhile resources are cited at the end of this README.
- Pseudocode and Algorithms
- Implementations and Environments
- Relevant Resources
- Key Papers
- Contribution
Tabular Methods
- Bandit Problem
- Dynamic Programming
- Monte Carlo Methods
- Temporal-Difference Learning (see the Q-learning sketch after this list)
- n-step Bootstrapping
- Planning and Learning
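As a companion to the temporal-difference entry above, here is a minimal, hedged Q-learning sketch on a toy 1-D chain. The chain, reward scheme, and hyperparameters are illustrative placeholders and are not taken from the implementations in this repository.
```python
import numpy as np

# Minimal tabular Q-learning sketch on a toy 1-D chain (illustrative only; the
# chain size and hyperparameters below are placeholders, not this repo's settings).

N_STATES = 7                  # states 0..6; 0 and 6 are terminal, reward +1 only at state 6
GOAL = N_STATES - 1
ACTIONS = (-1, +1)            # move left or right

def step(state, action_idx):
    """Apply an action and return (next_state, reward, done)."""
    next_state = state + ACTIONS[action_idx]
    if next_state <= 0:
        return 0, 0.0, True
    if next_state >= GOAL:
        return GOAL, 1.0, True
    return next_state, 0.0, False

def q_learning(episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(episodes):
        state, done = N_STATES // 2, False
        while not done:
            # epsilon-greedy behaviour policy (ties broken at random)
            if rng.random() < epsilon:
                action = int(rng.integers(len(ACTIONS)))
            else:
                action = int(rng.choice(np.flatnonzero(Q[state] == Q[state].max())))
            next_state, reward, done = step(state, action)
            # off-policy TD target: bootstrap from the greedy value of the next state
            target = reward + (0.0 if done else gamma * float(Q[next_state].max()))
            Q[state, action] += alpha * (target - Q[state, action])
            state = next_state
    return Q

if __name__ == "__main__":
    print(np.round(q_learning(), 3))
```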
Approximate Solution Methods
- On-policy Prediction With Approximation
- Gradient Monte Carlo
- Semi-Gradient TD(0) (see the prediction sketch after this list)
- On-policy Control With Approximation
- Semi-Gradient SARSA
- Semi-Gradient n-step SARSA
- Off-policy Control With Approximation
- Eligibility Traces
- Policy Gradient Methods
- REINFORCE
- one-step Actor-Critic
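The sketch below illustrates the semi-gradient TD(0) entry above: state-value prediction with state aggregation on a simple random walk. The chain size, number of groups, and step size are assumed placeholders, not this repository's settings.
```python
import numpy as np

# Semi-gradient TD(0) prediction with state aggregation on a simple random walk.
# Illustrative sketch only; sizes and the step size below are placeholders.

N_STATES = 100            # non-terminal states 1..100
N_GROUPS = 10             # state aggregation: 10 states per group (one weight per group)
STEP = 10                 # each move jumps up to 10 states left or right, uniformly

def group(state):
    """Map a state (1-based) to its aggregation group / feature index."""
    return (state - 1) // (N_STATES // N_GROUPS)

def semi_gradient_td0(episodes=2000, alpha=2e-2, gamma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(N_GROUPS)                     # v_hat(s) = w[group(s)]
    for _ in range(episodes):
        state = N_STATES // 2
        while True:
            jump = int(rng.integers(1, STEP + 1)) * (1 if rng.random() < 0.5 else -1)
            next_state = state + jump
            if next_state < 1:                 # terminate on the left with reward -1
                reward, v_next, done = -1.0, 0.0, True
            elif next_state > N_STATES:        # terminate on the right with reward +1
                reward, v_next, done = 1.0, 0.0, True
            else:
                reward, v_next, done = 0.0, w[group(next_state)], False
            # semi-gradient update: the gradient of v_hat w.r.t. w is the one-hot group feature
            td_error = reward + gamma * v_next - w[group(state)]
            w[group(state)] += alpha * td_error
            if done:
                break
            state = next_state
    return w

if __name__ == "__main__":
    print(np.round(semi_gradient_td0(), 3))
```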
Deep Reinforcement Learning Methods
- Value-Based Methods
- Neural Fitted Q-function (NFQ)
- DQN
- DDQN (see the target-computation sketch after this list)
- Dueling DDQN
- PER
- C51
- QR-DQN
- HER
- Policy-Based Methods
- REINFORCE
- VPG
- PPO
- TRPO
- Stochastic Actor-Critic Methods
- A2C
- A3C
- GAE
- ACKTR
- Deterministic Actor-Critic Methods
- Deep Deterministic Policy Gradient (DDPG)
- TD3
- SAC
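For the value-based entries above (DQN/DDQN), here is a hedged PyTorch sketch of how the vanilla DQN and Double DQN bootstrap targets differ. It assumes PyTorch is available; the network sizes, the random batch, and the hyperparameters are placeholders, not the configuration used in this repository.
```python
import torch
import torch.nn as nn

# Hedged sketch of the DQN vs. Double DQN bootstrap targets (network sizes and
# batch contents are made-up placeholders, not this repository's configuration).

def make_q_net(obs_dim=4, n_actions=2, hidden=64):
    return nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_actions))

@torch.no_grad()
def td_targets(online_net, target_net, rewards, next_obs, dones, gamma=0.99, double=True):
    """Compute r + gamma * Q_target(s', a*) for a batch of transitions."""
    next_q_target = target_net(next_obs)                          # (B, n_actions)
    if double:
        # Double DQN: the online network selects a*, the target network evaluates it
        best_actions = online_net(next_obs).argmax(dim=1, keepdim=True)
        next_values = next_q_target.gather(1, best_actions).squeeze(1)
    else:
        # Vanilla DQN: the target network both selects and evaluates the action
        next_values = next_q_target.max(dim=1).values
    return rewards + gamma * (1.0 - dones) * next_values

if __name__ == "__main__":
    online, target = make_q_net(), make_q_net()
    target.load_state_dict(online.state_dict())                   # periodic hard sync, as in DQN

    batch = 32                                                     # placeholder random batch
    obs = torch.randn(batch, 4)
    actions = torch.randint(0, 2, (batch, 1))
    rewards = torch.randn(batch)
    next_obs = torch.randn(batch, 4)
    dones = torch.randint(0, 2, (batch,)).float()

    targets = td_targets(online, target, rewards, next_obs, dones, double=True)
    q_sa = online(obs).gather(1, actions).squeeze(1)              # Q(s, a) for taken actions
    loss = nn.functional.smooth_l1_loss(q_sa, targets)            # Huber loss, common for DQN
    loss.backward()                                               # an optimizer step would follow
    print(loss.item())
```
Keeping the target computation under `torch.no_grad()` mirrors the usual practice of not backpropagating through the bootstrap target.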
Black Jack
- Monte Carlo Prediction
- Monte Carlo Exploring Starts
CartPole
- Fully Connected Q-function
- DQN
- DDQN
- Dueling DQN
Cliff Walking
- SARSA
- Q-Learning
- Expected SARSA
Gambler's Problem
- Value Iteration (see the sketch below)
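For the value-iteration entry above, here is a hedged sketch of value iteration on a gambler's-problem-style MDP (betting on coin flips until reaching a capital goal). The head probability, goal, and convergence threshold are illustrative placeholders.
```python
import numpy as np

# Hedged value-iteration sketch for a gambler's-problem-style MDP; the head
# probability and goal below are placeholders, not this repository's settings.

GOAL = 100
P_HEADS = 0.4     # probability the coin comes up heads

def action_values(V, s, gamma=1.0):
    """Expected value of each stake; reaching GOAL yields reward 1, ruin yields 0."""
    stakes = list(range(1, min(s, GOAL - s) + 1))
    values = [P_HEADS * ((1.0 if s + a == GOAL else 0.0) + gamma * V[s + a])
              + (1 - P_HEADS) * gamma * V[s - a]
              for a in stakes]
    return stakes, values

def value_iteration(threshold=1e-9):
    V = np.zeros(GOAL + 1)            # terminal values (0 and GOAL) stay at 0
    while True:
        delta = 0.0
        for s in range(1, GOAL):
            _, values = action_values(V, s)
            best = max(values)
            delta = max(delta, abs(best - V[s]))
            V[s] = best                # in-place (Gauss-Seidel style) sweep
        if delta < threshold:
            break
    # greedy policy with respect to the converged value function
    policy = np.zeros(GOAL + 1, dtype=int)
    for s in range(1, GOAL):
        stakes, values = action_values(V, s)
        policy[s] = stakes[int(np.argmax(values))]
    return V, policy

if __name__ == "__main__":
    V, pi = value_iteration()
    print("V[50] =", round(float(V[50]), 4), " greedy stake at 50:", int(pi[50]))
```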
Grid World
- Iterative Policy Evaluation
Jack's Car Rental
- Policy Iteration
Lunar Lander
- REINFORCE using Non-linear Approximation
- VPG
Small MDP (Maximization Bias)
- Q-Learning
- Double Q-Learning
Mountain Climbing
- Semi-Gradient SARSA
- Semi-Gradient n-step SARSA
Multi-Armed Bandit
- Simple Bandit (see the sketch after this list)
- Gradient Bandit
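For the simple-bandit entry above, a hedged sketch of epsilon-greedy action selection with incremental sample-average estimates. The number of arms, epsilon, and horizon are illustrative placeholders.
```python
import numpy as np

# Hedged sketch of a simple epsilon-greedy bandit with sample-average estimates.
# The number of arms, epsilon, and step count below are placeholders.

def simple_bandit(n_arms=10, steps=2000, epsilon=0.1, seed=0):
    rng = np.random.default_rng(seed)
    true_values = rng.normal(0.0, 1.0, n_arms)   # hidden mean reward of each arm
    q = np.zeros(n_arms)                         # sample-average value estimates
    counts = np.zeros(n_arms, dtype=int)
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = int(rng.integers(n_arms))                          # explore
        else:
            arm = int(rng.choice(np.flatnonzero(q == q.max())))      # exploit, random tie-break
        reward = rng.normal(true_values[arm], 1.0)
        counts[arm] += 1
        q[arm] += (reward - q[arm]) / counts[arm]                    # incremental average update
        total_reward += reward
    return q, true_values, total_reward / steps

if __name__ == "__main__":
    estimates, truth, avg = simple_bandit()
    print("best arm:", int(np.argmax(truth)), "greedy pick:", int(np.argmax(estimates)))
    print("average reward:", round(avg, 3))
```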
Pendulum Swing-Up
- Actor-Critic using Tile-coding
- Actor-Critic with Continuous Action Space
Random Walk
- n-step TD Prediction
- Gradient Monte Carlo State Aggregation
- Gradient Monte Carlo Tile Coding
- Semi-Gradient TD(0) State Aggregation
Short Corridor Gridworld
- REINFORCE (Policy Gradient) using Linear Approximation (see the sketch after this list)
- REINFORCE with Baseline
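The sketch below illustrates REINFORCE with a linear softmax policy on a short-corridor-style task with state-independent features. The step size, episode count, and safety cap are assumed placeholders, not this repository's settings.
```python
import numpy as np

# Hedged REINFORCE sketch with a linear softmax policy on a short-corridor-style
# task (action effects reversed in the middle state, state-independent features).

FEATURES = np.array([[1.0, 0.0],   # feature vector for action 0 ("right")
                     [0.0, 1.0]])  # feature vector for action 1 ("left")

def step(state, action):
    """Reward -1 per step; state 3 is terminal; moves are reversed in state 1."""
    move = +1 if action == 0 else -1
    if state == 1:
        move = -move
    next_state = max(0, state + move)
    return next_state, -1.0, next_state == 3

def policy(theta):
    prefs = FEATURES @ theta
    prefs -= prefs.max()                        # numerical stability
    probs = np.exp(prefs)
    return probs / probs.sum()

def reinforce(episodes=3000, alpha=2e-4, gamma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    for _ in range(episodes):
        actions, rewards, state, done = [], [], 0, False
        while not done and len(rewards) < 10_000:      # safety cap on episode length
            action = int(rng.choice(2, p=policy(theta)))
            state, reward, done = step(state, action)
            actions.append(action)
            rewards.append(reward)
        # Monte Carlo returns, then per-step policy-gradient updates
        returns, G = np.zeros(len(rewards)), 0.0
        for t in reversed(range(len(rewards))):
            G = rewards[t] + gamma * G
            returns[t] = G
        for t in range(len(rewards)):
            probs = policy(theta)
            grad_log_pi = FEATURES[actions[t]] - probs @ FEATURES
            theta += alpha * (gamma ** t) * returns[t] * grad_log_pi
        # (a learned baseline can be subtracted from the return to reduce variance)
    return theta, policy(theta)

if __name__ == "__main__":
    theta, probs = reinforce()
    print("P(right) =", round(float(probs[0]), 3))
```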
Windy Grid World
- SARSA
Relevant Resources
- Reinforcement Learning: An Introduction. Richard S. Sutton and Andrew G. Barto
- Algorithms for Reinforcement Learning. Csaba Szepesvari
- Foundations of Deep Reinforcement Learning: Theory and Practice in Python. Laura Graesser and Wah Loon Keng
- Grokking Deep Reinforcement Learning. Miguel Morales
- Deep Reinforcement Learning Hands-On: Apply modern RL methods to practical problems of chatbots, robotics, discrete optimization, web automation, and more, 2nd Edition. Maxim Lapan
- Deep Reinforcement Learning with Python, Second Edition. Sudharsan Ravichandiran
- Deep Reinforcement Learning in Action. Brandon Brown and Alexander Zai
- Deep Reinforcement Learning: Fundamentals, Research and Applications. Hao Dong, Zihan Ding, and Shanghang Zhang
Key Papers
Artificial Intelligence
Reinforcement Learning
Deep Reinforcement Learning
- Actor-Critic
- REINFORCE - Williams, R. J. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229–256 (1992).
- Deep Reinforcement Learning
Value-based Methods
- NFQ - Riedmiller, M. Neural fitted Q iteration - First experiences with a data efficient neural Reinforcement Learning method. in Lecture Notes in Computer Science vol. 3720 LNAI 317–328 (Springer, Berlin, Heidelberg, 2005).
- DQN - Mnih, V. et al. Playing Atari with Deep Reinforcement Learning. (2013).
- DDQN - van Hasselt, H., Guez, A. & Silver, D. Deep Reinforcement Learning with Double Q-learning. 30th AAAI Conf. Artif. Intell. AAAI 2016 2094–2100 (2015).
- Dueling DQN - Wang, Z. et al. Dueling Network Architectures for Deep Reinforcement Learning. 33rd Int. Conf. Mach. Learn. ICML 2016 4, 2939–2947 (2015).
- PER - Schaul, T., Quan, J., Antonoglou, I. & Silver, D. Prioritized Experience Replay. 4th Int. Conf. Learn. Represent. ICLR 2016 - Conf. Track Proc. (2015).
- Rainbow - Hessel, M. et al. Rainbow: Combining Improvements in Deep Reinforcement Learning. 32nd AAAI Conf. Artif. Intell. AAAI 2018 3215–3222 (2017).
Policy-based Methods
Actor-Critic Methods
- AC
- A3C/A2C - Mnih, V. et al. Asynchronous Methods for Deep Reinforcement Learning. 33rd Int. Conf. Mach. Learn. ICML 2016 4, 2850–2869 (2016).
- GAE - Schulman, J., Moritz, P., Levine, S., Jordan, M. I. & Abbeel, P. High-dimensional continuous control using generalized advantage estimation. in 4th International Conference on Learning Representations, ICLR 2016 - Conference Track Proceedings (International Conference on Learning Representations, ICLR, 2016).
- PPO
- TRPO
Deterministic Actor-Critic Methods
- DPG - Silver, D. et al. Deterministic Policy Gradient Algorithms. 387–395 (2014).
- DDPG - Lillicrap, T. P. et al. Continuous control with deep reinforcement learning. 4th Int. Conf. Learn. Represent. ICLR 2016 - Conf. Track Proc. (2015).
- TD3 - Fujimoto, S., van Hoof, H. & Meger, D. Addressing Function Approximation Error in Actor-Critic Methods. 35th Int. Conf. Mach. Learn. ICML 2018 4, 2587–2601 (2018).
- SAC - Haarnoja, T., Zhou, A., Abbeel, P. & Levine, S. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. 35th Int. Conf. Mach. Learn. ICML 2018 5, 2976–2989 (2018).
When contributing to this repository, please first discuss the change you wish to make with me via an issue, email, or any other method before making the change.