Implementing different reinforcement learning algorithms on different gym environments and comparing results.

vstark21/RL_gym

RL gym

Implementing different reinforcement learning algorithms on different gym environments.

The following algorithms are implemented in this repo:

A2C
DDPG
Double_DQN
Dueling_DQN
TD3

They are tested on these environments:

CartPole
Pendulum
Acrobot
Lunar Lander Continuous

A2C

A2C is an on-policy, model-free reinforcement learning algorithm. Here is the pseudocode for A3C, which A2C follows closely (A2C is the synchronous variant of A3C).

Agent trained using A2C playing the Acrobot game.
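
As a rough illustration of the quantity A2C optimizes, the sketch below computes bootstrapped discounted returns for a rollout and subtracts the critic's value estimates to get advantages. The function names and the plain-list inputs are hypothetical and chosen for illustration; this is not the code used in this repo.

```python
def discounted_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """Bootstrapped discounted return for each step of a rollout.

    bootstrap_value is the critic's estimate V(s_T) for the state that
    follows the last step (hypothetical input, for illustration).
    """
    returns = []
    running = bootstrap_value
    for reward, done in zip(reversed(rewards), reversed(dones)):
        # Drop the bootstrap term whenever the episode ended at this step.
        running = reward + gamma * running * (0.0 if done else 1.0)
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage estimate A_t = R_t - V(s_t), used to weight the policy gradient."""
    return [r - v for r, v in zip(returns, values)]


# Tiny rollout: three steps, the episode terminates on the last one.
rollout_returns = discounted_returns(
    rewards=[1.0, 1.0, 1.0],
    dones=[False, False, True],
    bootstrap_value=10.0,
    gamma=0.5,
)
rollout_advantages = advantages(rollout_returns, values=[0.5, 0.5, 0.5])
```

In a full agent, the actor loss would weight the log-probability of each action by its advantage, and the critic would regress its value estimates toward the returns.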

DDPG

DDPG is an off-policy, model-free reinforcement learning algorithm. Here is the pseudocode for DDPG.

Agent trained using DDPG playing the Lunar Lander game.
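
Two pieces of DDPG are easy to show in isolation: the critic's bootstrapped target and the soft (Polyak) update of the target networks. The sketch below assumes parameters are flat lists of floats and uses hypothetical function names; it is not the implementation in this repo.

```python
def critic_target(reward, done, next_q, gamma=0.99):
    """Target y = r + gamma * (1 - done) * Q'(s', mu'(s')).

    next_q stands for the target critic evaluated at the target actor's
    action in the next state (hypothetical scalar input).
    """
    return reward + gamma * (0.0 if done else 1.0) * next_q


def soft_update(target_params, online_params, tau=0.005):
    """Move each target parameter a fraction tau toward its online counterpart."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


# Example: blend the target weights halfway toward the online weights.
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5)
```

A small tau keeps the targets changing slowly, which is what stabilizes the critic's regression problem.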

Double_DQN

Double DQN is an off-policy, model-free reinforcement learning algorithm. Here is the pseudocode for Double DQN.

Agent trained using Double DQN playing the CartPole game.
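
The defining step of Double DQN is that the online network selects the next action while the target network evaluates it. A minimal sketch of that target computation, with hypothetical names and Q-values passed in as plain lists:

```python
def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Target y = r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    # Action selection uses the online network's estimates...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...while the value of that action comes from the target network.
    bootstrap = 0.0 if done else next_q_target[best_action]
    return reward + gamma * bootstrap


# The online network prefers action 1, so the target network's value for
# action 1 is used, even though its own maximum is at action 0.
y = double_dqn_target(
    reward=1.0,
    done=False,
    next_q_online=[1.0, 3.0],
    next_q_target=[5.0, 2.0],
    gamma=0.5,
)
```

Plain DQN would bootstrap from the target network's own maximum (5.0 here); decoupling selection from evaluation is what reduces the overestimation bias.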

Dueling_DQN

Like Double DQN, Dueling DQN is a variant of DQN. Its network contains two separate estimators: one for the state value function and one for the state-dependent action advantage function.

Formula for the decomposition of the Q-value:

  • θ denotes the parameters shared by the network.
  • α parameterizes the output stream for the advantage function A.
  • β parameterizes the output stream for the value function V.
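
With the symbols above, the decomposition is commonly written in the mean-subtracted form below. This assumes the averaging variant of the dueling architecture; the formula originally shown here may have used the max-based variant instead.

```latex
Q(s, a; \theta, \alpha, \beta) = V(s; \theta, \beta)
  + \left( A(s, a; \theta, \alpha)
  - \frac{1}{|\mathcal{A}|} \sum_{a'} A(s, a'; \theta, \alpha) \right)
```

Here |𝒜| is the number of actions. Subtracting the mean advantage makes V and A identifiable, since otherwise a constant could be shifted between the two streams without changing Q.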

Agent trained using Dueling DQN playing the Acrobot game.
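
To make the two-stream idea concrete, the sketch below combines a scalar state value with per-action advantages into Q-values. It assumes the mean-subtracted aggregation and uses a hypothetical function name; the repo's network may combine the streams differently.

```python
def dueling_q_values(state_value, action_advantages):
    """Combine the value stream and the advantage stream into Q-values.

    Assumes the mean-subtracted aggregation:
    Q(s, a) = V(s) + (A(s, a) - mean over a' of A(s, a')).
    """
    mean_advantage = sum(action_advantages) / len(action_advantages)
    return [state_value + adv - mean_advantage for adv in action_advantages]


# One state with V(s) = 1.0 and three actions.
q_values = dueling_q_values(1.0, [1.0, 2.0, 3.0])
```

Because the mean advantage is subtracted, the average of the resulting Q-values equals the state value, while the ranking of actions is set entirely by the advantage stream.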

TD3

TD3 is an off-policy, model-free reinforcement learning algorithm. Here is the pseudocode for TD3.

Agent trained using TD3 playing the Pendulum game.
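
TD3 differs from DDPG mainly in how the critic target is built: clipped noise is added to the target action, and the smaller of two target critics is used for bootstrapping. The sketch below shows both steps for a scalar action. The function names, noise defaults, and the [-1, 1] action bounds are assumptions for illustration, not values taken from this repo.

```python
import random


def smoothed_target_action(target_action, noise_std=0.2, noise_clip=0.5,
                           low=-1.0, high=1.0, rng=random):
    """Target policy smoothing: add clipped Gaussian noise, then clip to bounds.

    low/high are assumed action bounds; adjust them to the environment.
    """
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    return max(low, min(high, target_action + noise))


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of the two critics."""
    return reward + gamma * (0.0 if done else 1.0) * min(q1_next, q2_next)


# q1_next and q2_next stand for the two target critics evaluated at the
# smoothed target action in the next state.
target = td3_target(reward=1.0, done=False, q1_next=4.0, q2_next=2.0, gamma=0.5)
```

The third TD3 ingredient, delayed policy updates, is a scheduling choice (updating the actor and target networks less often than the critics) and is not shown here.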
