Deep Reinforcement Learning
This repository contains my work on Deep Reinforcement Learning. The main topics are organized in sequence, each with its implementation in PyTorch. It also contains some of my projects and links on reinforcement learning. You are highly encouraged to modify and play with them!
- Notes: Implementations of various DRL algorithms, with accompanying notes
- Benchmarking DRL Algorithms on Classic Games (discrete)
- Benchmarking DRL Algorithms on Unity ML-Agents (continuous)
Notes
- Introduction to Reinforcement Learning
- Dynamic Programming: Implement Dynamic Programming algorithms such as Policy Evaluation, Policy Improvement, Policy Iteration, and Value Iteration.
- Monte Carlo: Implement Monte Carlo methods for prediction and control.
- Temporal Difference: Implement Temporal-Difference methods such as Sarsa, Q-Learning, and Expected Sarsa.
- Deep Q-learning
- Policy Gradients
- Actor Critic (A2C & A3C)
- Proximal Policy Optimization
- Deep Deterministic Policy Gradient
- Twin Delayed DDPG
- Soft Actor-Critic
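As a rough illustration of the Temporal Difference item above, the three control methods it names differ only in how they build the bootstrapped target. The sketch below is not taken from the notebooks: it uses plain Python rather than PyTorch so it runs without dependencies, the function names and the dictionary-based Q-table are hypothetical, and it assumes an epsilon-greedy policy and a non-terminal next state.

```python
from typing import Dict, List

# Hypothetical tabular layout: state id -> list of action values.
QTable = Dict[int, List[float]]


def sarsa_target(q: QTable, reward: float, next_state: int,
                 next_action: int, gamma: float) -> float:
    # On-policy: bootstrap from the action actually taken next.
    return reward + gamma * q[next_state][next_action]


def q_learning_target(q: QTable, reward: float, next_state: int,
                      gamma: float) -> float:
    # Off-policy: bootstrap from the greedy action in the next state.
    return reward + gamma * max(q[next_state])


def expected_sarsa_target(q: QTable, reward: float, next_state: int,
                          gamma: float, epsilon: float) -> float:
    # Bootstrap from the expectation under an epsilon-greedy policy.
    values = q[next_state]
    n = len(values)
    greedy = max(range(n), key=values.__getitem__)
    probs = [epsilon / n] * n
    probs[greedy] += 1.0 - epsilon
    expected = sum(p * v for p, v in zip(probs, values))
    return reward + gamma * expected


def td_update(q: QTable, state: int, action: int,
              target: float, alpha: float) -> None:
    # Shared update rule: move Q(s, a) a step of size alpha toward the target.
    q[state][action] += alpha * (target - q[state][action])
```

A typical step computes one of the targets and then calls `td_update(q, state, action, target, alpha)`; a terminal transition would use the reward alone as the target.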
Benchmarking DRL Algorithms (Discrete) on Classic Games
We used classic games from OpenAI Gym and ViZDoom as our main testbed to study the behaviour of the following algorithms:
- DQN - Deep Q-learning
- DDQN - Dueling DQN
- Rainbow
- REINFORCE + Actor Critic
- A2C - Advantage Actor Critic
- PPO - Proximal Policy Optimization
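To make the "Dueling DQN" entry above more concrete, here is a minimal sketch of the dueling aggregation step, which combines a state value with per-action advantages. It is written in plain Python so it runs without dependencies and is not the repository's PyTorch code; the function name `dueling_q_values` is hypothetical, and the mean-subtracted form is assumed.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    # Subtracting the mean advantage keeps V and A identifiable:
    # the resulting Q-values always average to the state value.
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]
```

In a network, `state_value` and `advantages` would come from two separate output heads that share the same feature layers.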
We compare the results of running these six algorithms on the games from two perspectives: training and testing. The same deep neural network is used for all algorithms. Click on a particular game for more information.
Games | Game Difficulty | Implementations |
---|---|---|
Atari Ping Pong | Read More | |
Atari Space Invaders | Read More | |
Doom Defend Center | Read More | |
Doom Deadly Corridor | Read More | |
Sonic the Hedgehog | Read More | |
Benchmarking DRL Algorithms (Continuous) on Unity ML-Agents
We used ML-Agents from Unity as our main testbed to study the behaviour of the following algorithms:
- PPO - Proximal Policy Optimization
- DDPG - Deep Deterministic Policy Gradient
- TD3 - Twin Delayed DDPG
- SAC - Soft Actor-Critic
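As an illustration of what separates TD3 from DDPG in the list above, the following sketch computes a TD3-style critic target with target policy smoothing and the minimum over two target critics. It is a simplified plain-Python version and not the repository's PyTorch code: the action is a single scalar, the two critics are hypothetical callables that already have the next state folded in, and the default noise parameters are assumptions.

```python
import random
from typing import Callable, Optional


def clip(x: float, low: float, high: float) -> float:
    return max(low, min(high, x))


def td3_target(reward: float, done: bool, gamma: float, next_action: float,
               q1_target: Callable[[float], float],
               q2_target: Callable[[float], float],
               noise_std: float = 0.2, noise_clip: float = 0.5,
               max_action: float = 1.0,
               rng: Optional[random.Random] = None) -> float:
    rng = rng or random.Random()
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    smoothed_action = clip(next_action + noise, -max_action, max_action)
    # Clipped double-Q: take the smaller of the two target critic estimates.
    min_q = min(q1_target(smoothed_action), q2_target(smoothed_action))
    # No bootstrapping past a terminal transition.
    return reward + gamma * (1.0 - float(done)) * min_q
```

DDPG would use a single target critic and no smoothing noise; taking the minimum of two critics is meant to reduce overestimation of action values.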
We compare the results of running these four algorithms on the games from two perspectives: training and testing. The same deep neural network is used for all algorithms. Click on a particular game for more information.
Games | Game Difficulty | Implementations |
---|---|---|
3D Balance Ball | Read More | |
Tennis | Read More | |
Wall Jump | Read More | |
Reacher | Read More | |
Soccer Twos | Read More | |
Walker | Read More | |
References
The algorithm implementations resemble the references below. The content is for educational purposes only; no claim is made to the originality of the content or the structure of the repository.
- https://github.com/udacity/deep-reinforcement-learning (Udacity Deep Reinforcement Learning Nanodegree)
- https://simoninithomas.github.io/Deep_reinforcement_learning_Course (Deep Reinforcement Learning course by SIMONINI Thomas)
- https://github.com/higgsfield/RL-Adventure (RL-Adventure: DQN by Dulat Yerzat)
- https://github.com/higgsfield/RL-Adventure-2 (RL-Adventure-2: Policy Gradients by Dulat Yerzat)
Any questions?
If you have any questions, feel free to ask me:
- Mail: deepanshut041@gmail.com
- GitHub: https://github.com/deepanshut041/Reinforcement-Learning
- Website: https://deepanshut041.github.io/Reinforcement-Learning
- Twitter: @deepanshut041
Don’t forget to follow me on Twitter, GitHub, and Medium to be notified of new articles that I publish.
How to help
- Clap on articles: Clapping on Medium shows that you really like my articles. The more claps an article gets, the more it is shared, which makes it more visible to the deep learning community.
- Improve our notebooks: If you find a bug or a better implementation, you can send a pull request.