These introductory notes cover Bandit Algorithms, Markov Decision Processes (MDPs), Model-free Methods, Value Function Approximation, and Policy Optimization. For state-of-the-art advances, refer directly to the papers and to some excellent blogs.
I hope you enjoy learning.