General information:

- course program: PDF version
- reference books
- a series of fun videos showing applications of reinforcement learning
- course forum (questions and discussions)
- exercises are evaluated on the CodaLab challenge platform
- details and day-to-day information are sent by email, so make sure you are on the mailing list

Chapter 1: Introduction, Bandits, and Combination of Experts for time-series prediction

Chapter 2: Learning dynamics (Bellman equation, Dynamic Programming, Monte Carlo, Temporal Difference TD(0), Q-learning, SARSA)
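
As a taste of the temporal-difference methods listed for this chapter, here is a minimal sketch of a single tabular Q-learning update. The function name `q_learning_update`, the toy states and actions, the step size `alpha`, and the discount `gamma` are illustrative assumptions, not taken from the course material.

```python
from collections import defaultdict

def q_learning_update(Q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    Q maps (state, action) pairs to estimated returns; alpha and gamma are
    hypothetical defaults chosen only for this illustration.
    """
    # Greedy value of the next state (the "max" makes Q-learning off-policy).
    best_next = max(Q[(next_state, a)] for a in actions)
    # Temporal-difference error between the bootstrapped target and the estimate.
    td_error = reward + gamma * best_next - Q[(state, action)]
    Q[(state, action)] += alpha * td_error
    return Q[(state, action)]

# Toy usage: all values start at 0, one transition with reward 1.
Q = defaultdict(float)
new_value = q_learning_update(Q, "s0", "right", 1.0, "s1", ["left", "right"])
```

SARSA would differ only in the target: it would use the value of the action actually taken in the next state rather than the maximum over actions.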

Chapter 3: Learning dynamics II (Eligibility traces, TD(λ), generalization and function approximation, with an Atari-playing agent as an example)

Chapter 4: Learning dynamics III (policy gradient), Monte Carlo Tree Search (minimax trees, alpha-beta pruning, Upper Confidence Trees, applied to Go with Crazy Stone/MoGo/AlphaGo)

Chapter 5: Entropy
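
For orientation, the Shannon entropy of a discrete distribution can be computed in a few lines. This is only a sketch: the function name `shannon_entropy` and the choice of base-2 logarithms (so the result is in bits) are assumptions made for illustration.

```python
import math

def shannon_entropy(probs):
    """Return the Shannon entropy, in bits, of a discrete distribution.

    `probs` is assumed to be a sequence of non-negative numbers summing to 1.
    Zero-probability outcomes are skipped, following the convention 0*log(0) = 0.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)

fair_coin = shannon_entropy([0.5, 0.5])       # one bit of uncertainty
uniform_four = shannon_entropy([0.25] * 4)    # two bits of uncertainty
```

A uniform distribution maximizes the entropy for a given number of outcomes, while a deterministic outcome has entropy zero.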

Chapter 6: Compression/Prediction/Generation equivalence

Chapter 7: Kolmogorov complexity

Chapter 8: Fisher information

Chapter 9: Reinforcement learning based on information theory (e.g., Phi-MDP, KL-UCB, AIXI), and robotics

PS: we are looking for students to work on various topics, from machine learning to computer vision!
