Machine Learning II course:
Information Theory & Reinforcement Learning
General information:
course program: PDF version
reference books
a series of fun video examples of applications of reinforcement learning
course forum (questions and discussions)
the exercises are evaluated on the CodaLab challenge platform
details and day-to-day information are sent by email (so be sure you are on the mailing list)
*** Bring your laptop! ***
Part I : Reinforcement Learning
Partial lecture notes for this part are available here.
Chapter 1 : Introduction, Bandits, and Combination of Experts for time series prediction
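As a quick taste of the bandit material in this chapter, here is a minimal sketch of an epsilon-greedy agent on a Bernoulli multi-armed bandit (illustrative only, not course material; the function name, arm means, and parameters are made up):

```python
import random

def epsilon_greedy_bandit(true_means, steps=10_000, eps=0.1, seed=0):
    """Epsilon-greedy on a Bernoulli bandit: explore with prob. eps,
    otherwise pull the arm with the highest estimated mean."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k      # pulls per arm
    values = [0.0] * k    # running mean reward per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.randrange(k)                         # explore
        else:
            a = max(range(k), key=lambda i: values[i])   # exploit
        r = 1.0 if rng.random() < true_means[a] else 0.0
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]         # incremental mean
        total += r
    return values, total / steps

values, avg_reward = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

With these toy means the agent quickly concentrates its pulls on the best arm (mean 0.8), so the average reward ends up close to 0.9 * 0.8 + 0.1 * 0.5.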
Chapter 2 : Learning dynamics (Bellman equation, Dynamic Programming, Monte Carlo, Temporal Difference(0), Q-learning, Sarsa)
Reference: chapters 3-6 from Sutton & Barto
Exercises:
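To illustrate the Q-learning update covered in this chapter, here is a minimal tabular sketch on a toy chain MDP (the MDP, function name, and hyperparameters are ours, chosen for illustration):

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning on a chain: states 0..n-1, actions 0=left / 1=right,
    reward 1 for reaching the rightmost (terminal) state."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy behaviour policy
            a = rng.randrange(2) if rng.random() < eps else (0 if Q[s][0] > Q[s][1] else 1)
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Q-learning update: bootstrap on the greedy value of the next state
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning_chain()
```

After training, every non-terminal state should prefer "right", and Q(3, right) converges to the immediate reward 1.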
Chapter 3 : Learning dynamics II (Eligibility traces, TD(lambda), generalization and function approximation, example with Atari player)
Chapter 4 : Learning dynamics III (policy gradient), Monte Carlo Tree Search (minimax trees, alpha-beta pruning, Upper Confidence Tree, applied to Go with CrazyStone/MoGo/AlphaGo)
Exercises:
References for function generalization / policy gradient:
References for MCTS and Go:
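The selection step of UCT-style Monte Carlo Tree Search can be sketched with the UCB1 rule it is built on (a standalone illustration; the function name and exploration constant are ours):

```python
import math

def ucb1_select(child_wins, child_visits, parent_visits, c=1.4):
    """UCB1 child selection used in UCT/MCTS: pick the child maximising
    mean value + exploration bonus; unvisited children go first."""
    best, best_score = None, -math.inf
    for i, (w, n) in enumerate(zip(child_wins, child_visits)):
        if n == 0:
            return i  # always try an unvisited child before scoring
        score = w / n + c * math.sqrt(math.log(parent_visits) / n)
        if score > best_score:
            best, best_score = i, score
    return best
```

In a full MCTS loop this rule is applied at every tree node on the way down, before expansion, rollout, and backup.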
Part II : Information Theory
Chapter 5 : Entropy
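The central definition of this chapter fits in a few lines; a minimal sketch of Shannon entropy in bits (the function name is ours):

```python
import math

def entropy_bits(p):
    """Shannon entropy H(p) = -sum_i p_i log2 p_i, in bits,
    with the convention 0 * log 0 = 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)
```

For example, a fair coin has entropy 1 bit, a uniform 4-way choice 2 bits, and a deterministic outcome 0 bits.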
Chapter 6 : Compression/Prediction/Generation equivalence
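The compression/prediction equivalence can be illustrated with a sequential predictor: the code length of a string under arithmetic coding is exactly the sum of -log2 of the probabilities the predictor assigned to each observed symbol. A minimal sketch with the Krichevsky-Trofimov estimator on bits (function name and test string are ours):

```python
import math

def kt_codelength_bits(bits):
    """Code length (in bits) a Krichevsky-Trofimov sequential predictor
    assigns to a binary string: p(next = 1) = (n1 + 1/2) / (n0 + n1 + 1)."""
    n0 = n1 = 0
    total = 0.0
    for b in bits:
        p1 = (n1 + 0.5) / (n0 + n1 + 1.0)
        total += -math.log2(p1 if b else 1.0 - p1)  # prediction = codelength
        if b:
            n1 += 1
        else:
            n0 += 1
    return total

biased = [1] * 90 + [0] * 10   # highly predictable -> short code
```

On this biased string the code length comes out near 100 * H(0.9) ≈ 47 bits plus a small redundancy, versus 100 bits for the raw string.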
Chapter 7 : Kolmogorov complexity
Chapter 8 : Fisher information
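A small numerical check of the definition covered here: the Fisher information I(theta) = -E[d²/dtheta² log p(x|theta)], computed for a Bernoulli model by taking the exact expectation over x in {0, 1} with a finite-difference second derivative (function name and step size are ours):

```python
import math

def fisher_information_bernoulli(theta, h=1e-4):
    """Fisher information of Bernoulli(theta) via an exact expectation
    over x and a central finite difference for the second derivative.
    Analytically, I(theta) = 1 / (theta * (1 - theta))."""
    def loglik(x, t):
        return x * math.log(t) + (1 - x) * math.log(1 - t)
    info = 0.0
    for x, px in ((1, theta), (0, 1 - theta)):
        d2 = (loglik(x, theta + h) - 2 * loglik(x, theta) + loglik(x, theta - h)) / h**2
        info += -px * d2
    return info
```

For theta = 0.5 this recovers I = 4; information blows up as theta approaches 0 or 1, where a single observation becomes very informative.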
[To be updated] Summary of the lesson (plus an incomplete, roughly formatted, temporary PDF version)
References: Jérémy Bensadon's notes from Yann Ollivier's lectures
[Natural gradient] talk 6 + [invariance] talks 4-5
[Newton as a special case of natural gradient] talk 5
[Universal coding] talk 2 + Cover & Thomas
[Parameter precision] talk 2
[Jeffreys prior] talk 3
[BIC] talks 1+7
Part III : Reinforcement Learning using Information Theory, and other advanced topics
Chapter 9 : Reinforcement learning based on information theory (e.g., Phi-MDP, KL-UCB, AIXI), and robotics
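One point of contact between the two halves of the course is the KL-UCB bandit index mentioned here: the optimistic arm value is the largest mean still statistically compatible (in KL divergence) with the observations. A minimal Bernoulli sketch via bisection (function names and the plain log(t) exploration term are our simplified choices):

```python
import math

def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb_index(mean, pulls, t, iters=50):
    """KL-UCB index of one arm: the largest q >= mean with
    pulls * KL(mean, q) <= log(t), found by bisection."""
    budget = math.log(t) / pulls
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if bernoulli_kl(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo
```

The index always sits at or above the empirical mean, and shrinks back toward it as the arm accumulates pulls.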
PS: we are looking for students to work on various topics, from machine learning to computer vision!
Back to main page