Machine Learning II course:
Information Theory & Reinforcement Learning
General information:
course program: PDF version
reference books
a series of fun video examples of applications of reinforcement learning
course forum (questions and discussions)
the exercises are evaluated on the CodaLab challenge platform
details and day-to-day information are sent by email (so be sure you are on the mailing list)
*** Bring your laptop! ***
Part I : Reinforcement Learning
Partial lecture notes for this part are available here.
Chapter 1 : Introduction, Bandits, and Combination of Experts for time series prediction
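As a quick taste of the bandit material in this chapter, here is a minimal sketch of an epsilon-greedy agent on a Bernoulli multi-armed bandit (illustrative only, not course material; the function name, arm means, and parameters are made up):

```python
import random

def epsilon_greedy_bandit(true_means, steps=10_000, eps=0.1, seed=0):
    """Epsilon-greedy on a Bernoulli bandit: explore with prob. eps,
    otherwise pull the arm with the highest estimated mean."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k      # pulls per arm
    values = [0.0] * k    # running mean reward per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.randrange(k)                         # explore
        else:
            a = max(range(k), key=lambda i: values[i])   # exploit
        r = 1.0 if rng.random() < true_means[a] else 0.0
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]         # incremental mean
        total += r
    return values, total / steps

values, avg_reward = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

With these toy means the agent quickly concentrates its pulls on the best arm (mean 0.8), so the average reward ends up close to 0.9 * 0.8 + 0.1 * 0.5.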
Chapter 2 : Learning dynamics (Bellman equation, Dynamic Programming, Monte Carlo, Temporal Difference(0), Q-learning, Sarsa)
Reference: chapters 3-6 from Sutton & Barto
Exercises:
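To illustrate the Q-learning update covered in this chapter, here is a minimal tabular sketch on a toy chain MDP (the MDP, function name, and hyperparameters are ours, chosen for illustration):

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning on a chain: states 0..n-1, actions 0=left / 1=right,
    reward 1 for reaching the rightmost (terminal) state."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy behaviour policy
            a = rng.randrange(2) if rng.random() < eps else (0 if Q[s][0] > Q[s][1] else 1)
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Q-learning update: bootstrap on the greedy value of the next state
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning_chain()
```

After training, every non-terminal state should prefer "right", and Q(3, right) converges to the immediate reward 1.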
Chapter 3 : Learning dynamics II (Eligibility traces, TD(lambda), generalization and function approximation, example with Atari player)
Chapter 4 : Learning dynamics III (policy gradient), Monte Carlo Tree Search (minimax trees, alpha-beta pruning, Upper Confidence Tree, applied to Go with CrazyStone/MoGo/AlphaGo)
Exercises:
References for function generalization / policy gradient:
References for MCTS and Go:
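The selection step of UCT-style Monte Carlo Tree Search can be sketched with the UCB1 rule it is built on (a standalone illustration; the function name and exploration constant are ours):

```python
import math

def ucb1_select(child_wins, child_visits, parent_visits, c=1.4):
    """UCB1 child selection used in UCT/MCTS: pick the child maximising
    mean value + exploration bonus; unvisited children go first."""
    best, best_score = None, -math.inf
    for i, (w, n) in enumerate(zip(child_wins, child_visits)):
        if n == 0:
            return i  # always try an unvisited child before scoring
        score = w / n + c * math.sqrt(math.log(parent_visits) / n)
        if score > best_score:
            best, best_score = i, score
    return best
```

In a full MCTS loop this rule is applied at every tree node on the way down, before expansion, rollout, and backup.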
Part II : Information Theory
Chapter 5 : Entropy
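The central definition of this chapter fits in a few lines; a minimal sketch of Shannon entropy in bits (the function name is ours):

```python
import math

def entropy_bits(p):
    """Shannon entropy H(p) = -sum_i p_i log2 p_i, in bits,
    with the convention 0 * log 0 = 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)
```

For example, a fair coin has entropy 1 bit, a uniform 4-way choice 2 bits, and a deterministic outcome 0 bits.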
Chapter 6 : Compression/Prediction/Generation equivalence
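The compression/prediction equivalence can be illustrated with a sequential predictor: the code length of a string under arithmetic coding is exactly the sum of -log2 of the probabilities the predictor assigned to each observed symbol. A minimal sketch with the Krichevsky-Trofimov estimator on bits (function name and test string are ours):

```python
import math

def kt_codelength_bits(bits):
    """Code length (in bits) a Krichevsky-Trofimov sequential predictor
    assigns to a binary string: p(next = 1) = (n1 + 1/2) / (n0 + n1 + 1)."""
    n0 = n1 = 0
    total = 0.0
    for b in bits:
        p1 = (n1 + 0.5) / (n0 + n1 + 1.0)
        total += -math.log2(p1 if b else 1.0 - p1)  # prediction = codelength
        if b:
            n1 += 1
        else:
            n0 += 1
    return total

biased = [1] * 90 + [0] * 10   # highly predictable -> short code
```

On this biased string the code length comes out near 100 * H(0.9) ≈ 47 bits plus a small redundancy, versus 100 bits for the raw string.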
Chapter 7 : Kolmogorov complexity
Chapter 8 : Fisher information
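A small numerical check of the definition covered here: the Fisher information I(theta) = -E[d²/dtheta² log p(x|theta)], computed for a Bernoulli model by taking the exact expectation over x in {0, 1} with a finite-difference second derivative (function name and step size are ours):

```python
import math

def fisher_information_bernoulli(theta, h=1e-4):
    """Fisher information of Bernoulli(theta) via an exact expectation
    over x and a central finite difference for the second derivative.
    Analytically, I(theta) = 1 / (theta * (1 - theta))."""
    def loglik(x, t):
        return x * math.log(t) + (1 - x) * math.log(1 - t)
    info = 0.0
    for x, px in ((1, theta), (0, 1 - theta)):
        d2 = (loglik(x, theta + h) - 2 * loglik(x, theta) + loglik(x, theta - h)) / h**2
        info += -px * d2
    return info
```

For theta = 0.5 this recovers I = 4; information blows up as theta approaches 0 or 1, where a single observation becomes very informative.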
[To be updated] Summary of the lesson (plus an incomplete, roughly formatted, temporary PDF version)
References: Jérémy Bensadon's notes from Yann Ollivier's lectures
[Natural gradient] talk 6 + [invariance] talks 4-5
[Newton as a special case of natural gradient] talk 5
[Universal coding] talk 2 + Cover & Thomas
[Parameter precision] talk 2
[Jeffreys prior] talk 3
[BIC] talks 1+7
Part III : Reinforcement Learning using Information Theory, and other advanced topics
Chapter 9 : Reinforcement learning based on information theory (e.g., Phi-MDP, KL-UCB, AIXI), and robotics
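One point of contact between the two halves of the course is the KL-UCB bandit index mentioned here: the optimistic arm value is the largest mean still statistically compatible (in KL divergence) with the observations. A minimal Bernoulli sketch via bisection (function names and the plain log(t) exploration term are our simplified choices):

```python
import math

def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb_index(mean, pulls, t, iters=50):
    """KL-UCB index of one arm: the largest q >= mean with
    pulls * KL(mean, q) <= log(t), found by bisection."""
    budget = math.log(t) / pulls
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if bernoulli_kl(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo
```

The index always sits at or above the empirical mean, and shrinks back toward it as the arm accumulates pulls.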
PS: we are looking for students to work on various topics, from machine learning to computer vision!
Back to main page