Français Anglais
Accueil Annuaire Plan du site
Home > Research results > Dissertations & habilitations
Research results
Ph.D de

Ph.D
Group : Learning and Optimization

Amélioration du MCTS par des techniques d'apprentissage supervisé et applications dans le domaine de l'énergie

Starts on 01/09/2010
Advisor : TEYTAUD, Olivier

Funding :
Affiliation : Université Paris-Sud
Laboratory : LRI-TAO

Defended on 30/09/2013, committee :
Philippe Preux, Professeur Univ. Lille 3, rapporteur
- Nicolas Sabouret, professeur Supelec, LIMSI/AMI
- Tristan Cazenave, Professeur Univ. Paris Dauphine, examinateur
- Olivier Buffet, CR Inria, LORIA-Maia, Vandœuvre-lès-Nancy, examinateur
- Simon Lucas, Professor, Essex University, examinateur
- Marc Schoenauer, DR Inria, LRI/Tao, examinateur
- Louis Wehenkel, Professeur, Univ. Montefiore, Belgique, rapporteur

Research activities :

Abstract :
We consider the class of problems where an agent is making a series of actions, navigating from state to state, and gathering rewards on the way. We look at the general case, where actions and states can be infinite, and where the transition associated to a given couple action-state can be stochastic.

One of the motivating application is the problem of energy stock management. This problem is the one faced by an agent operating energy production facilities, along with energy stocks (batteries, water stocks...), under uncertainty (fuel prices, energy demand, weather, etc). State of the art methods need a simplified version of the model to run properly.
MCTS does not need to simplify the model of this problem, and could thus be a direction to improve existing solutions to the energy stock management problem.

In this work, we extend the vanilla version of MCTS (the one that has been successful on the game of Go, with finite domain and deterministic transitions) to make it consistent on continuous domains and stochastic transitions.
We also present ways to increase the convergence speed of continuous MCTS, and show some empirical results on simulations of the stock management problem.
We show some other applications of continuous MCTS, to POMDP (with an application to Mine Sweeper), and in a meta-bandit framework.
Finally we introduce a framework to easily include expert knowledge in MCTS, making it a way to improve existing policies.

Continuous MCTS is interesting in the sense that its main restriction is that it is a model based method (it requires a generative model of the problem). Many other methods require things that MCTS does not need, and that are not always easy to guarantee. These assumptions include convexity assumption (of the action space and/or of the cost function), explicit knowledge of the action space (to discretize it for example), Markovian random process, etc.
Continuous MCTS is also an anytime algorithm (it can be interrupted at any time and still return a suboptimal but valid solution).

Ph.D. dissertations & Faculty habilitations
APPRENTISSAGE ET OPTIMISATION SUR LES GRAPHES


ANALYSE DE DONNéES MULTI-MODALES POUR LES PATHOLOGIES COMPLEXES PAR LA CONCEPTION ET L’IMPLéMENTATION DE PROTOCOLES REPRODUCTIBLES ET RéUTILISABLES


DESIGNING INTERACTIVE TOOLS FOR CREATORS AND CREATIVE WORK
Creative work has been at the core of research in Human-Computer Interaction (HCI). I describe the results of a series of studies that look at how creators work, where creators include artists with years of professional practice, as well as learners, or novices and casual makers. My research focuses on three creation activities: drawing, physical modeling, and music composition. For these activities, I examine how artists switch between representations and how these representations evolve throughout their creative process, from early sketches to fine-grained forms or structured vocabularies. I present interactive systems that enrich their workflow (i) by extending their computer tools with physical user interfaces, or (ii) by making physical materials interactive. I also argue that sketch-based representations can allow for user interfaces that are more personal and less rigid. My presentation will reflect on lessons and limitations of this work and discuss challenges for future design-support tools.