Français Anglais
Accueil Annuaire Plan du site
Home > Research results > Dissertations & habilitations
Research results
Ph.D de

Ph.D
Group : Learning and Optimization

Multi-Objective Sequential Decision Making

Starts on 23/09/2010
Advisor : SEBAG, Michèle
[SCHOENAUER Marc]

Funding :
Affiliation : Université Paris-Sud
Laboratory : LRI

Defended on 11/07/2014, committee :
Directrice de Thèse :

- Michèle Sebag, DR CNRS, Université Paris-Sud, LRI/TAO

Examinateurs :

- Yann Chevaleyre, Professeur, Université Paris 13, LIPN
- Cécile Germain-Renaud, Professeur, Université Paris-Sud, LRI/TAO
- Dominique Gouyou-Beauchamps, Professeur, Université Paris-Sud, LRI

Rapporteurs :

- Jin-Kao Hao, Professeur, Université d'Angers, LERIA
- Philippe Preux, Professeur, Université de Lille 3, INRIA Lille

Research activities :

Abstract :
This thesis is concerned with multi-objective sequential decision making (MOSDM).
The motivation is twofold. On the one hand, many decision problems in the domains of e.g., robotics, scheduling or games, involve the optimization of sequences of decisions. On the other hand, many real-world applications are most naturally formulated in terms
of multi-objective optimization (MOO).

The proposed approach extends the well-known Monte-Carlo tree search (MCTS) framework to the MOO setting, with the goal of discovering several optimal sequences of decisions through growing a single search tree. The main challenge is to propose a new reward, able to guide the exploration of the tree although the MOO setting does not enforce a total order among solutions.

The main contribution of the thesis is to propose and experimentally study two such rewards, inspired from the MOO literature and assessing a solution with respect to the archive of previous solutions (Pareto archive): the hypervolume indicator and the Pareto dominance reward.

The study shows the complementarity of these two criteria. The hypervolume indicator su ers from its known computational complexity; however the proposed extension thereof provides ne-grained information abut the quality of solutions with respect to the current archive. Quite the contrary, the Pareto-dominance reward is linear but it provides increasingly rare information.

Proofs of principle of the approach are given on arti cial problems and challenges, and con firm the merits of the approach. In particular, MOMCTS is able to discover policies lying in non-convex regions of the Pareto front, contrasting with the state of the art: existing Multi-Objective Reinforcement Learning algorithms are based on linear scalarization and thus fail to sample such non-convex regions.

Finally MOMCTS honorably competes with the state of the art on the 2013 MOPTSP competition.

Ph.D. dissertations & Faculty habilitations
APPRENTISSAGE ET OPTIMISATION SUR LES GRAPHES


ANALYSE DE DONNéES MULTI-MODALES POUR LES PATHOLOGIES COMPLEXES PAR LA CONCEPTION ET L’IMPLéMENTATION DE PROTOCOLES REPRODUCTIBLES ET RéUTILISABLES


DESIGNING INTERACTIVE TOOLS FOR CREATORS AND CREATIVE WORK
Creative work has been at the core of research in Human-Computer Interaction (HCI). I describe the results of a series of studies that look at how creators work, where creators include artists with years of professional practice, as well as learners, or novices and casual makers. My research focuses on three creation activities: drawing, physical modeling, and music composition. For these activities, I examine how artists switch between representations and how these representations evolve throughout their creative process, from early sketches to fine-grained forms or structured vocabularies. I present interactive systems that enrich their workflow (i) by extending their computer tools with physical user interfaces, or (ii) by making physical materials interactive. I also argue that sketch-based representations can allow for user interfaces that are more personal and less rigid. My presentation will reflect on lessons and limitations of this work and discuss challenges for future design-support tools.