Ph.D. of

Group: Learning and Optimization

Meta-Learning as a Markov Decision Process

Starts on 01/10/2016
Advisor: GUYON, Isabelle

Funding:
Affiliation: Université Paris-Sud
Laboratory: LRI - Amphithéâtre du Bâtiment 660

Defended on 19/12/2019; committee:
Mr. Nicolas Thiéry, Professor, LRI, Université Paris-Sud, France | President
Ms. Cécile Capponi, Associate Professor (HDR), Université d'Aix-Marseille, France | Reviewer
Mr. Daniel Silver, Professor, Acadia University, Canada | Reviewer
Mr. Hugo Jair Escalante, Professor, Instituto Nacional de Astrofisica, Optica y Electronica, Mexico | Examiner
Mr. Joaquin Vanschoren, Professor, Eindhoven University of Technology, Netherlands | Examiner

Research activities:

Abstract :
Machine Learning (ML) has enjoyed huge successes in recent years, and an ever-growing number of real-world applications rely on it. However, designing promising algorithms for a specific problem still requires considerable human effort. Automated Machine Learning (AutoML) aims to take the human out of the loop by developing machines that generate or recommend good algorithms for a given ML task. AutoML is usually treated as an algorithm/hyper-parameter selection problem; existing approaches include Bayesian optimization, evolutionary algorithms, and reinforcement learning. Among them, auto-sklearn, which incorporates meta-learning techniques in its search initialization, ranks consistently well in AutoML challenges. This observation oriented my research toward the meta-learning domain, for which I developed a novel framework based on Markov Decision Processes (MDP) and reinforcement learning (RL).
After a general introduction, my thesis work started with an in-depth analysis of the results of the AutoML challenge. This analysis oriented my work towards meta-learning, leading me first to formulate AutoML as a recommendation problem, and ultimately to conceptualize it as an MDP. In the MDP setting, the problem reduces to filling in, as quickly and efficiently as possible, a meta-learning matrix S, in which rows correspond to ML tasks and columns to ML algorithms. A matrix element S(i,j) is the performance of algorithm j applied to task i. Searching efficiently for the best values in S allows us to quickly identify the algorithms best suited to given tasks.
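The matrix-filling view can be sketched concretely. In the snippet below, all task names, algorithm names, and scores are invented for illustration; they are not taken from the thesis:

```python
import numpy as np

# Hypothetical meta-learning matrix S: rows are ML tasks, columns are ML
# algorithms, and S[i, j] is the score of algorithm j on task i.
tasks = ["adult", "digits", "cadata"]
algorithms = ["random_forest", "svm", "gradient_boosting", "knn"]
S = np.array([
    [0.81, 0.74, 0.85, 0.62],
    [0.93, 0.96, 0.95, 0.91],
    [0.70, 0.65, 0.78, 0.58],
])

# With a fully observed S, finding the best algorithm per task is a simple
# argmax; the hard problem addressed in the thesis is doing this when most
# of S has not yet been filled in (each entry costs a training run).
best = {tasks[i]: algorithms[int(np.argmax(S[i]))] for i in range(len(tasks))}
print(best)
```

The cost of filling each cell (a full train/evaluate cycle) is what makes efficient exploration of S worthwhile.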
After reviewing the classical hyper-parameter optimization framework, I will introduce my first meta-learning approach, ActivMetaL, which combines active learning and collaborative filtering techniques to predict the missing values in S. Our latest research then applies RL to the MDP we defined, in order to learn an efficient policy for exploring S. We call this approach REVEAL and propose an analogy with a series of toy games to help visualize agents' strategies for revealing information progressively.
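One way to picture the ActivMetaL idea is collaborative-filtering-style completion of a partially observed S, followed by a greedy query of the predicted best untried algorithm. The sketch below is a simplified stand-in under assumed details (rank-1 SVD imputation, invented scores), not the thesis implementation:

```python
import numpy as np

# Partially observed score matrix: row 0 is the new task, on which two
# algorithms have not yet been tried (NaN). All numbers are illustrative.
S_obs = np.array([
    [0.81, np.nan, np.nan, 0.62],
    [0.93, 0.96, 0.95, 0.91],
    [0.70, 0.65, 0.78, 0.58],
    [0.88, 0.90, 0.91, 0.80],
])

def rank1_impute(S):
    """Fill NaNs using the leading SVD component of a mean-imputed copy."""
    filled = np.where(np.isnan(S), np.nanmean(S), S)
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    approx = s[0] * np.outer(U[:, 0], Vt[0])
    return np.where(np.isnan(S), approx, S)

S_hat = rank1_impute(S_obs)
untried = np.isnan(S_obs[0])
# Among the untried algorithms on the new task, query the predicted best.
next_alg = int(np.argmax(np.where(untried, S_hat[0], -np.inf)))
```

Each query reveals one true entry, the completion is re-run, and the loop repeats until the evaluation budget is spent.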
The main results of my Ph.D. project are:
- HP/model selection: I have explored the Freeze-Thaw method and optimized the algorithm to enter the AutoML 2015-2016 challenge, achieving 3rd place in the final round.
- ActivMetaL: I have designed a new algorithm for active meta-learning and compared it with other baseline methods on real-world and artificial data. This study demonstrated that ActivMetaL is generally able to discover the best algorithm faster than baseline methods.
- REVEAL: I developed a new conceptualization of meta-learning as an MDP and placed it within the more general framework of REVEAL games. With a master's student intern, I developed agents that learn (with reinforcement learning) to predict the next best algorithm to try.
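A toy REVEAL-style game can be sketched as follows. Here an epsilon-greedy policy guided by per-algorithm averages on past tasks is an assumed stand-in for the learned RL agents, and all scores are synthetic:

```python
import numpy as np

# Toy REVEAL-style game (illustrative, not the thesis implementation):
# each step, the agent reveals one cell of the new task's hidden score row
# and is rewarded by the improvement of the best score found so far.
rng = np.random.default_rng(42)
S_past = rng.uniform(0.5, 1.0, size=(8, 5))  # fully known past tasks
s_new = rng.uniform(0.5, 1.0, size=5)        # hidden row for the new task

prior = S_past.mean(axis=0)    # average score of each algorithm so far
revealed = np.full(5, np.nan)
best_so_far, total_reward = 0.0, 0.0
for step in range(3):          # budget: only 3 of 5 algorithms may be tried
    candidates = np.where(np.isnan(revealed))[0]
    if rng.random() < 0.1:     # explore: try a random untried algorithm
        a = rng.choice(candidates)
    else:                      # exploit: try the best one under the prior
        a = candidates[np.argmax(prior[candidates])]
    revealed[a] = s_new[a]     # "revealing" = actually running algorithm a
    reward = max(0.0, revealed[a] - best_so_far)
    best_so_far = max(best_so_far, revealed[a])
    total_reward += reward
```

Because rewards are the positive increments of the best score, the return of an episode equals the best score found within the budget, which is what the agent learns to maximize.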
The work presented in my thesis is empirical in nature. Several real-world meta-datasets were used in this research, each corresponding to one score matrix S, along with artificial and semi-artificial meta-datasets. The results indicate that reinforcement learning is a viable approach to this problem, although much work remains to be done to optimize the algorithms so that they scale to larger meta-learning problems.

Ph.D. dissertations & Faculty habilitations
Question Answering is a discipline that lies at the intersection of natural language processing and information retrieval. The emergence of deep learning approaches in several fields of research, such as computer vision, natural language processing, and speech recognition, has led to the rise of end-to-end models. In the context of the GoASQ project, we investigate, compare, and combine different approaches for answering questions formulated in natural language over textual data, on open-domain and biomedical-domain data. The thesis work mainly focuses on 1) building models for small-scale and large-scale datasets, and 2) leveraging structured and semantic information in question answering models. Hybrid data in our research context is the fusion of knowledge from free text, ontologies, entity information, etc., applied to free-text question answering. The current state-of-the-art models for question answering are based on deep learning. To facilitate using them on small-scale, closed-domain datasets, we propose to use domain adaptation. We model the BIOASQ biomedical question answering dataset as two different QA tasks and show, by comparing experimental results, that the Open Domain Question Answering task suits it better than the Reading Comprehension task. We pre-train the Reading Comprehension model with different datasets to show the variability in performance when these models are adapted to the biomedical domain. We find that one particular dataset (SQuAD v2.0) performs best for single-dataset pre-training, while a combination of four Reading Comprehension datasets performs best for biomedical domain adaptation. We perform some of the above experiments using large-scale pre-trained language models such as BERT, fine-tuned to the question answering task. The performance varies based on the type of data used to pre-train BERT.
For BERT pre-training on the language modelling task, we find the biomedically trained BioBERT to be the best choice for biomedical QA. Since deep learning models tend to function in an end-to-end fashion, semantic and structured information coming from expert-annotated sources is not explicitly used. We highlight the necessity of using Lexical and Expected Answer Types in open-domain and biomedical-domain question answering through several verification experiments. These types are used to highlight entities in two QA tasks, which show improvements when using entity embeddings based on the answer-type annotations. We manually annotated an answer-variant dataset for BIOASQ and show the importance of learning a QA model with the answer variants present in the paragraphs. Our hypothesis is that the results obtained from deep learning models can be further improved using semantic features and collective features from different paragraphs for a question. We propose ranking models based on binary classification to better rank the Top-1 prediction among the Top-K predictions using these features, leading to a hybrid model that outperforms state-of-the-art results on several datasets. We experiment with several overall Open Domain Question Answering models on QA sub-task datasets built for Reading Comprehension and Answer Sentence Selection. We show the difference in performance when these are modelled as an overall QA task and highlight the wide gap that remains in building end-to-end models for the overall question answering task.
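The binary-classification reranking step can be sketched as below. The features (reader score, retriever score, answer-type match flag) and all data are invented for illustration; they are not the features or datasets used in the thesis:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Train a binary classifier on candidate-answer feature vectors, where
# label 1 marks a correct answer. Features here are hypothetical:
# [reader score, retriever score, answer-type match flag].
X_train = np.array([
    [0.9, 0.8, 1], [0.2, 0.4, 0], [0.6, 0.9, 1], [0.3, 0.2, 0],
    [0.8, 0.7, 1], [0.1, 0.3, 0], [0.7, 0.6, 1], [0.4, 0.1, 0],
])
y_train = np.array([1, 0, 1, 0, 1, 0, 1, 0])
reranker = LogisticRegression().fit(X_train, y_train)

# Top-K candidates for one question: rerank by P(correct), keep Top-1.
top_k = np.array([[0.5, 0.3, 0], [0.6, 0.8, 1], [0.4, 0.9, 1]])
probs = reranker.predict_proba(top_k)[:, 1]
top1 = int(np.argmax(probs))
```

The point of reranking is that a candidate other than the model's original Top-1 can win once auxiliary features such as answer-type agreement are taken into account.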

The original manuscript conceptualizes the recent rise of digital platforms along three main dimensions: their nature as coordination devices fueled by data, the ensuing transformations of labor, and the accompanying promises of societal innovation. The overall ambition is to unpack the coordination role of the platform and where it stands relative to the classical firm-market duality. It is also to understand precisely how platforms use data to do so, where they drive labor, and how they accommodate socially innovative projects. I extend this analysis to show continuity between today's society dominated by platforms and the "organizational society", claiming that platforms are organized structures that distribute resources, produce asymmetries of wealth and power, and push social innovation to the periphery of the system. I discuss the policy implications of these tendencies and propose avenues for follow-up research.