Antoine Cornuéjols / Research
My research is organized around two directions.
- The study of the fundamentals of learning.
- The search for methods and solutions to real-world
problems.
Accordingly, I propose research topics for PhD and Master of Science
theses (in French).
My HDR (Habilitation) (in French) gives an idea of my research interests over
the years.
1. Study of the fundamentals of learning
The current paradigm in machine learning is founded on statistical theory.
Mainly, it supposes that a learner selects, among a set of hypotheses, the one
that minimizes the error measured on the set of learning data. The question
is then: under which conditions does this empirical risk minimization principle
lead to a good hypothesis that will perform well on as yet unseen data?
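As a rough illustration of the principle described above, here is a minimal sketch of empirical risk minimization over a small finite hypothesis set. Everything in it is an assumption made for the example and does not come from the research described on this page: the hypotheses are simple threshold rules on [0, 1], the data are synthetic, and the names `empirical_risk`, `erm` and `draw_sample` are hypothetical.

```python
import random

def empirical_risk(threshold, sample):
    """Fraction of examples misclassified by the rule 'x >= threshold -> 1'."""
    errors = sum(1 for x, y in sample if (1 if x >= threshold else 0) != y)
    return errors / len(sample)

def erm(hypotheses, sample):
    """Return the hypothesis (here, a threshold) with the lowest empirical risk."""
    return min(hypotheses, key=lambda h: empirical_risk(h, sample))

def draw_sample(n, rng, true_threshold=0.6, noise=0.1):
    """Draw n i.i.d. examples: x uniform on [0, 1], label flipped with probability `noise`.

    The target threshold and the noise level are arbitrary illustrative choices.
    """
    sample = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x >= true_threshold else 0
        if rng.random() < noise:
            y = 1 - y
        sample.append((x, y))
    return sample

rng = random.Random(0)
hypotheses = [i / 20 for i in range(21)]  # thresholds 0.00, 0.05, ..., 1.00
train = draw_sample(50, rng)              # the learning data
test = draw_sample(5000, rng)             # stands in for "as yet unseen data"

best = erm(hypotheses, train)
train_risk = empirical_risk(best, train)  # the quantity that ERM minimizes
test_risk = empirical_risk(best, test)    # an estimate of the risk on new data
print(best, train_risk, test_risk)
```

The gap between `train_risk` and `test_risk` is what the question above is about: the statistical theory studies when a small empirical risk guarantees a small risk on new data drawn from the same distribution.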
The statistical theory of learning relies on the assumption that the examples
are independently and identically distributed. Furthermore, it supposes that
the sole performance measure is linked to prediction errors of the candidate
hypotheses. As a result of these assumptions, this paradigm ignores:
- the information that can be present in the sequence of the
data (no teacher displays his teaching material in random order and with no regard
to the time intervals between lessons)
- the structure of both the knowledge already acquired by the
learner and the target knowledge
- the fact that prediction error is only one facet of learning
performance. The understandability of the learned concepts, their degree
of (possible) integration with the knowledge already acquired, the resulting
increased readiness to solve new problems, all are important aspects
that should be part of a more realistic and useful measure of learning performance.
Therefore, my research activity is aimed at:
- A better understanding of sequencing effects in learning (the
fact that different orders of data presentation can lead to different learning
results)
- A better characterization of incremental learning.
In particular, I explore in which ways ideas from the study of dynamical
systems (non-Riemannian geometry) can be imported into the field of machine learning.
(For instance, I am currently studying
how one can obtain new learning capabilities by exploiting the non-commutative
property of general incremental learning, through the study of Lie brackets.)
- The search for new active learning principles (that are not
based exclusively on the importance sampling approach that pervades current
approaches)
- The search for the definition of richer performance criteria
that incorporate, in particular, the problem-solving activity of the learner
(a good learner does not only predict well; it is able to solve efficiently
a large class of problems and to adapt its problem-solving methods to new
situations)
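As a small illustration of the sequencing effects mentioned in the list above, the sketch below feeds the same three examples to an incremental learner in two different orders and obtains two different final hypotheses. The learner is a plain single-pass perceptron without bias, chosen only because it is easy to follow; it is an assumption of this example and not the method studied here. The data and the name `perceptron_pass` are likewise hypothetical.

```python
def perceptron_pass(examples):
    """Single pass of the perceptron rule over 2-D examples, starting from w = (0, 0).

    Each example is ((x1, x2), y) with y in {+1, -1}. The weights are updated
    only when the current example is not classified with a positive margin.
    """
    w = (0, 0)
    for (x1, x2), y in examples:
        margin = y * (w[0] * x1 + w[1] * x2)
        if margin <= 0:
            w = (w[0] + y * x1, w[1] + y * x2)
    return w

# Three toy examples (arbitrary illustrative values).
a = ((2, 0), +1)
b = ((1, 1), +1)
c = ((0, 1), -1)

w_forward = perceptron_pass([a, b, c])
w_backward = perceptron_pass([c, b, a])

print(w_forward)   # (2, -1)
print(w_backward)  # (1, 0)
```

The two runs see exactly the same data, yet the update triggered by one example depends on what was learned from the previous ones, so the updates do not commute and the order of presentation changes the result.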
Another direction of my research deals with a more refined study of induction
than the one based purely on the statistical theory of learning. In particular,
I am interested in a new paradigm where the properties of the exploration
strategy of the hypothesis space are taken into account. A recent line of work has indeed
exhibited the possible existence of phase transition phenomena that can dramatically
affect the exploration of the hypothesis space and lead to pathological results
in learning complex concepts (e.g. in inductive logic programming (ILP) and
in grammatical inference).
2. Study of real-world problems
It is both fruitful and pleasing to study real-world problems. It brings
interesting challenges to the current state of the art in machine learning,
while it is deeply satisfying to participate in the solution of difficult
and important problems.
Currently, I am involved in various projects, including:
- Bioinformatics.
- Especially the study of microarray data to discover
the genes that are involved in various biological processes (cancerous tumors,
weak radioactivity, ...)
- The search for the most promising molecules in the context of specific
pathologies
- Real-time and incremental learning for self-adaptive distributed
multi-agent systems.
- Learning for robotics
Slides of invited talks:
(Remark: these slides are generally in French.)
- Upcoming (January 5th, 2009, Séminaire de statistique): "Tracking,
transduction and Co." (Definition
of tracking. Reasons for its effectiveness. Optimizing its effectiveness, and
the notion of information)
Internship topics offered for 2006-2007:
(Note: you may also suggest topics to me if you think they
fit within the research themes presented.)
Past internship topics:
(Examples of topics offered in previous years.)