Ph.D
Group : Parallel Systems
Synchronization and Fault-tolerance in Distributed Algorithms
Started on 03/10/2011
Advisor : BEAUQUIER, Joffroy
[DELAET Sylvie]
Funding : DIGITEO doctoral contract
Affiliation : Université Paris-Saclay
Laboratory : LRI - Parallélisme
Defended on 24/09/2014, committee :
Thesis advisor :
- Joffroy Beauquier, Professor, Paris-Sud, LRI/ParSys
Co-advisor :
- Sylvie Delaët, Associate Professor (MdC HDR), Paris-Sud, LRI/Galac
Reviewers :
- Rachid Guerraoui, Professor, School of Computer and Communication Sciences (LPD), EPFL
- Luis Rodrigues, Professor, Departamento de Engenharia Informática, Universidade de Lisboa
Examiners :
- Christine Paulin, Professor, Université Paris-Sud, LRI/Vals
- Hugues Fauconnier, Associate Professor (MdC HDR), Paris VII, LIAFA/Algorithmique distribuée et graphes
Research activities :
Abstract :
In the first part of this thesis, we focus on a recent model, called population protocols, which describes large networks of tiny, wireless, mobile, anonymous agents with very limited resources. The harsh constraints of the original model make most of the classical problems of distributed algorithmics, such as data collection, consensus and leader election, either difficult to analyze or impossible to solve.
We first study the data collection problem, which mainly consists of transferring some values to a base station. Using a fairness assumption known as cover times, we compute tight bounds on the convergence time of concrete protocols. Next, we focus on the problems of consensus and leader election. It is shown that these problems are impossible to solve in the original model. To circumvent these impossibility results, we augment the original model with oracles and study their relative power. Along the way, we develop a formal framework general enough to encompass various sorts of oracles, as well as the relations between them.
In the second part of the thesis, we study the problem of state-machine replication in the more classical model of asynchronous message-passing communication. The Paxos algorithm is a famous (partial) solution to the state-machine replication problem that tolerates crash failures. Our contribution is an enhancement of Paxos that tolerates transient faults as well. In doing so, we define the notion of a practically self-stabilizing replicated state machine.