Research results
Ph.D de

Group : Large-scale Heterogeneous DAta and Knowledge

Efficient processing of SPARQL queries with OLAP extensions for RDF warehouses (original title: Traitement efficace de requêtes SPARQL avec extensions OLAP pour entrepôts RDF)

Starts on 01/09/2011
[GOASDOUE François]

Funding :
Affiliation : Université Paris-Sud
Laboratory : LRI

Defended on 22/09/2014, committee :
Thesis advisor:
- Ms. Ioana Manolescu, Research Director, Inria and Université Paris-Sud

Co-advisor:
- Mr. François Goasdoué, Professor, Université Rennes 1

Reviewers:
- Mr. Alon Halevy, Professor, Google Research
- Mr. Frank van Harmelen, Professor, Vrije Universiteit Amsterdam

Examiners:
- Mr. Serge Abiteboul, Research Director, Inria and ENS Cachan
- Ms. Christine Froidevaux, Professor, Université Paris-Sud
- Mr. Philippe Rigaux, Professor, Conservatoire National des Arts et Métiers

Research activities :

Abstract :
The utility and relevance of data lie in the information that can be extracted from it. The high rate of data publication and its increased complexity, for instance the heterogeneous, self-describing Semantic Web data, motivate the interest in efficient techniques for data manipulation. In this thesis we leverage mature relational data management technology for querying Semantic Web data.

The first part focuses on query answering over data subject to RDFS constraints, stored in relational data management systems. The implicit information resulting from RDF reasoning is required to correctly answer such queries. We introduce the database fragment of RDF, going beyond the expressive power of previously studied fragments. We devise novel techniques for answering Basic Graph Pattern queries within this fragment, exploring the two established approaches for handling RDF semantics, namely graph saturation and query reformulation.
In particular, we consider graph updates within each approach and propose a method for incrementally maintaining the saturation. We experimentally study the performance trade-offs of our techniques, which can be deployed on top of any relational data management engine.
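
To make the contrast between graph saturation and query reformulation concrete, here is a minimal sketch. It is not the thesis's algorithm or its database fragment: it assumes a toy in-memory graph of (subject, property, object) tuples, handles only the rdfs:subClassOf rules, and answers a single "instances of a class" query. All names (`saturate`, `instances_by_saturation`, `instances_by_reformulation`, the `ex:` resources) are hypothetical.

```python
# Illustrative only: toy RDFS saturation versus toy query reformulation.
# Triples are (subject, property, object) tuples; all names are hypothetical.

TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"


def saturate(graph):
    """Add implicit triples until a fixpoint is reached (subclass rules only)."""
    g = set(graph)
    while True:
        new = set()
        for (s, p, o) in g:
            if p == SUBCLASS:
                # Transitivity: s subClassOf o, o subClassOf o2 => s subClassOf o2
                new |= {(s, SUBCLASS, o2) for (s2, p2, o2) in g
                        if p2 == SUBCLASS and s2 == o}
                # Type propagation: x type s, s subClassOf o => x type o
                new |= {(x, TYPE, o) for (x, p2, c) in g
                        if p2 == TYPE and c == s}
        if new <= g:
            return g
        g |= new


def instances_by_saturation(graph, cls):
    """Materialize implicit triples first, then evaluate the plain query."""
    return {s for (s, p, o) in saturate(graph) if p == TYPE and o == cls}


def instances_by_reformulation(graph, cls):
    """Leave the data untouched; expand the query to cls and all its subclasses."""
    classes, frontier = {cls}, [cls]
    while frontier:
        c = frontier.pop()
        for (s, p, o) in graph:
            if p == SUBCLASS and o == c and s not in classes:
                classes.add(s)
                frontier.append(s)
    return {s for (s, p, o) in graph if p == TYPE and o in classes}


graph = {
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:PhDStudent", SUBCLASS, "ex:Student"),
    ("ex:alice", TYPE, "ex:PhDStudent"),
    ("ex:bob", TYPE, "ex:Person"),
}

by_saturation = instances_by_saturation(graph, "ex:Person")
by_reformulation = instances_by_reformulation(graph, "ex:Person")
```

Both strategies return the same answers on this graph. The trade-off the abstract alludes to is visible even here: saturation pays at load and update time, since the materialized triples must be maintained when the graph changes, whereas reformulation pays at query time through a larger query.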

The second part of this thesis considers the new requirements for data analytics tools and methods emerging from the development of the Semantic Web. We fully redesign, from the bottom up, core data analytics concepts and tools in the context of RDF data. We propose the first complete formal framework for warehouse-style RDF analytics. Notably, we define analytical schemas tailored to heterogeneous, semantic-rich RDF graphs, and analytical queries that (beyond relational cubes) allow flexible querying of the data and the schema as well as powerful aggregation and OLAP-style operations. Experiments on a fully implemented platform demonstrate the practical interest of our approach.
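
As a rough illustration of warehouse-style aggregation over RDF data, the sketch below groups resources by one property (the dimension) and counts the values of another (the measure). It is a toy example under stated assumptions, not the formal framework proposed in the thesis: the triples, the `ex:` property names, and the helpers `objects_by_subject` and `count_by_dimension` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical triples: bloggers, the city they live in, and posts they wrote.
triples = [
    ("ex:u1", "ex:livesIn", "ex:Paris"),
    ("ex:u2", "ex:livesIn", "ex:Paris"),
    ("ex:u3", "ex:livesIn", "ex:Rennes"),
    ("ex:u1", "ex:wrote", "ex:p1"),
    ("ex:u1", "ex:wrote", "ex:p2"),
    ("ex:u2", "ex:wrote", "ex:p3"),
    ("ex:u3", "ex:wrote", "ex:p4"),
    ("ex:u4", "ex:wrote", "ex:p5"),  # no city: RDF data is heterogeneous
]


def objects_by_subject(triples, prop):
    """Map each subject to the set of objects it has for the given property."""
    out = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            out[s].add(o)
    return out


def count_by_dimension(triples, dimension_prop, measure_prop):
    """Count measure values per dimension value (a one-dimensional cube)."""
    dims = objects_by_subject(triples, dimension_prop)
    measures = objects_by_subject(triples, measure_prop)
    cube = defaultdict(int)
    for fact, dim_values in dims.items():
        for d in dim_values:
            # A resource with several dimension values contributes to each of them.
            cube[d] += len(measures.get(fact, ()))
    return dict(cube)


posts_per_city = count_by_dimension(triples, "ex:livesIn", "ex:wrote")
```

Unlike a relational fact table, a resource here may lack a dimension value (`ex:u4` is simply left out of the cube) or carry several, which is one reason relational OLAP notions do not transfer to RDF unchanged.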
