Ph.D de

Group : Databases

Data-intensive interactive workflows for visual analytics

Starts on 01/10/2008
Advisor : BENZAKEN, Véronique

Funding : AM
Affiliation : Université Paris-Sud
Laboratory :

Defended on 12/12/2011, committee :
Advisors :
Véronique Benzaken, Professor, Université de Paris-Sud 11
Jean-Daniel Fekete, Research Director, INRIA Saclay-Île-de-France
Ioana Manolescu, Research Director, INRIA Saclay-Île-de-France

Dominique Laurent, Professor, Université de Cergy Pontoise
Guy Melançon, Professor, LaBRI, Université Bordeaux I

Alain Denise, Professor, Université de Paris-Sud 11
Thérèse Libourel, Professor, LIRMM, Université Montpellier II

Research activities :

Abstract :
The increasing amount of electronic data of all forms, produced by humans (e.g. Web pages, structured
content such as Wikipedia, or the blogosphere) and/or by automatic tools (loggers, sensors, Web services,
scientific programs, or analysis tools), creates unprecedented potential for extracting
new knowledge, finding new correlations, or simply making sense of the data.

Visual analytics aims at combining interactive data visualization with data analysis tasks. Given the explosion
in volume and complexity of scientific data, e.g. data associated with biological or physical processes or social
networks, visual analytics is expected to play an important role in scientific data management.

Most visual analytics platforms, however, are memory-based, and are therefore limited in the volume of data they can handle.
Moreover, each new algorithm (e.g. for clustering) must be integrated into the platform by hand.
Finally, they lack the capability to define and deploy well-structured processes in which users with different roles
interact in a coordinated way, sharing the same data and possibly the same visualizations.

This work lies at the convergence of three research areas: information visualization, database query processing
and optimization, and workflow modeling. It makes two main contributions: (i) we propose a generic architecture
for deploying a visual analytics platform on top of a database management system (DBMS); (ii) we show how to propagate
data changes to the DBMS and to the visualizations through the workflow process. Our approach has been implemented in a prototype
called EdiFlow and validated through several applications. It demonstrates that visual analytics applications can
benefit from the robust storage and automatic process deployment provided by the DBMS while obtaining good performance, and
it thus provides scalability. Conversely, it could also be integrated into a data-intensive scientific workflow platform in order
to extend its visualization features.

