Fatiha Saïs

Associate Professor (Maître de Conférences) - (LRI - Paris Sud University and CNRS)

Research interests (current and past)

  • Data and Schema Reconciliation
  • Semantic Services
  • Data Fusion
  • Reference Reconciliation
  • Semantic Annotation
  • Flexible Querying of Uncertain Data



Reference Fusion

People that are involved: Sébastien Destercke and Rallou Thomopoulos

In this topic, we address data fusion, which arises once reconciliations between references have been determined. The objective is to fuse the descriptions of references that refer to the same real-world entity so as to obtain a unique representation. To deal with uncertainty in the values associated with the attributes, we represent the results of reference fusion in a formalism based on fuzzy sets, and we indicate how the confidence degrees are computed. Finally, we propose a representation in Fuzzy RDF, together with its flexible querying by queries that express users' preferences.
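As a rough illustration of the idea, the following sketch fuses the descriptions of already-reconciled references into one description whose attribute values carry a degree in [0, 1]. The function name `fuse` and the way the degrees are computed (the frequency of a value divided by the frequency of the most frequent value, so that the best-supported value gets degree 1) are assumptions made for this example only; they are not the confidence computation used in our work.

```python
from collections import Counter


def fuse(references):
    """Fuse reconciled references into one fuzzy description.

    `references` is a list of dicts (attribute -> value) that are assumed
    to describe the same real-world entity. The result maps each attribute
    to a dict {value: degree}, where degree is in (0, 1].

    Assumption for illustration: the degree of a value is its frequency
    divided by the frequency of the most frequent value for that attribute.
    """
    counts_by_attr = {}
    for ref in references:
        for attr, value in ref.items():
            counts_by_attr.setdefault(attr, Counter())[value] += 1

    fused = {}
    for attr, counts in counts_by_attr.items():
        top = max(counts.values())
        fused[attr] = {value: count / top for value, count in counts.items()}
    return fused


# Hypothetical data: three references to the same museum.
museum = fuse([
    {"name": "Louvre", "city": "Paris"},
    {"name": "Louvre", "city": "Paris"},
    {"name": "Le Louvre"},
])
```

A flexible query could then rank the candidate values of an attribute by degree instead of requiring a single crisp value.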


Reference Reconciliation

People that are involved: Nathalie Pernelle and Marie-Christine Rousset

The reference reconciliation problem consists in deciding whether different identifiers refer to the same data, i.e., correspond to the same real-world entity. Our reconciliation system exploits the semantics of a rich data model (named RDFS+), which extends RDFS with a fragment of OWL-DL and SWRL rules.

First, we have studied an algorithm that infers sure reconciliations and non-reconciliations by exploiting the semantics of the RDFS+ schema. In our Logical method for Reference Reconciliation, the semantics of the schema is translated into a set of logical reconciliation rules, which are then used to infer sure decisions of both reconciliation and non-reconciliation. In contrast with other approaches, the logical method has a precision of 100% by construction. First experiments show promising results for recall and, most importantly, a significant increase in recall when rules are added. This shows the value of the generic and flexible approach of our logical method, since it is quite easy to add rules that express constraints on the domain of interest. The method is based on W3C specifications for the Semantic Web (RDF, OWL-DL and SWRL). Therefore, it can be used for reconciling data in most applications based on Semantic Web technologies.
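The kind of inference involved can be sketched as follows. This is a simplified illustration, not the actual rule set derived from an RDFS+ schema: the function `infer_decisions`, the two rule types, and the exact-match comparison of values are assumptions chosen for the example. A key-like attribute yields a sure reconciliation when two references share its value, and a functional attribute yields a sure non-reconciliation when two references disagree on it; all other pairs remain undecided.

```python
from itertools import combinations


def infer_decisions(refs, key_attrs, functional_attrs):
    """Infer sure reconciliations and non-reconciliations.

    `refs` maps a reference identifier to a dict (attribute -> value).
    Returns two sets of identifier pairs: (same, different).
    Pairs in neither set are left undecided.
    """
    same, different = set(), set()
    for a, b in combinations(sorted(refs), 2):
        ra, rb = refs[a], refs[b]
        # Rule 1 (key-like attribute): equal values imply the same entity.
        if any(k in ra and k in rb and ra[k] == rb[k] for k in key_attrs):
            same.add((a, b))
        # Rule 2 (functional attribute): distinct values imply distinct entities.
        elif any(f in ra and f in rb and ra[f] != rb[f] for f in functional_attrs):
            different.add((a, b))
    return same, different


# Hypothetical references and attribute names.
refs = {
    "m1": {"inventory_no": "A-17", "city": "Paris"},
    "m2": {"inventory_no": "A-17"},
    "m3": {"city": "Rome"},
}
same, different = infer_decisions(
    refs, key_attrs={"inventory_no"}, functional_attrs={"city"}
)
```

Here `m1` and `m2` are reconciled, `m1` and `m3` are declared distinct, and the pair (`m2`, `m3`) stays undecided because no rule applies. Adding a rule can only turn undecided pairs into decisions, which is how additional rules improve recall.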

Second, we have studied an algorithm that obtains possible reconciliations by computing similarity scores between pairs of references. This method uses numerical techniques such as string distance measures [W. Cohen 2003]. The resulting Numerical method for Reference Reconciliation has been implemented and evaluated on real data sets.
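As a minimal sketch of such a score, the example below compares two references by averaging a normalized edit-distance similarity over the attributes they share. The choice of Levenshtein distance, the unweighted average, and the function names are assumptions for illustration; the actual method may combine other measures and weights.

```python
def levenshtein(s, t):
    """Edit distance between strings s and t (insert, delete, substitute)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def string_similarity(s, t):
    """Similarity in [0, 1]: 1 for identical strings, 0 for maximal distance."""
    if not s and not t:
        return 1.0
    return 1.0 - levenshtein(s, t) / max(len(s), len(t))


def reference_similarity(ref_a, ref_b):
    """Average similarity over shared attributes (case-insensitive values)."""
    shared = sorted(set(ref_a) & set(ref_b))
    if not shared:
        return 0.0
    total = sum(
        string_similarity(ref_a[k].lower(), ref_b[k].lower()) for k in shared
    )
    return total / len(shared)
```

Pairs whose score exceeds a chosen threshold would be proposed as possible reconciliations; the threshold itself has to be tuned on the data.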



Semantic Annotation

People that are involved: Hélène Gagliardi, Ollivier Haemmerlé, Nathalie Pernelle and Marie-Christine Rousset

This work aims to automatically build a thematic data warehouse composed of heterogeneous XML documents extracted from the Web. We focus on the data tables contained in these documents. We have proposed an automatic ontology-based approach to enrich structured information semantically. To represent the result of the semantic enrichment, we have defined SML (Semantic Markup Language), an XML format in which the majority of tags and values come from the domain ontology.

This work has been tested on real data in the e.dot project and will be integrated into the software platform under development in the WebContent project.


Figure: Global document-processing chain, from acquisition to semantic enrichment