Entity Resolution for Big Data
Themis Palpanas

19 December 2012, 11h00 - 13h00
Room/Building: 445/PCRI-N
Contact : jesus.camacho-rodriguez@lri.fr

Research activities:

Abstract:
The volume of highly heterogeneous data has grown rapidly over the last decade, largely because its production is so widely distributed: corporations of any size, individual users, and automatic extraction tools all contribute a constantly increasing volume of heterogeneous and noisy information. Entity Resolution (ER) helps to reduce the resulting entropy by identifying the pieces of information that refer to the same real-world objects.

Typically, blocking techniques are used to scale ER to large volumes of data. However, most of these techniques rely on schema information and are inapplicable to highly heterogeneous settings. Our work goes beyond existing blocking techniques, by introducing a novel methodology that is inherently crafted for voluminous, highly heterogeneous, and noisy data collections.

At the core of our approach lie three independent but complementary steps: block building (using redundant block assignments for effectiveness), meta-blocking (reducing the number of necessary blocks), and block processing (increasing the efficiency of ER operations). Our experimental evaluation with three large-scale, real-world datasets demonstrates that our methodology can successfully handle very large and highly heterogeneous datasets, achieving an excellent balance between effectiveness and efficiency.
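
The three steps above can be illustrated with a minimal sketch. It is not the speaker's implementation: it assumes token blocking as the schema-agnostic block-building step, a simple "number of shared blocks" weight with mean-weight pruning as a stand-in for meta-blocking (here it prunes candidate comparisons rather than whole blocks), and Jaccard similarity for block processing. The toy profiles, the function names, and the 0.5 threshold are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def tokens(profile):
    """Lowercase tokens of all attribute values; attribute names are ignored."""
    return {tok for value in profile.values() for tok in str(value).lower().split()}


def build_blocks(profiles):
    """Block building: one block per token, so a profile lands in several blocks."""
    blocks = defaultdict(set)
    for pid, profile in profiles.items():
        for tok in tokens(profile):
            blocks[tok].add(pid)
    # A block holding a single profile yields no comparison, so drop it.
    return {tok: ids for tok, ids in blocks.items() if len(ids) > 1}


def meta_block(blocks):
    """Weight each candidate pair by its number of shared blocks, then keep
    only pairs whose weight reaches the mean weight (assumed pruning rule)."""
    weights = defaultdict(int)
    for ids in blocks.values():
        for a, b in combinations(sorted(ids), 2):
            weights[(a, b)] += 1
    if not weights:
        return set()
    mean = sum(weights.values()) / len(weights)
    return {pair for pair, w in weights.items() if w >= mean}


def resolve(profiles, pairs, threshold=0.5):
    """Block processing: compare only the retained pairs (Jaccard on tokens)."""
    matches = set()
    for a, b in pairs:
        ta, tb = tokens(profiles[a]), tokens(profiles[b])
        if len(ta & tb) / len(ta | tb) >= threshold:
            matches.add((a, b))
    return matches


# Hypothetical profiles with no shared schema (attribute names all differ).
profiles = {
    "p1": {"name": "Jane Smith", "city": "Paris"},
    "p2": {"fullName": "jane smith", "location": "paris france"},
    "p3": {"title": "John Doe", "town": "Paris"},
    "p4": {"label": "john doe", "addr": "Berlin"},
}

blocks = build_blocks(profiles)
candidates = meta_block(blocks)
matches = resolve(profiles, candidates)
print(sorted(matches))
```

In this toy run the "paris" block alone would suggest comparing p1 and p2 with p3, but those pairs share only one block and fall below the mean weight, so they are pruned before any similarity is computed.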

Further information:
Seminars
Knowledge Graph Refinement based on Triplet BERT-N
Web data management
Monday 29 November 2021 - 13h00
Room: 455 - PCRI-N
Armita Khajeh Nassiri

A Hyper-graph Approach for Computing EL+-Ontology
Automated reasoning
Monday 15 November 2021 - 13h00
Room: 445 - PCRI-N
Hui Yang

Semantic approaches to predict the presence of asb
Data and knowledge integration
Monday 08 November 2021 - 13h00
Room: 455 - PCRI-N
Thamer Mecharnia

Pierre Andrieu - Agrégation de classements pour le
Thursday 21 October 2021 - 00h00
Room: 435 - PCRI-N

A counting argument for graph colouring
Graph theory
Friday 08 October 2021 - 11h00
Room: 445 - PCRI-N
Francois Pirot