Room 445 - Resolving Entities in the Web of Data
Vassilis Christophides

05 May 2017, 16:30

Research area: Integration of Data and Knowledge

Abstract:
Over the past decade, numerous knowledge bases (KBs) have
been built to power a new generation of Web applications that provide
entity-centric search and recommendation services. These KBs offer
comprehensive, machine-readable descriptions of a large variety of
real-world entities (e.g., persons, places, products, events) published
on the Web as Linked Data (LD). Even when derived from the same data
source (e.g., a Wikipedia entry), KBs such as DBpedia, YAGO2, or
Freebase may provide multiple, non-identical descriptions for the same
real-world entities. This is due to the different information extraction
tools and curation policies employed by KBs, resulting in complementary
and sometimes conflicting entity descriptions. Entity resolution (ER)
aims to identify different descriptions that refer to the same
real-world entity, and emerges as a central data-processing task for an
entity-centric organization of Web data. ER is needed to enrich the
interlinking of data elements describing entities, even by third
parties, so that the Web of Data can be accessed by machines as a
global data space using standard languages, such as SPARQL. ER can also
facilitate automated KB construction by integrating entity
descriptions from legacy KBs with Web content published as HTML documents.
ER has attracted significant attention from many researchers in
the information systems, database, and machine-learning communities. The
objective of this lecture is to present the new ER challenges stemming
from the openness of the Web, where an unbounded number of KBs describe a
multitude of entity types across domains, as well as from the high
heterogeneity (semantic and structural) of descriptions, even for the
same types of entities. The scale, diversity and graph structure of
entity descriptions published according to the LD paradigm challenge the
core ER tasks, namely, (i) how descriptions can be effectively compared
for similarity and (ii) how resolution algorithms can efficiently filter
the candidate pairs of descriptions that need to be compared.
In multi-type, large-scale entity resolution, we need to examine
whether two entity descriptions are somewhat (or nearly) similar without
resorting to domain-specific similarity functions and/or mapping rules.
Furthermore, the resolution of some entity descriptions might influence
the resolution of other, neighbouring descriptions. This setting clearly
goes beyond deduplication (or record linkage) of collections of
descriptions that usually refer to a single entity type and differ only
slightly in their attribute values. It essentially requires
leveraging the similarity of descriptions in both their content and
structure. It also forces us to revisit traditional ER workflows
consisting of separate indexing (for pruning the number of candidate
pairs) and matching (for resolving entity descriptions) phases.
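
As an illustration only, the traditional two-phase workflow mentioned above (indexing to prune candidate pairs, then matching) can be sketched in a schema-agnostic way. The sketch below assumes token blocking for the indexing phase and Jaccard similarity over token sets for the matching phase; these choices, the function names, the 0.5 threshold, and the sample descriptions are hypothetical and are not taken from the lecture. It does not model the influence of neighbouring descriptions.

```python
from collections import defaultdict
from itertools import combinations


def tokens(description):
    """Schema-agnostic token set: lowercase words from all attribute values."""
    return {tok for value in description.values()
            for tok in str(value).lower().split()}


def token_blocking(descriptions):
    """Indexing phase: descriptions sharing at least one token become candidates."""
    blocks = defaultdict(set)
    for key, desc in descriptions.items():
        for tok in tokens(desc):
            blocks[tok].add(key)
    candidates = set()
    for ids in blocks.values():
        # Sorting gives each pair a canonical order, so duplicates collapse.
        candidates.update(combinations(sorted(ids), 2))
    return candidates


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def resolve(descriptions, threshold=0.5):
    """Matching phase: keep candidate pairs whose similarity reaches the threshold."""
    token_sets = {key: tokens(desc) for key, desc in descriptions.items()}
    return sorted(
        pair for pair in token_blocking(descriptions)
        if jaccard(token_sets[pair[0]], token_sets[pair[1]]) >= threshold
    )


# Hypothetical descriptions from two KBs with different attribute names.
descriptions = {
    "kb1:Athens": {"label": "Athens", "country": "Greece"},
    "kb2:Athens_Greece": {"name": "Athens", "locatedIn": "Greece", "type": "city"},
    "kb1:Paris": {"label": "Paris", "country": "France"},
}
matches = resolve(descriptions)
```

Because only attribute values are tokenised, the two Athens descriptions are compared even though their attribute names differ, while the Paris description shares no token with them and is never paired with either.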

Seminars
Room 465 - Direct-Coupling Analysis of nucleotide
Thursday 18 May 2017 - 16:00
Martin Weigt

2017-04-28
Graph Theory
Friday 28 April 2017 - 14:30
Evelyne Flandrin

Room 435 - Recommandation de Contenu dans les Pla
Web data management
Friday 28 April 2017 - 14:00
Cédric Du Mouza

Room 445 - Exemplar queries on documents: Finding
Web data management
Monday 24 April 2017 - 11:00
Yannis Velegrakis