Team seminar: Large-scale Heterogeneous DAta and Knowledge

Refining Transitive and Pseudo-Transitive Relations at Web Scale
Shuai Wang

24 January 2022, 13:00 Room/Building: 455/PCRI-N
Contact:

Research activities: Web data management

Abstract:

Publishing knowledge graphs on the Web as RDF datasets and subsequently integrating them are both essential to the idea of Linked Open Data.
Combining such knowledge graphs can result in undesirable graph structures and even in logical inconsistencies.
Refinement methods that can detect and repair such undesirable graph structures are therefore of crucial importance.
Existing refinement methods for knowledge graphs are often domain-specific, are limited to single relations (e.g. owl:sameAs), or are limited in scale.
We present a challenge comprising several datasets of transitive and pseudo-transitive relations, hand-labeled gold standards, and baselines.
We introduce an efficient web-scale knowledge graph refinement algorithm that works for such relations. Our algorithm analyses the graph structure and allows the use of weighting schemes to heuristically determine which possibly erroneous edges should be removed to make the graph cycle-free. Compared with general-purpose graph algorithms that perform the same task, our algorithm removes the fewest edges to make the graph of transitive relations cycle-free, while achieving higher precision in identifying erroneous edges, as measured against a hand-labeled gold standard.
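
The abstract does not describe the algorithm in detail, so the following is only a minimal Python sketch of the general idea of weight-guided cycle removal, not the method presented in the talk. It assumes a simple greedy strategy: repeatedly find one directed cycle and delete its lowest-weight edge. The function names (find_cycle, remove_cycles), the weighting scheme, and the example triples are all hypothetical, and this greedy approach does not guarantee a minimum number of removed edges.

```python
from typing import Dict, Hashable, List, Optional, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def find_cycle(edges: Set[Edge]) -> Optional[List[Edge]]:
    """Return the edges of one directed cycle, or None if the graph is acyclic."""
    adjacency: Dict[Hashable, List[Hashable]] = {}
    for source, target in sorted(edges, key=repr):  # sorted for determinism
        adjacency.setdefault(source, []).append(target)
        adjacency.setdefault(target, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in adjacency}

    for root in adjacency:
        if colour[root] != WHITE:
            continue
        # Iterative depth-first search; `path` holds the current grey nodes.
        path = [root]
        iterators = [iter(adjacency[root])]
        colour[root] = GREY
        while path:
            node = path[-1]
            advanced = False
            for successor in iterators[-1]:
                if colour[successor] == GREY:
                    # Back edge: the cycle runs from `successor` to `node`.
                    start = path.index(successor)
                    cycle_nodes = path[start:] + [successor]
                    return list(zip(cycle_nodes, cycle_nodes[1:]))
                if colour[successor] == WHITE:
                    colour[successor] = GREY
                    path.append(successor)
                    iterators.append(iter(adjacency[successor]))
                    advanced = True
                    break
            if not advanced:
                colour[node] = BLACK
                path.pop()
                iterators.pop()
    return None


def remove_cycles(
    edges: Set[Edge],
    weights: Dict[Edge, float],
    default_weight: float = 1.0,
) -> Tuple[Set[Edge], List[Edge]]:
    """Greedily delete the lowest-weight edge of each cycle found.

    A low weight is assumed to mean "more likely erroneous". Returns the
    remaining (acyclic) edge set and the removed edges in removal order.
    """
    remaining = set(edges)
    removed: List[Edge] = []
    while True:
        cycle = find_cycle(remaining)
        if cycle is None:
            return remaining, removed
        weakest = min(cycle, key=lambda edge: weights.get(edge, default_weight))
        remaining.remove(weakest)
        removed.append(weakest)


# Made-up subclass-style statements; the last one closes a cycle.
example_edges = {("Cat", "Mammal"), ("Mammal", "Animal"), ("Animal", "Cat")}
# Hypothetical weighting scheme: the suspicious edge gets a low score.
example_weights = {("Animal", "Cat"): 0.1}

kept, removed = remove_cycles(example_edges, example_weights)
print("kept:", sorted(kept))
print("removed:", removed)
```

In this sketch the weights play the role of the heuristic: any scoring function that ranks edges by plausibility could be swapped in without changing the cycle-detection part.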