Team seminar: BD
Processing XML Queries and Updates on Map/Reduce Clusters
Dario Colazzo

19 April 2013, 14:30 - 16:00
Room/Building: 435/PCRI-N
Contact: dario.colazzo@lri.fr

Abstract:
Very large XML documents are generated and processed in several contexts, in particular those involving scientific data and logs. To process such large documents, we have designed and implemented data-partitioning techniques for evaluating XQuery queries and updates on Map/Reduce clusters.

The proposed technique applies when queries and updates are iterative, i.e., they iterate the same query/update operations on a sequence of subtrees of the input document. We have developed schema-less, static analysis techniques to i) recognize iterative queries/updates, and ii) extract path information to be used for data partitioning purposes. Our system exploits both dynamic and static data partitioning to distribute the processing load among the machines of a Map/Reduce cluster. To boost the I/O performance across the distributed file system, our system uses EXI compression at each stage of the computation, from data partitioning to query/update execution.
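As a rough illustration of what "iterative" means here, the following minimal sketch partitions a small document along a path, applies the same per-subtree query to each partition (the map step), and merges the partial results (the reduce step). It is not the system presented in the talk: it runs in a single Python process, uses a simple static round-robin partitioning, and omits EXI compression and static analysis entirely. The sample document, the path `entry`, and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from functools import reduce

# Hypothetical log document; each <entry> subtree can be processed independently.
DOC = """<log>
  <entry><level>ERROR</level><msg>disk full</msg></entry>
  <entry><level>INFO</level><msg>started</msg></entry>
  <entry><level>ERROR</level><msg>timeout</msg></entry>
  <entry><level>WARN</level><msg>slow</msg></entry>
</log>"""


def partition(xml_text, path, n_parts):
    """Distribute the subtrees selected by `path` round-robin over n_parts chunks."""
    root = ET.fromstring(xml_text)
    parts = [[] for _ in range(n_parts)]
    for i, subtree in enumerate(root.findall(path)):
        # Serialize each subtree so a chunk could be shipped to another machine.
        parts[i % n_parts].append(ET.tostring(subtree, encoding="unicode"))
    return parts


def map_part(part):
    """Map step: run the same query (count entries per level) on every subtree."""
    counts = Counter()
    for fragment in part:
        entry = ET.fromstring(fragment)
        counts[entry.findtext("level")] += 1
    return counts


def reduce_counts(a, b):
    """Reduce step: merge two partial results."""
    return a + b


def run(xml_text, path="entry", n_parts=2):
    parts = partition(xml_text, path, n_parts)
    return reduce(reduce_counts, map(map_part, parts), Counter())


if __name__ == "__main__":
    print(dict(run(DOC)))
```

Because the query touches only one `entry` subtree at a time, the result does not depend on how the subtrees are distributed across partitions, which is the property that makes this kind of partitioning safe.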

After an introduction to the main techniques behind our system, a demonstration will show its abilities in dealing with complex workloads and large documents.
