Team seminar: Artificial Intelligence and Inference Systems
Jorge Quiane: Managing Very Large Datasets in a Cloudy World


25 January 2012, 15:45 - 16:45
Room/Building: 445/PCRI-N

Abstract:
Nowadays, many enterprises and organizations face large volumes of data that must be analyzed on a daily basis. In particular, scientific datasets are growing at unprecedented rates and are likely to keep growing to the order of exabytes. These data management needs require applications to run over a large number of computing nodes. However, database management systems (DBMSs) have proven inefficient at handling very large datasets and at scaling out to a large number of computing nodes. In this context, MapReduce and cloud computing are two alternative technologies that respond to this challenge. While MapReduce allows enterprises, organizations, and researchers to easily process very large volumes of data, the Cloud provides the computing infrastructure required to scale applications out to a large number of computing nodes. The appeal of these approaches lies in their ease of use and their almost admin-free operation. However, this simplicity comes at a price: the performance of MapReduce applications in the Cloud often does not match that of a well-configured parallel DBMS. In this talk, we present some of the main features that allow DBMSs to achieve orders of magnitude better performance than MapReduce applications. Then, we analyze how our Hadoop++ project allows MapReduce applications to match DBMS performance in the Cloud. We also discuss the design choices we made in the Hadoop++ project to preserve the ease of use and the almost admin-free operation of MapReduce applications in the Cloud. Finally, we conclude by discussing some of the challenges the Cloud poses for efficient data management.
