IASI team seminar
Jorge Quiane: Managing Very Large Datasets in a Cloudy World


25 January 2012, 15h45 - 16h45
Room/Building: 445/PCRI-N

Abstract:
Nowadays, many enterprises and organizations face large volumes of data that have to be analyzed on a daily basis. In particular, scientific datasets are growing at unprecedented rates and are likely to continue growing to the order of exabytes. These data management needs require applications to run over a large number of computing nodes. However, database management systems (DBMS) have proven inefficient at handling very large datasets and at scaling out to a large number of computing nodes. In this context, MapReduce and Cloud computing are two alternative technologies that address this challenge. While MapReduce allows enterprises, organizations, and researchers to easily process very large volumes of data, the Cloud provides the computing infrastructure required to scale applications out to a large number of computing nodes. The beauty of these approaches is their ease of use and almost-zero administration cost. However, this simplicity comes at a price: the performance of MapReduce applications in the Cloud often does not match that of a well-configured parallel DBMS. In this talk, we present some of the main features that allow DBMS to achieve orders of magnitude better performance than MapReduce applications. Then, we analyze how our Hadoop++ project allows MapReduce applications to match DBMS performance in the Cloud. We also discuss the design choices we made in the Hadoop++ project in order to preserve the ease of use and the almost-zero administration cost of MapReduce applications in the Cloud. Finally, we conclude the talk by discussing some of the challenges the Cloud poses for efficient data management.
