I. Masliah, A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, J. Dongarra
Algorithms and optimization techniques for high-performance matrix-matrix multiplications of very small matrices.
To appear in Parallel Computing (2018), PDF File.
Gary W. Howell and Marc Baboulin Iterative Solution of Sparse Linear Least Squares using LU Factorization.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2018). ACM digital library, pp 47-53 (2018),
PDF File.
C. Allouche, M. Baboulin, T. Goubault de Brugière, B. Valiron Reuse method for quantum circuit synthesis.
Proceedings of the International Conference of Applied Mathematics, Modeling and Computational Science (AMMCS 2017), PDF File.
Evan Coleman, Aygul Jamal, M. Baboulin, Amal Khabou, Masha Sosonkina A Comparison of Soft-Fault Error Models in the Parallel Preconditioned Flexible GMRES. Proceedings of the 12th International Conference on Parallel Processing and Applied Mathematics (PPAM 2017).
To appear in Lecture Notes in Computer Science, Springer-Verlag (2017),
PDF File.
M. Baboulin, J. Dongarra, A. Rémy, S. Tomov, I. Yamazaki Solving Dense Symmetric Indefinite Systems using GPUs. Concurrency and Computation: Practice and Experience, Vol. 29, No 9 (2017),
PDF File.
H. Anzt, M. Baboulin, J. Dongarra, Y. Fournier, F. Hulsemann, A. Khabou, Y. Wang Accelerating the conjugate gradient algorithm with GPU in CFD Simulations.
Proceedings of the International Conference on Vector and Parallel Processing (VecPar 2016).
Lecture Notes in Computer Science, Springer-Verlag, Vol. 10150, pp. 35-43 (2016),
PDF File.
I. Masliah, M. Baboulin, J. Falcou Meta-programming and multi-stage programming for GPGPUs. Proceedings of the 10th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSOC 2016).
IEEE Xplore Digital Library, pp. 369-376 (2016),
PDF File.
A. Jamal, M. Baboulin, A. Khabou, M. Sosonkina A hybrid CPU/GPU approach for the parallel algebraic recursive multilevel solver pARMS. Proceedings of the 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC 2016).
IEEE Xplore Digital Library, pp. 411-416 (2016),
PDF File.
I. Masliah, A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, J. Dongarra
High-Performance Matrix-Matrix Multiplications of Very Small Matrices.
Proceedings of Euro-Par 2016.
Lecture Notes in Computer Science, Springer-Verlag, Vol. 9833, pp. 659-671 (08/2016), PDF File.
A. Abdelfattah, M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, S. Tomov
High-Performance Tensor Contractions for GPUs. Proceedings of the International Conference on Computational Science, ICCS 2016. Procedia Computer Science, Elsevier, Vol. 80, pp. 108-118 (06/2016),
PDF File.
M. Baboulin, J. Dongarra, A. Rémy, S. Tomov, I. Yamazaki Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures. Proceedings of the 11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015).
Lecture Notes in Computer Science, Springer-Verlag, Vol. 9573, pp. 86-95 (2016),
PDF File.
G. W. Howell, M. Baboulin LU Preconditioning for Overdetermined Sparse Least Squares Problems. Proceedings of the 11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015).
Lecture Notes in Computer Science, Springer-Verlag, Vol. 9573, pp. 128-137 (2016),
PDF File.
M. Baboulin, A. Jamal, M. Sosonkina Using Random Butterfly Transformations in Parallel Schur Complement-Based Preconditioning. Proceedings of the 2015 Federated Conference on Computer Science and Information Systems (FedCSIS 2015).
Vol. 5, pp. 649-654 (2015),
PDF File.
M. Baboulin, A. Khabou, A. Rémy A randomized LU-based solver using GPU and Intel Xeon Phi accelerators. Proceedings of the Euro-Par 2015 workshop
"HeteroPar - Algorithms, Models, and Tools for Parallel Computing on Heterogeneous Platforms". Lecture Notes in Computer Science, Springer-Verlag, Vol. 9523, pp. 175-184 (2015),
PDF File.
I. Masliah, M. Baboulin, J. Falcou Metaprogramming dense linear algebra solvers.
Applications to multi and many-core architectures. Proceedings of the 13th IEEE International Symposium on Parallel and Distributed Processing with Applications (IEEE ISPA-15).
IEEE Xplore Digital Library, Vol. 3, pp. 69-76 (2015),
PDF File.
M. Baboulin, J. Dongarra, R. Lacroix Computing least squares condition numbers on hybrid multicore/GPU systems.
Interdisciplinary Topics in Applied Mathematics, Modeling and Computational Science, Vol. 117, pp. 35-41 (2015),
PDF File.
M. Baboulin, X. S. Li, F-H. Rouet Using Random Butterfly Transformations to Avoid Pivoting in Sparse Direct Methods.
Proceedings of the International Conference on Vector and Parallel Processing (VecPar 2014).
Lecture Notes in Computer Science, Springer-Verlag, Vol. 8969, pp. 135-144 (2014),
PDF File.
G. Fursin, R. Miceli, A. Lokhmotov, M. Gerndt, M. Baboulin, A. Malony, Z. Chamski, D. Novillo, D. Del Vento Collective mind: Towards practical and collaborative auto-tuning. Scientific Programming, IOS Press, Vol. 22, No 4, pp. 309-329 (2014),
PDF File.
M. Baboulin, D. Becker, G. Bosilca, A. Danalis, J. Dongarra An efficient distributed randomized algorithm for solving large dense symmetric indefinite linear systems. Parallel Computing, Vol. 40, No 7, pp. 213-223 (2014),
PDF File.
Y. Wang, M. Baboulin, K. Rupp, O. Le Maître, Y. Fraigneau Solving 3D incompressible Navier-Stokes equations on hybrid CPU/GPU systems. Proceedings of the 22nd High Performance Computing Symposium (HPC'14). ACM digital library, article 12 (2014),
PDF File.
Adrien Rémy, M. Baboulin, M. Sosonkina, B. Rozoy Locality optimization on a NUMA architecture for hybrid LU factorization.
Proceedings of the International Conference on Parallel Computing, PARCO 2013. Advances in Parallel Computing, IOS Press, Vol. 25, pp. 153-162 (2014),
PDF File.
M. Baboulin, S. Gratton, R. Lacroix, A. J. Laub Statistical estimates for the conditioning of linear least squares problems. Proceedings of the 10th International Conference on Parallel Processing and Applied Mathematics, PPAM 2013.
Lecture Notes in Computer Science, Springer-Verlag, Vol. 8384, pp. 124-133 (2014),
PDF File.
Y. Wang, M. Baboulin, J. Dongarra, J. Falcou, Y. Fraigneau, O. Le Maître A parallel solver for incompressible fluid flows. Proceedings of the International Conference on Computational Science, ICCS 2013. Procedia Computer Science, Elsevier, Vol. 18, pp. 439-448 (06/2013),
PDF File.
M. Baboulin, J. Dongarra, J. Herrmann, S. Tomov Accelerating linear system solutions using randomization techniques. ACM Transactions on Mathematical Software (TOMS), Vol. 39, No 2 (2013),
PDF File.
M. Baboulin, S. Donfack, J. Dongarra, L. Grigori, A. Rémy, S. Tomov A class of communication-avoiding algorithms for solving general dense linear systems on CPU/GPU parallel machines.
Inria Research Report 7854
(02/2012). Proceedings of the International Conference on Computational Science, ICCS 2012.
Procedia Computer Science, Elsevier, Vol. 9, pp. 17-26 (2012),
PDF File.
M. Baboulin, D. Becker, J. Dongarra A parallel tiled solver for dense symmetric indefinite systems on multicore architectures.
Inria Research Report 7762
(12/2011),
also appeared as LAPACK Working Note 261. Proceedings of IEEE International Parallel & Distributed Processing Symposium, IPDPS 2012,
PDF File.
D. Becker, M. Baboulin, J. Dongarra Reducing the amount of pivoting in symmetric indefinite systems.
Inria Research Report 7621
(05/2011),
University of Tennessee Technical Report ICL-UT-11-06. Proceedings of the 9th International Conference on Parallel Processing and Applied Mathematics, PPAM 2011.
Lecture Notes in Computer Science, Springer-Verlag, Vol. 7203, pp. 133-142 (2012),
PDF File.
M. Baboulin, S. Gratton A contribution to the conditioning of the total least squares problem.
Inria Research Report 7488
(12/2010),
also appeared as LAPACK Working Note 236. SIAM Journal on Matrix Analysis and Applications, Vol. 32, No 3, pp. 685-699 (2011),
PDF File.
S. Tomov, J. Dongarra, M. Baboulin Towards dense linear algebra for hybrid GPU accelerated manycore systems. Parallel Computing, Vol. 36, No 5&6, pp. 232-240 (2010),
PDF File.
M. Baboulin, A. Buttari, J. Dongarra, J. Kurzak, J. Langou, J. Langou,
P. Luszczek, S. Tomov Accelerating scientific computations with mixed precision algorithms. Computer Physics Communications, Vol. 180, No 12, pp. 2526-2533 (2009),
PDF File.
M. Baboulin, J. Dongarra, S. Gratton, J. Langou Computing the conditioning of the components of a linear least squares solution. Numerical Linear Algebra with Applications, Vol. 16, No 7, pp. 517-533 (2009), PDF File.
M. Baboulin, S. Gratton Using dual techniques to derive componentwise and mixed condition numbers
for a linear function of a linear least squares solution. BIT Numerical Mathematics, Vol. 49, No 1, pp. 3-19 (2009),
PDF File.
M. Baboulin, J. Dongarra, S. Tomov Some issues in dense linear algebra for multicore and special
purpose architectures. Proceedings of the 9th International Workshop on State-of-the-Art
in Scientific and Parallel Computing (PARA'08).
Lecture Notes in Computer Science, Vol. 6126-6127, Springer-Verlag (2008),
PDF File.
M. Baboulin, L. Giraud, S. Gratton, J. Langou Parallel tools for solving incremental dense least squares problems. Application to space geodesy. Journal of Algorithms and Computational Technology, Vol. 3, No 1, pp. 117-133 (2009),
PDF File.
M. Baboulin, L. Giraud, S. Gratton, J. Langou A distributed packed storage for large dense parallel in-core calculations.
Concurrency and Computation: Practice and Experience, Vol. 19, No 4, pp. 483-502 (2007),
PDF File.
M. Arioli, M. Baboulin, S. Gratton A partial condition number for linear least squares problems.
SIAM Journal on Matrix Analysis and Applications, Vol. 29, No 2, pp. 413-433 (2007),
PDF File.
M. Baboulin, L. Giraud, S. Gratton A parallel distributed solver for large dense symmetric systems:
applications to geodesy and electromagnetism problems.
International Journal of
High Performance Computing Applications, Vol. 19, No 4, pp. 353-363 (2005),
PDF File.
Theses
Title: Fast and reliable solutions for numerical linear algebra solvers in high-performance computing. Habilitation à Diriger des Recherches (HDR) from
University Paris-Sud, defended December 5, 2012.
Committee: J.C. Bajard (Université Paris 6), P. Dague (Université Paris-Sud), F. Desprez (Inria/Ecole Normale Supérieure de Lyon, referee), Jack Dongarra (University of Tennessee, USA), S. Gratton (ENSEEIHT), P. Langlois (Université de Perpignan, referee), J. Roman (Universitat Politècnica de València, Spain, referee), B. Rozoy (Université Paris-Sud).
HDR dissertation
Title: Solving large dense linear least squares problems on parallel
distributed computers. Application to the Earth's gravity field computation. Ph.D. in Computer Science from
Institut National Polytechnique de Toulouse, defended March 21, 2006.
Committee: G. Balmino (CNES/CNRS), J. Dongarra (University of Tennessee, USA, referee), I.S. Duff (RAL/CERFACS), L. Giraud (ENSEEIHT), S. Gratton (CERFACS), N.J. Higham (University of Manchester, UK, referee), J. Noailles (ENSEEIHT).
Ph.D. dissertation
This thesis was awarded the Léopold Escande Prize by
Institut National Polytechnique de Toulouse.
Conferences
Iterative Solution of Sparse Linear Least Squares using LU Factorization.
6th IMA Conference on Numerical Linear Algebra and Optimization, Birmingham, UK, June 27-29, 2018.
Using randomization in the solution of sparse linear systems.
Workshop on Recent Topics in High Performance Computing, Kagaku Kaikan, Tokyo, Japan, Sept. 21, 2017.
The story of the butterflies.
5th IMA Conference on Numerical Linear Algebra and Optimization, Birmingham, UK, Sept. 7-9, 2016.
LU preconditioning for overdetermined sparse least squares problems.
20th International Linear Algebra Society Conference (ILAS 2016), Leuven, Belgium, Jul. 11-15, 2016.
Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures.
SIAM Conference on Applied Linear Algebra, Atlanta, USA, Oct. 26-30, 2015.
Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures.
11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015), Krakow, Poland, Sept. 6-9, 2015.
Invited plenary speaker: The story of the butterflies.
High Performance Computing in Science and Engineering (HPCSE 2015), Solan, Czech Republic, May 25-28, 2015.
Using condition numbers to assess numerical quality in least squares HPC applications.
5th International Conference on Numerical Algebra and Scientific Computing (NASC 2014), Tongji University, Shanghai, P.R. China, Oct. 25-29, 2014.
Using condition numbers to assess numerical quality in least squares HPC applications.
Minisymposium: Linear least squares and applications. Co-organizer with Yimin Wei (Fudan University, Shanghai, P.R. China).
19th International Linear Algebra Society Conference (ILAS 2014), Seoul, Korea, Aug. 06-09, 2014.
Randomized Algorithms for Dense Linear Algebra.
Minisymposium: Randomized algorithms in parallel matrix computations. Co-organizer with Sherry Li (Lawrence Berkeley National Laboratory, USA).
SIAM Conference on Parallel Processing for Scientific Computing, Portland (OR), USA, Feb. 18-21, 2014.
Statistical estimates for the conditioning of linear least squares problems.
10th International Conference on Parallel Processing and Applied Mathematics (PPAM 2013), Warsaw, Poland, Sept. 8-11, 2013.
Computing least squares condition numbers on hybrid multicore/GPU systems.
International Conference: Applied Mathematics, Modeling and Computational Science (AMMCS 2013), Waterloo (Ontario), Canada, Aug. 26-30, 2013.
Accelerating linear system solutions using randomization.
The 18th Conference of the International Linear Algebra Society, Providence (RI), USA, June 3-7, 2013.
Fast and reliable linear system solutions on new parallel architectures.
Séminaire Aristote - Ecole Polytechnique, Palaiseau, France, May 15, 2013.
Computing least squares condition numbers.
Minisymposium: Numerical and reliability issues in high performance computing. Co-organizer with Ilse Ipsen (NC State).
SIAM Conference on Computational Science and Engineering, Boston, USA, Feb 25 - March 1, 2013.
Fast linear system solvers based on randomization techniques.
Minisymposium: Application of statistics to linear algebra algorithms. Co-organizer with Haim Avron (IBM Watson, USA).
SIAM Conference on Applied Linear Algebra, Valencia, Spain, June 18-22, 2012.
A class of communication-avoiding algorithms for solving general dense linear systems on CPU/GPU parallel machines.
International Conference on Computational Science, Omaha (NE), USA, June 4-6, 2012.
A parallel tiled solver for dense symmetric indefinite systems on multicore architectures.
26th IEEE International Parallel & Distributed Processing Symposium, Shanghai, China, May 21-25, 2012.
Invited plenary speaker: A parallel tiled solver for dense symmetric indefinite systems on multicore architectures.
Workshop on "Recent developments in the solution of indefinite systems", Eindhoven, Netherlands, Apr. 17, 2012.
Invited plenary speaker: Accelerating linear system solutions on new parallel architectures.
20th ACM High Performance Computing Symposium (HPC 2012), Orlando (FL), USA, March 26-29, 2012.
A class of fast solvers for dense linear systems on hybrid GPU-multicore machines.
SIAM Conference on Parallel Processing for Scientific Computing, Savannah (GA), USA, Feb. 15-17, 2012.
A parallel tiled solver for dense symmetric indefinite systems on multicore architectures.
The sixth workshop of the INRIA-Illinois Joint Laboratory for Petascale Computing, Urbana-Champaign (IL), USA, Nov. 21-23, 2011.
Getting fast linear system solutions on new parallel architectures.
The fifth workshop of the INRIA-Illinois Joint Laboratory for Petascale Computing, Grenoble, France, June 27-29, 2011.
Accelerating linear algebra calculations using statistical techniques.
Minisymposium: Innovative algorithms for dense linear algebra. Co-organizer with Azzam Haidar (University of Tennessee).
SIAM Conference on Computational Science and Engineering, Reno (NV), USA, Feb. 28 - March 4, 2011.
Accelerating linear algebra computations with hybrid GPU-multicore systems.
The fourth workshop of the INRIA-Illinois Joint Laboratory for Petascale Computing, Urbana-Champaign (IL), USA, Nov. 22-24, 2010.
Computational issues in least squares conditioning.
Parallel Matrix Algorithms and Applications (PMAA'10), Basel, Switzerland, June 29 - July 2, 2010.
Invited speaker: Summer school on e-science with many-core CPU/GPU processors.
Lecture on "Dense linear algebra for hybrid GPU-Multicore systems", Braga, Portugal, June 14-18, 2010.
Dense linear algebra for hybrid GPU accelerated manycore systems.
Numerical Methods in Engineering (METNUM 09), Barcelona, Spain, June 29 - July 2, 2009.
Deriving componentwise condition numbers using dual techniques.
Application to linear least squares.
SIAM Annual Meeting, San Diego, USA, July 7-11, 2008.
Computing the conditioning of the components of a linear least squares solution.
VECPAR'08, Toulouse, France, June 24-27, 2008.
Some issues in dense linear algebra for multicore.
Minisymposium: Recent developments in dense linear algebra
Organizers: Marc Baboulin and Jack Dongarra
SIAM Conference on Parallel Processing for Scientific Computing, Atlanta, USA, March 12-14, 2008.
Computing the conditioning of dense linear least squares with (Sca)LAPACK.
SIAM Conference on Parallel Processing for Scientific Computing, Atlanta, USA, March 12-14, 2008.
Very large least-squares for parameter estimation:
Algorithm and application.
SciDAC workshop on libraries and algorithms, Snowbird (Utah), USA,
July 30 - Aug. 2, 2007.
HPC tools for solving accurately the large dense linear least squares problems arising in gravity field calculations.
PARA'06, Workshop on State-of-the-Art in Scientific and Parallel Computing, Umeå, Sweden, June 18-21, 2006.
A distributed packed storage for large parallel calculations.
SIAM Conference on Parallel Processing for Scientific Computing, San Francisco, USA, Feb. 22-24, 2006.
Parallel distributed solvers for accurate and efficient gravity field computation.
SIAM Conference on Mathematical and Computational Issues in the Geosciences, Avignon, France, June 7-10, 2005.
Solveur parallèle pour moindres carrés.
Séminaire Mécanique Orbitale, Centre National d'Etudes Spatiales, Toulouse, France, Sept. 30, 2004.
Partial condition number for linear least squares problems.
International Congress on Computational and Applied Mathematics, Katholieke Universiteit Leuven, Belgium, July 26-30, 2004.
Parallel distributed Cholesky factorization for in-core large dense problems.
SIAM Conference on Parallel Processing for Scientific Computing, San Francisco, USA, Feb. 25-27, 2004.