Thanh-Nghi DO

Postdoc (09/2006-01/2008)                          

Aviz team, INRIA Futurs                            

L.R.I. - Bâtiment 490, University of Paris-Sud 

F91405 ORSAY Cedex, France                   

Tel: +33 1 69 15 64 99                                 

Email: dtnghi@lri.fr                                     

 




 

Education

Dec 2004: Ph.D. in computer science on

“Visualization and Support Vector Machine in Data Mining”

LINA, Nantes Laboratory for Computer Science

Nantes University, France

Thesis advisors: Prof. Henri Briand, Dr. François Poulet

Jul 2002: DEA in computer science on

“Visualization and Support Vector Machine in Data Mining”

LINA, Nantes Laboratory for Computer Science

Nantes University, France

Thesis advisor: Dr. François Poulet

Aug 2001: Master in computer science

IFI, Francophone Institute for Computer Science

Hanoi, Vietnam

Jul 1996: Engineering diploma in computer science

College of Information Technology

Cantho University, Vietnam

 

Distinction

-Qualification for the Maître de Conférences position in France, Jan 2005 (N°05227153065)

 

Research Interests

- Information visualization in knowledge discovery in databases, visual data mining

- Data mining with support vector machine, kernel-based methods, decision tree

- Mining very large datasets

 

Publications

1.       T-N. Do, S. Lallich, N-K. Pham et P. Lenca. Classifying very-high-dimensional data with random forests of oblique decision trees. (to appear) in Advances in Knowledge Discovery and Management, H. Briand, F. Guillet, G. Ritschard, D. Zighed Eds, Springer-Verlag, 2009.

2.       T-N. Do, V-H. Nguyen, F. Poulet. GPU-based parallel SVM algorithm. Journal of Frontiers of Computer Science and Technology, 2009, 3(4):368-377.

3.       T-N. Do and F. Poulet. Interval Data Mining with Kernel-based Algorithms and Visualization. Chapter 5 in Mining Complex Data for Knowledge Discovery: Advances and Applications, D. A. Zighed, S. Tsumoto, Z. Ras, H. Hacid Eds, Springer-Verlag, 2009, pp. 75-91.

4.       F. Poulet and T-N. Do. Interactive Decision Tree Construction for Interval and Taxonomical data. in Visual Data Mining: Theory, Techniques and Tools for Visual Analytics, Simeon J. Simoff, Michael Boehlen, Arturas Mazeika Eds, Lecture Notes in Computer Science 4404, Springer-Verlag, 2008, pp. 123-135.

5.       F. Poulet and T-N. Do. Mining Very Large Datasets with Support Vector Machine Algorithms. Enterprise Information Systems V, O. Camp, J. Filipe, S. Hammoudi & M. Piattini Eds., Kluwer Academic Publishers, 2004, pp. 177-184.

6.       T-N. Do, S. Lallich, N-K. Pham et P. Lenca. Un nouvel algorithme de forêts aléatoires d'arbres obliques particulièrement adapté à la classification de données en grandes dimensions. Actes d'EGC2009, RNTI-E-15, Revue des Nouvelles Technologies de l’Information – Série Extraction et Gestion des Connaissances, Cépaduès Editions, 2009, pp. 79-90. [Acceptance rate: ~20%]

7.       F. Poulet, T-N. Do, V-H. Nguyen. SVM incrémental et parallèle sur GPU. Actes d'EGC2009, RNTI-E-15, Revue des Nouvelles Technologies de l’Information – Série Extraction et Gestion des Connaissances, Cépaduès Editions, 2009, pp. 103-114. [Acceptance rate: ~20%]  

8.       T-N. Do et J-D. Fekete. V4Miner pour la fouille de données. in RIA, Review of Artificial Intelligence, Vol.22/3-4: 503-517, F. Poulet & B. Legrand Eds., 2008.

9.       N-K. Pham, T-N. Do, F. Poulet et A. Morin. Tree-view pour l’exploration interactive des arbres de décision. in RIA, Review of Artificial Intelligence, Vol.22/3-4: 473-487, F. Poulet & B. Legrand Eds., 2008.

10.   T-N. Do, J-D. Fekete et F. Poulet. Algorithmes rapides de boosting de SVM. Actes d'EGC2008, RNTI-E-11, Revue des Nouvelles Technologies de l’Information – Série Extraction et Gestion des Connaissances, Cépaduès Editions, 2008, pp. 297-308. [Acceptance rate: ~23%]

11.   T-N. Do et F. Poulet. Classification de grands ensembles de données avec un nouvel algorithme de SVM. Actes d'EGC2007, RNTI-E-9, Revue des Nouvelles Technologies de l’Information – Série Extraction et Gestion des Connaissances, Cépaduès Editions, Vol.2, pp. 739-750, 2007. [Best paper of EGC’07]

12.   T-N. Do et F. Poulet. Vis-SVM : approche coopérative en fouille de données. RNTI-E-7, Numéro Spécial Visualisation et Extraction de Connaissances, F. Poulet and P. Kuntz Eds., Revue des Nouvelles Technologies de l’Information – Série Extraction et Gestion des Connaissances, Cépaduès Editions, 2006, pp. 49-74.

13.   T-N. Do, N-K. Pham et F. Poulet. Exploration interactive de résultats d'arbre de décision. Actes d'EGC2007, RNTI-E-9, Revue des Nouvelles Technologies de l’Information – Série Extraction et Gestion des Connaissances, Cépaduès Editions, Vol.1, pp. 157-168, 2007. [Acceptance rate: ~33%]

14.   T-N. Do et F. Poulet. SVM incrémental, parallèle et distribué pour le traitement de grandes quantités de données. Actes d'EGC2006, RNTI-E-6, Revue des Nouvelles Technologies de l’Information – Série Extraction et Gestion des Connaissances, Cépaduès Editions, Vol.1, pp. 47-52, 2006.

15.   T-N. Do et F. Poulet. SVM et visualisation pour la fouille de grands ensembles de données. Actes d'EGC2005, RNTI-E-3, Revue des Nouvelles Technologies de l’Information – Série Extraction et Gestion des Connaissances, Cépaduès Editions, Vol.2, pp. 545-556, 2005. [Acceptance rate: ~30%]

16.   T-N. Do et F. Poulet. Fouille de grands ensembles de données avec un boosting de proximal SVM. Actes d'EGC2004, RNTI-E-2, Revue des Nouvelles Technologies de l’Information – Série Extraction et Gestion des Connaissances, Cépaduès Editions, Vol. 1, pp. 229-240, 2004. [Acceptance rate: ~33%]

17.   T-N. Do, V-H. Nguyen and F. Poulet. A Novel SVM Algorithm for Massive Classification Tasks. in proc. of ADMA’08, The Fourth International Conference on Advanced Data Mining And Applications, Lecture Notes in Artificial Intelligence 5139, Springer-Verlag, China, 2008, pp. 147-157. [Acceptance rate: ~13%]  

18.   T-N. Do, V-H. Nguyen and F. Poulet. A Fast Parallel Support Vector Machine Algorithm for Massive Classification Tasks. in proc. of MCO’08, The second international conference on Modelling, Computation and Optimization in Information Systems and Management Sciences, CCIS 14, Springer-Verlag, France, 2008, pp. 425-434.

19.   N-K. Pham, T-N. Do, P. Lenca and S. Lallich. Using local node information in decision trees: coupling a local decision rule with an off-centered entropy. in proc. of DMIN’08, The International Conference on Data Mining, CSREA Press, USA, 2008, pp. 117-123.

20.   P. Lenca, S. Lallich, T-N. Do, N-K. Pham. A comparison of different off-centered entropies to deal with class imbalance for decision trees. in proc. of PAKDD'2008, The Pacific-Asia Conference on Knowledge Discovery and Data Mining, Lecture Notes in Computer Science 5012, Springer-Verlag, Japan, 2008, pp. 634-643. [Acceptance rate: ~20%]

21.   T-N. Do and V-H. Nguyen. A Novel Speed-up SVM Algorithm for Massive Classification Tasks. in proc. of RIVF’08, The 6th IEEE International Conference on Computer Sciences: Research & Innovation – Vision for the Future, IEEE Press, Ho Chi Minh, Vietnam, 2008, pp. 215-220. [Acceptance rate: ~29%]

22.   N. Elmqvist, T-N. Do, H. Goodell, N. Henry, J-D. Fekete. ZAME: Interactive Large-Scale Graph Visualization. in proc. of the IEEE Pacific Visualization Symposium 2008, IEEE Press, Japan, 2008, pp. 215-222. [Acceptance rate: ~22%]

23.   T-N. Do and J-D. Fekete. Large Scale Classification with Support Vector Machine Algorithms. in proc. of ICMLA’07, 6th International Conference on Machine Learning and Applications, IEEE Press, Ohio, USA, 2007, pp. 7-12. [Acceptance rate: ~30%]

24.   T-N. Do, F. Poulet, J-D. Fekete. Massive Data Mining via Boosting of Least Squares SVM Algorithm. in proc. of RIVF’07, The 5th IEEE International Conference on Computer Sciences: Research & Innovation – Vision for the Future, Hanoi, Vietnam, 2007, pp. 47-52.

25.   N-K. Pham, T-N. Do, F. Poulet, A. Morin. Interactive Exploration of Decision Tree Results. in proc. of ASMDA’07, International Symposium on Applied Stochastic Models and Data Analysis 2007, Chania, Crete, Greece, 2007.

26.   T-N. Do and H-A. Le-Thi. Classifying large datasets with SVM. in proc. of 4th International Conference on Computational Management Science, CMS’07, Geneva, 2007.

27.   T-N. Do and F. Poulet. Kernel-based algorithms and visualization for interval data mining. in proc. of The Second International Workshop on Mining Complex Data - MCD'06 - In Conjunction with IEEE ICDM’06, Hong Kong, 2006, pp. 295-299.

28.   T-N. Do and F. Poulet. Classifying one billion data with a new distributed SVM algorithm. in proc. of RIVF’06, 4th IEEE International Conference on Computer Science, Research, Innovation and Vision for the Future, IEEE Press, Ho Chi Minh, Vietnam, 2006, pp. 59-66. [Acceptance rate: ~33%]

29.   T-N. Do and F. Poulet. Mining Very Large datasets with SVM and Visualization. in proc. of ICEIS’05, 7th International Conference on Enterprise Information Systems: Artificial Intelligence and Decision Support Systems, Miami, USA, 2005, pp. 127-141. [Acceptance rate: ~19%]

30.   T-N. Do and F. Poulet. Kernel Methods and Visualization for Interval Data Mining. in proc. of ASMDA’05, International Symposium on Applied Stochastic Models and Data Analysis 2005, Brest, France, 2005, pp. 345-354.

31.   T-N. Do and F. Poulet. A Simple, Fast Support Vector Machine Algorithm for Data Mining. in proc. of ECML/PKDD’05 Workshop on Knowledge Discovery from Data Streams, Porto, Portugal, 2005, pp. 87-94.

32.   T-N. Do and F. Poulet. Interval Data Mining with SVM and Visualization. in proc. of RIVF’05, 3rd International Conference on Computer Science, Research, Innovation and Vision for the Future, Cantho, Vietnam, 2005, pp. 197-203. [Best paper of Advanced Informatics session]

33.   T-N. Do and F. Poulet. Enhancing SVM with Visualization. in Discovery Science 2004, E. Suzuki et S. Arikawa Eds., Lecture Notes in Artificial Intelligence 3245, Springer-Verlag, 2004, pp. 183-194. [Acceptance rate: ~22%]

34.   T-N. Do and F. Poulet. Towards High Dimensional Data Mining with Boosting of PSVM and Visualization Tools. in proc. of ICEIS’04, 6th International Conference on Enterprise Information Systems: Artificial Intelligence and Decision Support Systems, Vol. 2, pp. 36-41, Porto, Portugal, 2004.

35.   T-N. Do and F. Poulet. Cooperation between Visualization Methods and SVM Algorithms for Data Mining. in proc. of MCO’04, Computer Sciences, Modelling, Computation and Optimization in Information Systems and Management Sciences : Data Mining Theory, Systems and Applications, H.A. Le Thi et T. Pham Dinh Eds., Hermes Science, 2004, pp. 569-576. 

36.   T-N. Do et F. Poulet. SVM incrémental pour l’analyse d’expressions de gènes. Actes de RIVF’04, 2ème Rencontre de Recherche Informatique Vietnam & Francophonie, Hanoï, Vietnam, 2004, pp. 215-220.

37.   T-N. Do and F. Poulet. Mining Very Large Datasets with Support Vector Machine Algorithms. in proc. of ICEIS’03, 5th International Conference on Enterprise Information Systems: Artificial Intelligence and Decision Support Systems, Vol. 2, pp. 140-147, Angers, France, 2003. [Acceptance rate: ~14%]

38.   T-N. Do and F. Poulet. Incremental SVM and Visualization Tools for Bio-medical Data Mining. in proc. of ECML/PKDD’03 Workshop on Data Mining and Text Mining in Bioinformatics, Cavtat-Dubrovnik, 2003, pp. 14-19.

39.   T-N. Do and F. Poulet. Interactive Visualization Tools for Visual Data-Mining. in proc. of HCP’03, 14th Mini-EURO Conference, Human Centered Processes, Luxembourg, 2003, pp. 299-303.

40.   T-N. Do et F. Poulet. Fouille de textes à l’aide de proximal SVM. Actes de RIVF’03, 1ère Rencontre de Recherche Informatique Vietnam & Francophonie, Hanoï, Vietnam, 2003, pp. 33-36.

41.   F. Poulet, B. LeGrand, T-N. Do, M-A. Aufaure. Acte de l’Atelier Visualisation et extraction de connaissances. EGC’09, 9èmes Journées d’Extraction et Gestion des Connaissances 2009.  

42.   T-B. Nguyen, P. Lenca, T-N. Do et F. Poulet. Visualisation de réseaux d'experts. Acte du 7ème Atelier Visualisation et extraction de connaissances, EGC’09, 9èmes Journées d’Extraction et Gestion des Connaissances 2009, pp. 1-5.

43.   T-N. Do, N-K. Pham et F. Poulet. Une méthode anthropocentrée pour la construction d'arbres de décision. Acte du 7ème Atelier Visualisation et extraction de connaissances, EGC’09, 9èmes Journées d’Extraction et Gestion des Connaissances 2009, pp. 33-43.

44.   T-N. Do, N-K. Pham, S. Lallich et P. Lenca. Expérimentation de l’entropie décentrée pour le traitement des classes déséquilibrées en induction par arbres. Acte du 4ème Atelier Qualité des données et des connaissances, EGC’08, 8èmes Journées d’Extraction et Gestion des Connaissances 2008, pp. 39-49.

45.   F. Poulet, B. LeGrand, T-N. Do. Acte de l’Atelier Visualisation et extraction de connaissances. EGC’08, 8èmes Journées d’Extraction et Gestion des Connaissances 2008.

46.   T-N. Do et J-D. Fekete. Fouille de données à l’aide d’un environnement de programmation visuelle. Acte du 6ème Atelier Visualisation et extraction de connaissances, EGC’08, 8èmes Journées d’Extraction et Gestion des Connaissances 2008, pp. 81-92.

47.   N-K. Pham, T-N. Do, F. Poulet et A. Morin. Exploration interactive des arbres de décision. Acte du 5ème Atelier Visualisation et extraction de connaissances, EGC’07, 7èmes Journées d’Extraction et Gestion des Connaissances 2007.

48.   T-N. Do et J-D. Fekete. Flot visuel de données. Acte du 5ème Atelier Visualisation et extraction de connaissances, EGC’07, 7èmes Journées d’Extraction et Gestion des Connaissances 2007.

49.   N-K. Pham et T-N. Do. Tree-View : post-traitement interactif pour des arbres de décision. Acte du 4ème Atelier Visualisation et extraction de connaissances, EGC’06, 6èmes Journées d’Extraction et Gestion des Connaissances 2006, pp. 103-110.

50.   T-N. Do et F. Poulet. Interprétation graphique des résultats de SVM. Actes et CD-ROM de SFDS’04, XXXVIème Journées de Statistiques, Montpellier, 2004.

51.   T-N. Do. Visualisation en extraction de connaissances à partir de données. Actes de JDOC’04, 4ème journée des Doctorants, Ecole Polytechnique de l’Université de Nantes, 2004.

52.   T-N. Do et F. Poulet. IC-PSVM : un algorithme de SVM incrémental pour la classification de données bio-informatiques. Atelier A3 : Apprentissage machine et Bioinformatique, Plateforme AFIA’03, Laval, France, 2003.

53.   F. Poulet et T-N. Do. SVM parallélisé pour classifier un milliard de données. Actes de SFC’02, IXème Rencontres de la Société Francophone de Classification, Toulouse, France, 2002, pp. 301-304.

54.   T-N. Dang, Q-B. Dang, Q-M. Nguyen, T-C. Do, V-P. Le, and T-N. Do. A comparative study of different machine learning algorithms to deal with hand written digits recognition. in proc. of the 12th national conference in computer science, Dong Nai, Vietnam, 2009. (in Vietnamese)

55.   N-K. Pham, T-N. Do, C-D. Tran. Classifying very large datasets with Arcx4-LSSVM. in proc. of the National conference in computer science, HCM, 2008, pp. 72-78. [Acceptance rate: ~29%]

56.   Q-N. Tran, T-N. Do, F. Poulet, N-K. Pham. Vehicle license plate classification. in proc. of the National conference in computer science, HCM, 2008, pp. 79-85. [Acceptance rate: ~29%]

57.   T-N. Do, V-H. Nguyen, F. Poulet, N-K. Pham. A Fast Parallel Support Vector Machine Algorithm for Massive Classification Tasks. in proc. of the 11th national conference in computer science, Hue, Vietnam, 2008.

58.   T-N. Do & N-K. Pham. Fingerprint classification. in proc. of the 11th national conference in computer science, Hue, Vietnam, 2008.

59.   T-N. Do, N-K. Pham, H-T. Do, D-L. Ngo, T-V. Nguyen. Classifying large datasets with SVM. in proc. of FAIR’07, The Third National Symposium Fundamental & Applied IT Research, Vietnam, 2007.

60.   T-N. Do, N-K. Pham, H-T. Do, N-C. Lam. Data mining with R language. in proc. of FAIR’07, The Third National Symposium Fundamental & Applied IT Research, Vietnam, 2007. (in Vietnamese)

61.   N-K. Pham, N-C. Lam, T-N. Do. Linear algebra teaching with GNU Octave. in proc. of the 10th national conference in computer science, Vinh Phuc, Vietnam, 2007. (in Vietnamese)

62.   T-N. Do, N-K. Pham, H-N. Pham, G-T. Pham. Towards simple, easy to understand, an interactive decision tree algorithm. in proc. of the 9th national conference in computer science, Đà Lạt, Vietnam, 2006.

63.   N-K. Pham and T-N. Do. Text categorization with boosting of PSVM. in proc. of the 9th national conference in computer science, Đà Lạt, Vietnam, 2006, pp. 269-278. (in Vietnamese)

64.   T-T. Bui, D-T. Nguyen, T-N. Do. Information retrieval in e-learning. in proc. of SGK’06, Huế, 2006, pp. 1-9. (in Vietnamese)

65.   H-T Do, N-K Pham, T-N Do. A simple, fast support vector machine algorithm for data mining. in proc. of FAIR’05, The Second National Symposium Fundamental & Applied IT Research, Vietnam, 2005, pp. 13-22.

Report

66.   J-D. Fekete, N. Elmqvist, T-N. Do, H. Goodell & N. Henry. Navigating Wikipedia with the Zoomable Adjacency Matrix Explorer. INRIA Research Report, Technical Report No. RR:00141168, 2007.

67.   T-N. Do et F. Poulet. La catégorisation de textes. Rapport de contrat Fondation Vediorbis, ESIEA Recherche, Laval, 2004.

Thesis

68.   T-N. Do. Visualisation et séparateurs à vaste marge en fouille de données. Thèse de Doctorat de l’Université de Nantes, Décembre 2004.

69.  T-N. Do. Visualisation et fouille de données. Rapport de DEA, Université de Nantes, Juillet 2002.

 

Professional Service

-Co-organizer:

7ème Atelier Visualisation et extraction de connaissances, EGC’09, 9èmes Journées d’Extraction et Gestion des Connaissances 2009

6ème Atelier Visualisation et extraction de connaissances, EGC’08, 8èmes Journées d’Extraction et Gestion des Connaissances 2008

 

-Program committee member:

DMIN’09, The International Conference on Data Mining, 2009

QIMIE’09, The Quality issues, measures of interestingness and evaluation of data mining models Workshop, 2009

CIE39, The 39th Intl Conference on Computers & Industrial Engineering, 2009

DMIN’08, The International Conference on Data Mining, 2008

VIEW’06, Visual Information Expert Workshop 2006

VIEW’07, Visual Information Expert Workshop 2007

AusDM’04, The Australasian Data Mining Conference 2004

ASMDA’05, The International Symposium on Applied Stochastic Models and Data Analysis 2005

4ème Atelier Qualité des Données et des Connaissances, EGC’08, 8èmes Journées d’Extraction et Gestion des Connaissances 2008

5ème Atelier Visualisation et extraction de connaissances, EGC’07, 7èmes Journées d’Extraction et Gestion des Connaissances 2007

4ème Atelier Visualisation et extraction de connaissances, EGC’06, 6èmes Journées d’Extraction et Gestion des Connaissances 2006

3ème Atelier Visualisation et extraction de connaissances, EGC’05, 5èmes Journées d’Extraction et Gestion des Connaissances 2005

-Reviewer:

Journal of Experimental Algorithmics 2009

Advances in Knowledge Discovery and Management 2009

Review Pattern Recognition Elsevier

Review I3, Information-Interaction–Intelligence, Cépaduès Editions, 2006

Review RNTI, Revue des Nouvelles Technologies de l'Information, Cépaduès Editions, 2006

Review RNTI, Revue des Nouvelles Technologies de l'Information, Cépaduès Editions, 2007

 

-Projects:

Fondation VediorBis on CV categorization with ESIEA Pôle ECD

claSsification & Visualisation for Exploration & Navigation (2006-2008) with EDF, INRIA, LIMSI, PARIS IX

 

Teaching

-ESIEA Laval France: Linux administration

-College of Information Technology, Cantho University: Linux administration, programming language C/C++, Java, Network programming and Web application, PostgreSQL, Databases, Data mining

-Master supervision: Thanh-Tuan Bui (Information retrieval in E-learning), School of Information Technology, Hochiminh City National University (2005-2006); Ngo Duc Luu (Classification of very large datasets with SVM), School of Information Technology, Hochiminh City National University (2007-2008); Dang Quoc Bao, Dang Thi Nhung (Classifying very-high-dimensional datasets), College of Information Technology, Cantho University (2009-2010)

-Engineer supervision: Nguyen Quang Can, Nguyen Thanh Cong (Documentation retrieval system), Pham Hoang Nam (Visualization in data mining), College of Information Technology, Cantho University (2005-2006); Luong Hoa Dang (Spam categorization), College of Information Technology, Cantho University (2007-2008)

 

Other

-Linux Professional Institute Certificate, Level 1, 2005

 

