Welcome to my personal website. I earned my specialist degree with honors in applied mathematics and computer science from Lomonosov Moscow State University, Russia, in 2007. In 2013, I obtained my PhD in computer science with cum laude distinction under the supervision of Professor Toon Calders and Professor Paul De Bra at Eindhoven University of Technology (TU/e), the Netherlands. My recent research focuses on AI for science, including the development of foundational models for scientific data such as molecules, proteins, and crystals.
Awards:
- IBM Corporate Technical Award for AutoAI, 2021
- IBM Outstanding Technical Achievement Award for 'Toward automating the AI lifecycle with AutoAI', 2019
- IBM Outstanding Technical Achievement Award for 'z AI and Modernization for the z15', 2019
- IBM Master Inventor, 2022
- My PhD thesis was nominated by the Department of Mathematics and Computer Science for the best PhD thesis award of Eindhoven University of Technology (TU/e) in December 2014.
- Nominated for the best paper award at SIAM Data Mining Conference SDM 2012.
Academic service:
- Conference PC member: NeurIPS (since 2021), ICLR (since 2022), AAAI (since 2020), IJCAI (since 2020), ICML (since 2021)
- Journal reviewer: DAMI (Springer), Information Systems (Elsevier)
- I am a Kaggle competition master
Selected publications:
- Thanh Lam Hoang, Marco Luca Sbodio, Marcos Martínez Galindo, Mykhaylo Zayats, Raúl Fernández-Díaz, Victor Valls, Gabriele Picco, Cesar Berrospi, Vanessa López: Knowledge Enhanced Representation Learning for Drug Discovery. AAAI 2024
- Thanh Lam Hoang, Gabriele Picco, Yufang Hou, Young-Suk Lee, Lam M. Nguyen, Dzung T. Phan, Vanessa López, Ramón Fernandez Astudillo: Ensembling Graph Predictions for AMR Parsing. NeurIPS 2021: 8495-8505
- Gabriele Picco, Marcos Martínez Galindo, Alberto Purpura, Leopold Fuchs, Vanessa López, Thanh Lam Hoang: Zshot: An Open-source Framework for Zero-Shot Named Entity Recognition and Relation Extraction. ACL (demo) 2023: 357-368
- Young-Suk Lee, Ramón Fernandez Astudillo, Hoang Thanh Lam, Tahira Naseem, Radu Florian, Salim Roukos: Maximum Bayes Smatch Ensemble Distillation for AMR Parsing. NAACL 2022: 5379-5392
- Hoang Thanh Lam, Beat Buesser, Hong Min, Tran Ngoc Minh, Martin Wistuba, Udayan Khurana, Gregory Bramble, Theodoros Salonidis, Dakuo Wang, Horst Samulowitz: Automated Data Science for Relational Data. ICDE 2021: 2689-2692
- Hoang Thanh Lam, Fabian Moerchen, Dmitriy Fradkin, Toon Calders: Mining Compressing Sequential Patterns. SDM 2012: 319-330
- Hoang Thanh Lam, Toon Calders, Ninh Pham: Online Discovery of Top-k Similar Motifs in Time Series Data. SDM 2011: 1004-1015
- Hoang Thanh Lam, Toon Calders: Mining top-k frequent items in a data stream with flexible sliding windows. KDD 2010: 283-292
IBM products:
I am part of the following open-source initiatives:
- TabularFM: a framework for building and benchmarking foundational models for tabular data.
- Graphene: graph ensemble learning for AMR parsing
- ZShot: a spaCy plug-in for few-shot and zero-shot named entity recognition and classification with textual descriptions.
Mentoring activities: