Data Intelligence Lab

Welcome to the Data Intelligence Lab!

Mission statement: to push the boundaries of data intelligence research and train the next leaders from KAIST.

The Data Intelligence Lab is pioneering the inevitable trend of Responsible/Trustworthy/Safe AI, Data-centric AI, and Big Data – AI Integration in all of machine learning, including Large Language Models (LLMs). We work closely with industry (Google Research, Microsoft Research, NVIDIA Research, Samsung Electronics, SK Hynix, and SK Telecom). Check out our vision paper Responsible AI Challenges in End-to-end Machine Learning (IEEE Data Eng. Bull. '21).

We are looking for highly motivated Master's and PhD students. If you are interested in joining the DI Lab, please read this first. Here is a list of recommended courses, a lab fair poster designed by my students, and a short interview on conducting research with the KAIST Times (in Korean).

Logo designed by Dayun Lee and students

Latest News

[2024/9] ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models accepted to NeurIPS 2024 (Spotlight ≈ Top-3% of all submissions; Top Machine Learning conference). Congrats Jio Oh, Soyeon Kim, and Junseok Seo!
[2024/8] Jaeyoung Park presented our Falcon paper at VLDB 2024 (Top Database conference), and Seong-Hyeon Hwang and Minsu Kim presented our RC-Mixup paper at ACM SIGKDD 2024 (Top Data Mining conference).
[2024/7] Our lab attended the first Data Intelligence Workshop in Korea.
[2024/6] Gave the keynote talk Towards a Holistic Framework for Data-centric Responsible AI at the Guide-AI workshop @ ACM SIGMOD 2024.
[2024/5] RC-Mixup: A Data Augmentation Strategy against Noisy Data for Regression Tasks accepted to ACM SIGKDD 2024 (Top Data Mining conference). Congrats Seong-Hyeon Hwang and Minsu Kim!
[2024/5] LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views accepted to ICML 2024 (Top Machine Learning conference). Congrats Yuji Roh!
[2024/4] The IEEE Data Engineering Bulletin 2024 March Special Issue on Data-centric Responsible AI is now available.
[2024/3] Seungjun Oh joined our lab. Welcome!
[2024/3] Promoted to Tenured Associate Professor.
[2024/2] Ki Hyun Tae and Yuji Roh are the first Ph.D. graduates from our lab and have joined Samsung Research and Google, respectively. Congrats and looking forward to a very bright future!
[2023/12] Falcon: Fair Active Learning using Multi-armed Bandits accepted to VLDB 2024 (Top Database conference). Congrats Ki Hyun Tae and Jaeyoung Park!
[2023/12] Serving as an Associate Editor for VLDB 2025 (PVLDB Volume 18; Top Database conference)
[2023/12] Quilt: Robust Data Segment Selection against Concept Drifts accepted to AAAI 2024 (Top AI conference). Congrats Minsu Kim and Seong-Hyeon Hwang!
[2023/11] Serving as an Associate Editor for the IEEE TKDE journal (Top Database/Data Mining journal; currently the only editor from Korea)
[2023/11] The second NYU-KAIST Inclusive AI Center workshop was held at NYU.
[2023/10] Supported by a new Google Research Award for a year in collaboration with the TensorFlow Extended (TFX) team!
[2023/8] The NYU-KAIST Inclusive AI Center workshop was held at KAIST.
[2023/6] Yuji Roh is a research intern at Google DeepMind & YouTube during the summer.
[2023/4] Improving Fair Training under Correlation Shifts accepted to ICML 2023 (Top Machine Learning conference). Congrats Yuji Roh!
[2023/4] Dr-Fairness: Dynamic Data Ratio Adjustment for Fair Training on Real and Generated Data accepted to Transactions on Machine Learning Research (TMLR), a new Machine Learning journal. Congrats Yuji Roh!
[2023/3] Gave a tech talk on Responsible AI to the Google TensorFlow Extended (TFX) US and Korea teams.