Abstract
Learning performance data, such as correct and incorrect responses to questions in Intelligent Tutoring Systems (ITSs), is crucial for tracking and assessing learners' progress and mastery of knowledge. However, data sparsity, characterized by unexplored questions and missing attempts, hampers accurate assessment and the provision of tailored, personalized instruction within ITSs. This paper proposes using the Generative Adversarial Imputation Networks (GAIN) framework to impute sparse learning performance data, reconstructed as a three-dimensional (3D) tensor spanning the dimensions of learners, questions, and attempts. Our customized GAIN-based method imputes sparse data in this 3D tensor space, with its input and output layers enhanced by convolutional neural networks. The adaptation also uses a least squares loss function for optimization and aligns the input and output shapes with the dimensions of the question-attempt matrices along the learner dimension. Through extensive experiments on six datasets from various ITSs, including AutoTutor, ASSISTments, and MATHia, we demonstrate that the GAIN approach generally outperforms existing methods such as tensor factorization and other generative adversarial network (GAN) based approaches in terms of imputation accuracy. These findings support more comprehensive learning data modeling and analytics in AI-based education.
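As a rough illustration of the approach summarized above, the sketch below shows what one training step of a convolutional GAIN-style imputer over a learner's question-attempt matrix might look like in PyTorch. This is a minimal sketch under stated assumptions, not the authors' implementation: the network sizes, hint mechanism, loss weights, and all names (ConvNet2D, gain_step, alpha, hint_rate) are illustrative assumptions.

```python
# Illustrative GAIN-style imputation step (not the paper's code).
# Assumed data layout: a batch of question-attempt matrices, one per learner,
# shape (B, 1, Q, A); mask m marks observed cells (1) vs missing cells (0).
import torch
import torch.nn as nn

class ConvNet2D(nn.Module):
    """Small conv net (2 channels in, 1 out) reused for both generator and discriminator."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

def gain_step(x, m, G, D, opt_g, opt_d, alpha=10.0, hint_rate=0.9):
    """One GAIN-style training step with a least-squares adversarial loss."""
    z = torch.rand_like(x)                     # noise placed in the missing cells
    x_in = m * x + (1 - m) * z
    g_out = G(torch.cat([x_in, m], dim=1))     # generator's imputation
    x_hat = m * x + (1 - m) * g_out            # keep observed cells, fill missing ones

    # Hint channel: reveal part of the mask to the discriminator (GAIN-style).
    b = torch.bernoulli(torch.full_like(m, hint_rate))
    hint = b * m + 0.5 * (1 - b)

    # Discriminator step: least-squares loss toward the true mask.
    d_prob = D(torch.cat([x_hat.detach(), hint], dim=1))
    d_loss = ((d_prob - m) ** 2).mean()
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: fool D on missing cells + reconstruct observed cells.
    d_prob = D(torch.cat([x_hat, hint], dim=1))
    adv_loss = (((d_prob - 1.0) ** 2) * (1 - m)).mean()
    rec_loss = (((g_out - x) ** 2) * m).mean()
    g_loss = adv_loss + alpha * rec_loss
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return g_loss.item(), d_loss.item()
```

In practice one would instantiate two ConvNet2D networks as generator and discriminator, each with its own optimizer (e.g., torch.optim.Adam), iterate gain_step over mini-batches of learners' question-attempt matrices, and at inference time fill the missing cells from the generator's output.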
Notes
1. AutoTutor Moodle Website: https://sites.autotutor.org/; Adult Literacy and Adult Education Website: https://adulted.autotutor.org/.
2. ASSISTments Website: https://new.assistments.org/.
3. MATHia Website: https://www.carnegielearning.com/solutions/math/mathia/.
4. ASSISTments 2008–2009: https://pslcdatashop.web.cmu.edu/DatasetInfo?datasetId=388.
5.
6. MATHia 2019–2020: https://pslcdatashop.web.cmu.edu/Project?id=720.
Acknowledgements
We are grateful to Prof. Philip I. Pavlik Jr. from the University of Memphis and Prof. Shaghayegh Sahebi from the University at Albany - SUNY for their invaluable assistance with tensor factorization in the early stages of this research. We also extend our thanks to Prof. Arthur C. Graesser of the University of Memphis for his insightful communications, which significantly enriched our understanding and inspired deeper analytical thinking.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, L., Yeasin, M., Lin, J., Havugimana, F., Hu, X. (2025). Generative Adversarial Networks for Imputing Sparse Learning Performance. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15306. Springer, Cham. https://doi.org/10.1007/978-3-031-78172-8_25
DOI: https://doi.org/10.1007/978-3-031-78172-8_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78171-1
Online ISBN: 978-3-031-78172-8
eBook Packages: Computer Science (R0)