
Generative Adversarial Networks for Imputing Sparse Learning Performance

  • Conference paper
  • Pattern Recognition (ICPR 2024)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15306)
  • Included in the conference series: International Conference on Pattern Recognition (ICPR)

Abstract

Learning performance data, such as correct or incorrect responses to questions in Intelligent Tutoring Systems (ITSs), is crucial for tracking and assessing learners' progress and mastery of knowledge. However, data sparsity, characterized by unexplored questions and missing attempts, hampers accurate assessment and the provision of tailored, personalized instruction within ITSs. This paper proposes using the Generative Adversarial Imputation Networks (GAIN) framework to impute sparse learning performance data, reconstructed into a three-dimensional (3D) tensor representation across the dimensions of learners, questions, and attempts. Our customized GAIN-based method imputes sparse data in this 3D tensor space and is significantly enhanced by convolutional neural networks in its input and output layers. The adaptation also uses a least squares loss function for optimization and aligns the shapes of the input and output with the dimensions of the question-attempt matrices along the learner dimension. Through extensive experiments on six datasets from various ITSs, including AutoTutor, ASSISTments, and MATHia, we demonstrate that the GAIN approach generally outperforms existing methods, such as tensor factorization and other generative adversarial network (GAN) based approaches, in terms of imputation accuracy. This finding advances comprehensive learning data modeling and analytics in AI-based education.
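The abstract describes the method only at a high level. The sketch below (not the authors' implementation) illustrates the kind of GAIN-style computation it refers to: sparse learning performance reshaped into a learners × questions × attempts tensor, a convolutional generator and discriminator operating on each learner's question-attempt slice, and a least-squares (LSGAN-style) adversarial objective. PyTorch, the layer sizes, the synthetic data, the reconstruction weight of 10, and the omission of GAIN's hint mechanism are all assumptions made for illustration.

import torch
import torch.nn as nn

# Hypothetical tensor dimensions: learners x questions x attempts (values are illustrative).
L_DIM, Q_DIM, A_DIM = 64, 20, 10

class ConvNet(nn.Module):
    """Small CNN mapping a (question x attempt) slice plus a mask channel to a same-shaped slice."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x, m):
        return self.net(torch.cat([x, m], dim=1))

G, D = ConvNet(), ConvNet()  # generator imputes cells; discriminator scores observed vs. imputed
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

# Synthetic sparse performance tensor: 1 = correct, 0 = incorrect; mask marks observed cells.
perf = torch.randint(0, 2, (L_DIM, 1, Q_DIM, A_DIM)).float()
mask = (torch.rand(L_DIM, 1, Q_DIM, A_DIM) > 0.5).float()
x_obs = perf * mask  # unobserved cells zeroed out

for step in range(200):
    # Generator fills missing cells (random noise injected in unobserved positions, as in GAIN).
    noise = torch.rand_like(x_obs) * (1 - mask)
    g_out = G(x_obs + noise, mask)
    x_hat = mask * x_obs + (1 - mask) * g_out  # keep observed values, impute the rest

    # Discriminator tries to recover the mask (observed vs. imputed), trained with a least-squares loss.
    d_prob = D(x_hat.detach(), torch.zeros_like(mask))  # no hint mechanism in this sketch
    d_loss = ((d_prob - mask) ** 2).mean()
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: fool the discriminator on imputed cells and reconstruct the observed cells.
    d_prob = D(x_hat, torch.zeros_like(mask))
    adv_loss = (((d_prob - 1.0) * (1 - mask)) ** 2).sum() / (1 - mask).sum()
    rec_loss = ((mask * (g_out - x_obs)) ** 2).sum() / mask.sum()
    g_loss = adv_loss + 10.0 * rec_loss
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

# Final imputed 3D tensor (learners x questions x attempts): observed cells kept, missing cells filled.
with torch.no_grad():
    imputed = (mask * x_obs + (1 - mask) * G(x_obs, mask)).squeeze(1)

Imputation accuracy for a model of this kind is typically evaluated by masking out a subset of the observed cells, imputing them, and comparing against the held-out ground truth (e.g., via RMSE), which is consistent with the imputation-accuracy comparison against tensor factorization and other GAN-based baselines described in the abstract.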


Notes

  1. AutoTutor Moodle Website: https://sites.autotutor.org/; Adult Literacy and Adult Education Website: https://adulted.autotutor.org/.

  2. ASSISTments Website: https://new.assistments.org/.

  3. MATHia Website: https://www.carnegielearning.com/solutions/math/mathia/.

  4. ASSISTments 2008–2009: https://pslcdatashop.web.cmu.edu/DatasetInfo?datasetId=388.

  5. ASSISTments 2012–2013: https://sites.google.com/site/assistmentsdata/datasets/2012-13-school-data-with-affect?authuser=0.

  6. MATHia 2019–2020: https://pslcdatashop.web.cmu.edu/Project?id=720.


Acknowledgements

We are grateful to Prof. Philip I. Pavlik Jr. from the University of Memphis and Prof. Shaghayegh Sahebi from the University at Albany - SUNY for their invaluable assistance with tensor factorization in the early stages of this research. Additionally, we thank Prof. Arthur C. Graesser, also from the University of Memphis, for his insightful communications, which significantly enriched our understanding and inspired deeper analytical thinking.

Author information

Corresponding author

Correspondence to Liang Zhang.


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, L., Yeasin, M., Lin, J., Havugimana, F., Hu, X. (2025). Generative Adversarial Networks for Imputing Sparse Learning Performance. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15306. Springer, Cham. https://doi.org/10.1007/978-3-031-78172-8_25


  • DOI: https://doi.org/10.1007/978-3-031-78172-8_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-78171-1

  • Online ISBN: 978-3-031-78172-8

  • eBook Packages: Computer Science, Computer Science (R0)
