A machine learning based model for student’s dropout prediction in online training

Education and Information Technologies

Abstract

School dropout is a significant issue in distance learning, and early detection is crucial for addressing it. Our study aims to create a binary classification model that anticipates students’ activity levels based on their current achievements and engagement on a Canadian distance learning platform. Predicting student dropout, a common classification problem in educational data analysis, is addressed using a comprehensive dataset of 49 features ranging from socio-demographic to behavioral data. This dataset provides a unique opportunity to analyze student interactions and success factors in a distance learning environment. We have developed a student profiling system and implemented a predictive approach using XGBoost, selecting the most important features for the prediction process. Our methodology was developed in Python with the widely used scikit-learn package. Alongside XGBoost, logistic regression was employed as part of our combination of strategies to enhance the model’s predictive capabilities. Our approach predicts student dropout with an accuracy of approximately 82% on unseen data from the next academic year.
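
The evaluation protocol described in the abstract (fit a binary classifier on one cohort, then measure accuracy on a later, unseen cohort) can be illustrated with a minimal sketch. This is not the authors' pipeline: the ChallengeU data is not public, so the example generates synthetic students with two hypothetical features (`engagement`, `achievement`) instead of the 49 real ones, and it uses a small pure-Python logistic regression in place of XGBoost and scikit-learn so that it runs without third-party packages. The function names, the data-generating rule, and the hyperparameters are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_cohort(n, rng):
    """Generate n synthetic students as (engagement, achievement) pairs in [0, 1]
    with a binary label (1 = dropout). The labeling rule is an assumption."""
    rows, labels = [], []
    for _ in range(n):
        engagement = rng.random()
        achievement = rng.random()
        # Assumed ground truth: low engagement and low achievement raise the risk.
        p_dropout = sigmoid(6.0 * (1.0 - engagement - achievement))
        rows.append((engagement, achievement))
        labels.append(1 if rng.random() < p_dropout else 0)
    return rows, labels


def fit_logistic(rows, labels, lr=1.0, epochs=500):
    """Fit logistic regression by batch gradient descent on the mean log-loss."""
    w = [0.0] * len(rows[0])
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, rows, threshold=0.5):
    """Return 1 (dropout) where the predicted probability reaches the threshold."""
    return [
        1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= threshold else 0
        for x in rows
    ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


rng = random.Random(0)
train_X, train_y = make_cohort(400, rng)  # stands in for the training year
test_X, test_y = make_cohort(300, rng)    # stands in for the next academic year

w, b = fit_logistic(train_X, train_y)
acc = accuracy(test_y, predict(w, b, test_X))
print(f"held-out accuracy: {acc:.2f}")
```

The point of the sketch is the split: the model never sees the second cohort during fitting, so the reported accuracy estimates performance on future students rather than on the data used for training. Swapping in a gradient-boosted model or a feature-selection step would change the `fit_logistic` and `predict` stages but not this protocol.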

Data Availability Statement

The datasets generated and/or analyzed during this study are not publicly available. This is due to a confidentiality agreement and the fact that the data is hosted exclusively on ChallengeU’s server. The data is proprietary to ChallengeU, thus it is not accessible to anyone outside of the company. This study was conducted within the company’s servers using this non-public data.

Funding

This work was supported by the Ministry of the Economy in Canada.

Author information

Corresponding author

Correspondence to Meriem Zerkouk.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zerkouk, M., Mihoubi, M., Chikhaoui, B. et al. A machine learning based model for student’s dropout prediction in online training. Educ Inf Technol 29, 15793–15812 (2024). https://doi.org/10.1007/s10639-024-12500-w

  • DOI: https://doi.org/10.1007/s10639-024-12500-w
