Abstract
Although using machine learning to predict which students are at risk of failing a course is valuable, how can we identify which characteristics of an individual student contribute to their being At-Risk? By characterising individual At-Risk students, we could advise on specific interventions or on ways to reduce their probability of being At-Risk. We propose local model-agnostic and counterfactual explanations to address this challenge. The local model-agnostic methods LIME and SHAP were critically evaluated in this study; they explain why individual students are classified as At-Risk. Based on these local explanations, counterfactual explanations were generated that suggest how an individual student could reduce their At-Risk probability. Both methods are illustrated on two randomly selected At-Risk students. SHAP was found to be more stable than LIME and is recommended for future use. Counterfactual explanations show promise, but the features must be actionable and causal for this method to work effectively. The entire process should be carried out under the guidance of experienced educators if it is to benefit students.
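To make this workflow concrete, the sketch below shows how local explanations of this kind might be produced with the Python shap and lime libraries. It is a minimal illustration only: the random forest classifier, the feature names, and the synthetic data are assumptions for demonstration, not the study's actual model or dataset.

# Minimal sketch (illustrative, not the study's pipeline): train a classifier on
# hypothetical student features, then explain one student's At-Risk prediction
# locally with SHAP and LIME.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
import shap
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.default_rng(0)
features = ["admission_score", "first_test_mark", "lms_logins", "assignments_submitted"]
X = pd.DataFrame(rng.random((300, len(features))), columns=features)
y = (X["first_test_mark"] + 0.2 * X["assignments_submitted"] < 0.6).astype(int)  # 1 = At-Risk

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
student = X.iloc[[0]]  # one individual student to explain

# SHAP: additive per-feature contributions to this student's At-Risk prediction.
shap_values = shap.TreeExplainer(model).shap_values(student)
print(shap_values)  # structure varies by SHAP version (per-class list vs. single array)

# LIME: a local linear surrogate fitted around this student's feature values.
lime_explainer = LimeTabularExplainer(
    X.values, feature_names=features,
    class_names=["Not At-Risk", "At-Risk"], mode="classification",
)
lime_exp = lime_explainer.explain_instance(
    student.values[0], model.predict_proba, num_features=len(features)
)
print(lime_exp.as_list())  # (feature condition, weight) pairs for the At-Risk class

A counterfactual step would then search for small, actionable changes to these features that flip the prediction away from At-Risk; as the abstract notes, this only helps in practice if the features involved are actionable and causally related to the outcome.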
Notes
The term Coloured refers to a distinct ethnic group in South Africa and is a standard classification used by the South African government (Bloom, 1967).
References
Agrawal, H., & Mavani, H. (2015). Student performance prediction using machine learning. International Journal of Engineering Research and Technology, 4(03), 111–113.
Aguiar, E., Chawla, N. V., Brockman, J., Ambrose, G. A., & Goodrich, V. (2014). Engagement vs performance: Using electronic portfolios to predict first semester engineering student retention. In LAK 2014: Fourth International Conference on Learning Analytics and Knowledge (pp. 103–112). Indianapolis.
Alvarez-Melis, D., & Jaakkola, T. S. (2018). On the robustness of interpretability methods. In Proceedings of the 2018 ICML Workshop on Human Interpretability in Machine Learning.
Anuradha, C., & Velmurugan, T. (2015). A comparative analysis on the evaluation of classification algorithms in the prediction of students performance. Indian Journal of Science and Technology, 8, 1–12.
Baker, R. S. (2015). Stupid tutoring systems, intelligent humans. International Journal of Artificial Intelligence in Education.
Beemer, J., Spoon, K., He, L., Fan, J., & Levine, R. A. (2017). Ensemble learning for estimating individualized treatment effects in student success studies. International Journal of Artificial Intelligence in Education.
Bibault, J.-E., & Xing, L. (2020). Predicting survival in prostate cancer patients with interpretable artificial intelligence. The Lancet.
Binns, R. (2018). Fairness in machine learning: lessons from political philosophy. Proceedings of Machine Learning Research, 81.
Bloom, L. (1967). The Coloured people of South Africa. Phylon (1960-), 28(2), 139–150.
Chiao, V. (2019). Fairness, accountability and transparency: notes on algorithmic decision-making in criminal justice. International Journal of Law in Context, 15.
Conati, C., Porayska-Pomsta, K., & Mavrikis, M. (2018). AI in education needs interpretable machine learning: Lessons from open learner modelling. In 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI).
Ghosh, A., & Kandasamy, D. (2020). Interpretable artificial intelligence: Why and when. American Journal of Roentgenology, 214.
Gilpin, L. H., Testart, C., Fruchter, N., & Adebayo, J. (2019). Explaining Explanations to Society. arXiv:1901.06560.
Green, B., & Chen, Y. (2019). Disparate interactions: An algorithm-in-the-loop analysis of fairness in risk assessments. In FAT* ’19: Conference on Fairness, Accountability, and Transparency (p. 10). Atlanta: ACM. https://doi.org/10.1145/3287560.3287563.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning. Springer.
Kotsiantis, S., Patriarcheas, K., & Xenos, M. (2010). A combinational incremental ensemble of classifiers as a technique for predicting students’ performance in distance education. Knowledge-Based Systems, 23, 529–535.
Kotsiantis, S., Pierrakeas, C., & Pintelas, P. (2004). Predicting students’ performance in distance learning using machine learning techniques. Applied Artificial Intelligence, 18(5).
Lakkaraju, H., Aguiar, E., Shan, C., Miller, D., Bhanpuri, N., Ghani, R., & Addison, K. L. (2015). A machine learning framework to identify students at risk of adverse academic outcomes. KDD’15.
Lee, M. S. A., & Floridi, L. (2020). Algorithmic fairness in mortgage lending: from absolute conditions to relational trade-offs. Minds and Machines.
Louppe, G. (2014). Understanding random forests: From theory to practice. PhD thesis, University of Liège.
Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. 31st Conference on Neural Information Processing Systems (NIPS 2017).
Molnar, C. (2019). Interpretable machine learning: A guide for making black box models explainable. Leanpub.
Mothilal, R. K., Laet, T. D., Broos, T., & Pinxten, M. (2018). Predicting first-year engineering student success: From traditional statistics to machine learning. Proceedings of the 46th SEFI Annual Conference, 46, 322–329.
Nagrecha, S., Dillon, J. Z., & Chawla, N. V. (2017). MOOC dropout prediction: Lessons learned from making pipelines interpretable. Proceedings of the 26th International Conference on World Wide Web Companion.
Pearl, J., & Mackenzie, D. (2018). The book of why. Basic Books.
Quadri, M. N., & Kalyankar, N. V. (2010). Drop out feature of student data for academic performance using decision tree techniques 10(2), 2–5.
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). Model-agnostic interpretability of machine learning. arXiv:1606.05386.
Rosé, C. P., McLaughlin, E. A., Liu, R., & Koedinger, K. R. (2019). Explanatory learner models: Why machine learning (alone) is not the answer. British Journal of Educational Technology, 50.
Shahiri, A. M., Husain, W., & Rashid, N. A. (2015). A review on predicting student’s performance using data mining techniques. Procedia Computer Science, 72, 414–422.
Shapley, L. S. (1953). A value for n-person games. Contributions to the Theory of Games, 69–79.
Strobl, C., Boulesteix, A.-L., Zeileis, A., & Hothorn, T. (2007). Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformatics, 8.
Tian, W. (2019). Predicting and interpreting students performance using supervised learning and Shapley Additive Explanations. Master’s thesis, Arizona State University.
van den Berg, M. N., & Hofman, W. (2005). Student success in university education: a multi-measurement study of the impact of student and faculty factors on study progress. Higher Education, 50, 413–446.
Wachter, S., Mittelstadt, B., & Russell, C. (2018). Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harvard Journal of Law and Technology, 31(2), 841–887.