Abstract
A critical factor in completing a course of study in a virtual learning environment (VLE) is how students behave when interacting with the system. In practice, however, most student behavior datasets have an imbalanced label distribution, and this imbalance significantly affects the performance of machine learning models. This study examines several resampling methods to address the imbalanced data issue: random undersampling (RUS), oversampling with the synthetic minority oversampling technique (SMOTE), and hybrid sampling (SMOTEENN). Several machine learning (ML) classifiers, including Naïve Bayes (NB), Logistic Regression (LR), and Random Forest (RF), are employed to evaluate the effectiveness of the resampling methods. The experimental results indicate that classifier performance improves when a more balanced dataset is used. Furthermore, the Random Forest classifier combined with SMOTEENN resampling achieved the best result among all models.
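To make the resampling idea concrete, the following is a minimal pure-Python sketch of random undersampling and a simplified SMOTE-style interpolation. It is an illustration under stated assumptions, not the implementation used in this study: the function names, the two toy features (clicks per week, active days), and the "pass"/"fail" labels are all hypothetical, and the edited-nearest-neighbours cleaning step that SMOTEENN applies after oversampling is omitted.

```python
import random
from collections import Counter


def group_by_label(X, y):
    """Map each class label to the list of feature rows carrying it."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return groups


def random_undersample(X, y, seed=0):
    """Randomly keep only as many rows per class as the smallest class has."""
    rng = random.Random(seed)
    groups = group_by_label(X, y)
    n_min = min(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        for row in rng.sample(rows, n_min):
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def smote_like_oversample(X, y, k=3, seed=0):
    """Grow each smaller class to the majority size with synthetic rows.

    Each synthetic row lies on the segment between a randomly chosen row
    and one of its k nearest neighbours from the same class.
    """
    rng = random.Random(seed)
    groups = group_by_label(X, y)
    n_max = max(len(rows) for rows in groups.values())
    X_out, y_out = list(X), list(y)
    for label, rows in groups.items():
        if len(rows) < 2:
            continue  # interpolation needs at least one same-class neighbour
        for _ in range(n_max - len(rows)):
            base = rng.choice(rows)
            others = [r for r in rows if r is not base]
            others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
            neighbour = rng.choice(others[:k])
            gap = rng.random()  # position along the segment base -> neighbour
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: [clicks per week, active days]; labels are illustrative.
X = [[40, 5], [35, 4], [50, 6], [45, 5], [38, 4], [42, 6], [48, 5], [36, 4],
     [5, 1], [8, 2], [3, 1]]
y = ["pass"] * 8 + ["fail"] * 3

X_rus, y_rus = random_undersample(X, y)
X_sm, y_sm = smote_like_oversample(X, y)
print(Counter(y), Counter(y_rus), Counter(y_sm))
```

Undersampling discards majority-class rows, whereas the SMOTE-style step keeps every original row and adds interpolated minority rows. In practice one would more likely rely on a maintained library; for example, the imbalanced-learn package provides `RandomUnderSampler`, `SMOTE`, and `SMOTEENN`, although the source does not state which tooling was used.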
Acknowledgements
This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grants No. MOST 109-2221-E-468-009-MY2 and No. MOST 110-2218-E-468-001-MBK; in part by the Ministry of Education under Grant No. I109MD040; and in part by Asia University Hospital under Grant No. 10951020.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, HC. et al. (2022). Learning Performance Prediction with Imbalanced Virtual Learning Environment Students’ Interactions Data. In: Barolli, L., Yim, K., Chen, HC. (eds) Innovative Mobile and Internet Services in Ubiquitous Computing. IMIS 2021. Lecture Notes in Networks and Systems, vol 279. Springer, Cham. https://doi.org/10.1007/978-3-030-79728-7_33
DOI: https://doi.org/10.1007/978-3-030-79728-7_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79727-0
Online ISBN: 978-3-030-79728-7
eBook Packages: Intelligent Technologies and Robotics (R0)