Abstract
Software defect prediction is the process of identifying new defects (bugs) in software modules. A software defect is an error in a computer program caused by incorrect code or incorrect programming logic, and undiscovered defects lead to poor-quality software products. In recent years, software defect prediction has received considerable attention from researchers. Most previous defect detection algorithms suffer from low defect detection ratios. Furthermore, software defect prediction is a very challenging problem because of the highly imbalanced class distribution, in which bug-free modules far outnumber defective ones. In this paper, the software defect prediction problem is formulated as a classification task, and the impact of several ensemble methods on classification effectiveness is examined. The best-performing ensemble classifier is then retrained on datasets over-sampled with the Synthetic Minority Over-sampling Technique (SMOTE) to tackle the imbalanced distribution problem. The proposed hybrid method is evaluated on four software defect datasets. Experimental results demonstrate that the proposed method can effectively enhance defect prediction accuracy.
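The over-sampling step mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, standard-library-only variant of the SMOTE idea, in which each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function names `smote` and `oversample`, the parameters `k` and `seed`, the rule of balancing the classes exactly, and the toy data are all assumptions made for illustration. The original algorithm generates a fixed number of synthetic points per minority sample, whereas this sketch picks base samples at random.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def oversample(X, y, minority_label=1, k=5, seed=0):
    """Balance a two-class dataset by appending synthetic minority rows."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_rows = smote(minority, max(0, n_majority - len(minority)), k, seed)
    return X + new_rows, y + [minority_label] * len(new_rows)


# Hypothetical toy data: two metrics per module, label 1 = defective.
X = [[10, 1], [12, 2], [11, 1], [13, 2], [9, 1], [14, 3], [40, 9], [42, 8]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = oversample(X, y)
print(y_bal.count(0), y_bal.count(1))
```

On this toy dataset the balanced output contains six rows per class, and every synthetic row lies between the two original defective modules. In a hybrid setup of the kind the abstract describes, only the training data would be over-sampled before the chosen ensemble classifier is retrained, so that the evaluation data keeps its original class distribution.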
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Alsawalqah, H., Faris, H., Aljarah, I., Alnemer, L., Alhindawi, N. (2017). Hybrid SMOTE-Ensemble Approach for Software Defect Prediction. In: Silhavy, R., Silhavy, P., Prokopova, Z., Senkerik, R., Kominkova Oplatkova, Z. (eds) Software Engineering Trends and Techniques in Intelligent Systems. CSOC 2017. Advances in Intelligent Systems and Computing, vol 575. Springer, Cham. https://doi.org/10.1007/978-3-319-57141-6_39
DOI: https://doi.org/10.1007/978-3-319-57141-6_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57140-9
Online ISBN: 978-3-319-57141-6
eBook Packages: Engineering, Engineering (R0)