research-article

Software defect prediction based on stacked sparse denoising autoencoders and enhanced extreme learning machine

Authors:

Dandan Zhu

IET Software, Volume 16, Issue 1

Pages 29 - 47

https://doi.org/10.1049/sfw2.12029

Published: 31 May 2021

Abstract

Software defect prediction is an important software quality assurance technique. However, the performance of a prediction model is easily degraded by irrelevant or redundant features in software projects, and the predictive performance of existing models is often not strong enough. To address these two issues, this paper proposes a novel defect prediction model called SSEPG, based on Stacked Sparse Denoising AutoEncoders (SSDAE) and an Extreme Learning Machine (ELM) optimised by Particle Swarm Optimisation (PSO) and the complementary Gravitational Search Algorithm (GSA). The model has two main merits: (1) it employs a deep neural network, SSDAE, to extract new combined features, learning a robust deep semantic feature representation; (2) it integrates the strong exploitation capacity of PSO with the strong exploration capability of GSA to optimise the input weights and hidden-layer biases of the ELM, and uses the discriminative power of the enhanced ELM to predict defective modules. The SSDAE is compared with eleven state‐of‐the‐art feature extraction methods in terms of effectiveness and efficiency, and the SSEPG model is compared with multiple baseline models, comprising five classic defect predictors and three variants, across 24 software defect projects. The experimental results show the superiority of the SSDAE and the SSEPG on six evaluation metrics.
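
To make the role of the ELM's input weights and hidden-layer biases concrete, here is a minimal sketch of a basic ELM and of the kind of fitness function a swarm optimiser could evaluate. It is not the paper's implementation. The toy data, the names `elm_fit`, `elm_predict` and `fitness`, the hidden-layer size, and the plain random sampling that stands in for the PSO and GSA search are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def elm_fit(X, y, W, b):
    """Solve the output weights in closed form for fixed input weights W and biases b."""
    H = sigmoid(X @ W + b)            # hidden-layer output, shape (n_samples, n_hidden)
    return np.linalg.pinv(H) @ y      # Moore-Penrose pseudoinverse solution


def elm_predict(X, W, b, beta):
    """Threshold the ELM output at 0.5 to obtain 0/1 labels."""
    return (sigmoid(X @ W + b) @ beta >= 0.5).astype(int)


def fitness(params, X_tr, y_tr, X_val, y_val, n_hidden):
    """Validation error of one candidate (W, b) flattened into a single vector."""
    d = X_tr.shape[1]
    W = params[: d * n_hidden].reshape(d, n_hidden)
    b = params[d * n_hidden:]
    beta = elm_fit(X_tr, y_tr, W, b)
    return float(np.mean(elm_predict(X_val, W, b, beta) != y_val))


# Synthetic toy data (assumption): 5 features, label depends on the first two.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
X_tr, y_tr, X_val, y_val = X[:150], y[:150], X[150:], y[150:]

n_hidden = 20
dim = X.shape[1] * n_hidden + n_hidden   # all input weights plus all biases

# Stand-in for the swarm: sample candidates at random and keep the best one.
# A PSO or GSA optimiser would instead move these candidates iteratively.
candidates = rng.uniform(-1.0, 1.0, size=(10, dim))
errors = [fitness(p, X_tr, y_tr, X_val, y_val, n_hidden) for p in candidates]
best = candidates[int(np.argmin(errors))]
best_error = min(errors)
```

Only the input weights and biases are searched in this sketch, because the output weights follow from them in closed form. Each candidate is therefore one point in a `dim`-dimensional search space, and `fitness` is the quantity an optimiser would try to minimise.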

Cited By

Jin, C.: A training sample selection method for predicting software defects. Applied Intelligence 53(10), 12015–12031 (2023). DOI: 10.1007/s10489-022-04044-8. Online publication date: 1 May 2023
https://dl.acm.org/doi/10.1007/s10489-022-04044-8

Published In

IET Software Volume 16, Issue 1

February 2022

123 pages

EISSN:1751-8814

DOI:10.1049/sfw2.v16.1

© 2021 The Authors. IET Software published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology.

This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.

Publisher

John Wiley & Sons, Inc.

United States
