A survey on software fault detection based on different prediction approaches

Authors:

Golnoush Abaei,

Ali Selamat

Vietnam Journal of Computer Science, Volume 1, Issue 2

Pages 79 - 95

https://doi.org/10.1007/s40595-013-0008-z

Published: 01 May 2014

Abstract

Quality assurance activities such as testing, verification and validation, fault tolerance and fault prediction are a central concern of software engineering. When a company lacks the budget and time to test an entire application, a project manager can use fault prediction algorithms to identify the parts of the system that are more defect prone. Software engineering has many prediction approaches, including test effort, security and cost prediction. Since most of them do not have a stable model, this paper studies software fault prediction using different machine learning techniques: decision trees, decision tables, random forest, neural networks, Naïve Bayes and classifiers from artificial immune systems (AISs) such as the artificial immune recognition system (AIRS), CLONALG and Immunos. We run our experiments on four public NASA datasets that differ in size and in the amount of defective data. To find the best-performing configuration, we vary parameters such as method-level metrics and two feature selection approaches, principal component analysis (PCA) and correlation-based feature selection (CFS). According to this study, random forest provides the best prediction performance for large datasets, and Naïve Bayes is a reliable algorithm for small datasets even when one of the feature selection techniques is applied. Among the AIS classifiers, Immunos99 performs well when feature selection is applied, and AIRSParallel performs better without any feature selection. Performance is evaluated with three metrics: area under the receiver operating characteristic curve (AUC), probability of detection (PD) and probability of false alarm (PF). Taken together, these three metrics can give reliable prediction criteria.
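The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not spell out formulas, so the code assumes the definitions commonly used in defect prediction: probability of detection as TP / (TP + FN), probability of false alarm as FP / (FP + TN), and area under the ROC curve as the chance that a randomly chosen defective module is scored above a randomly chosen clean one. The function names, the toy labels and scores, and the 0.5 threshold are hypothetical and are not taken from the paper or its NASA datasets.

```python
from typing import Sequence, Tuple


def detection_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (PD, PF) for binary labels, where 1 = defective and 0 = clean.

    Assumed definitions: PD = TP / (TP + FN), PF = FP / (FP + TN).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    pd_rate = tp / (tp + fn) if (tp + fn) else 0.0  # share of defective modules caught
    pf_rate = fp / (fp + tn) if (fp + tn) else 0.0  # share of clean modules flagged
    return pd_rate, pf_rate


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one module of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 3 defective and 5 clean modules with classifier scores.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.7, 0.3, 0.8, 0.4, 0.2, 0.1, 0.05]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold
    pd_rate, pf_rate = detection_rates(y_true, y_pred)
    print(f"PD={pd_rate:.3f} PF={pf_rate:.3f} AUC={auc(y_true, scores):.3f}")
```

Reporting the three values side by side reflects the abstract's point that they are meant to be read together: a classifier can reach a high PD simply by flagging everything, which PF exposes, while AUC summarizes ranking quality independently of any single threshold.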

A survey on software fault detection based on different prediction approaches
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
    2. Machine learning algorithms

Published In

Vietnam Journal of Computer Science Volume 1, Issue 2

May 2014

66 pages

ISSN: 2196-8888

EISSN: 2196-8896

Copyright © 2014 The Author(s).

Publisher

Springer-Verlag

Berlin, Heidelberg


Cited By

Malhotra R, Chawla S, Sharma A (2023) An Artificial Neural Network Model based on Binary Particle Swarm Optimization for enhancing the efficiency of Software Defect Prediction. Proceedings of the 2023 6th International Conference on Software Engineering and Information Management, pp. 92-100. Online publication date: 31-Jan-2023
https://dl.acm.org/doi/10.1145/3584871.3584885
Zivkovic T, Nikolic B, Simic V, Pamucar D, Bacanin N (2023) Software defects prediction by metaheuristics tuned extreme gradient boosting and analysis based on Shapley Additive Explanations. Applied Soft Computing 146:C. Online publication date: 17-Oct-2023
https://dl.acm.org/doi/10.1016/j.asoc.2023.110659
Hussain S, Ibrahim N (2022) Empirical Investigation of role of Meta-learning approaches for the Improvement of Software Development Process via Software Fault Prediction. Proceedings of the 26th International Conference on Evaluation and Assessment in Software Engineering, pp. 413-420. Online publication date: 13-Jun-2022
https://dl.acm.org/doi/10.1145/3530019.3531333
Khatri Y, Singh S (2022) Cross project defect prediction: a comprehensive survey with its SWOT analysis. Innovations in Systems and Software Engineering 18:2, pp. 263-281. Online publication date: 1-Jun-2022
https://dl.acm.org/doi/10.1007/s11334-020-00380-5
Amara D, Fatnassi E, Ben Arfa Rabai L (2021) An Empirical Assessment and Validation of Redundancy Metrics Using Defect Density as Reliability Indicator. Scientific Programming 2021. Online publication date: 1-Jan-2021
https://dl.acm.org/doi/10.1155/2021/8325417
Abid A, Khan M, Iqbal J (2021) A review on fault detection and diagnosis techniques: basics and beyond. Artificial Intelligence Review 54:5, pp. 3639-3664. Online publication date: 1-Jun-2021
https://dl.acm.org/doi/10.1007/s10462-020-09934-2
Haouari A, Souici-Meslati L, Atil F, Meslati D (2020) Empirical comparison and evaluation of Artificial Immune Systems in inter-release software fault prediction. Applied Soft Computing 96:C. Online publication date: 1-Nov-2020
https://dl.acm.org/doi/10.1016/j.asoc.2020.106686
(2019) Threshold-based empirical validation of object-oriented metrics on different severity levels. International Journal of Intelligent Engineering Informatics 7:2-3, pp. 231-262. Online publication date: 1-Jan-2019
https://dl.acm.org/doi/10.5555/3337636.3337642
Cynthia S, Ripon S (2019) Predicting and Classifying Software Faults. Proceedings of the 7th International Conference on Computer and Communications Management, pp. 143-147. Online publication date: 27-Jul-2019
https://dl.acm.org/doi/10.1145/3348445.3348453
Nagaraja A, Kumar T, Bayat O, Aljawarneh S (2018) An Extensive Survey on Intrusion Detection- Past, Present, Future. Proceedings of the Fourth International Conference on Engineering & MIS 2018, pp. 1-9. Online publication date: 19-Jun-2018
https://dl.acm.org/doi/10.1145/3234698.3234743
