
A survey on software fault detection based on different prediction approaches

Published: 01 May 2014 Publication History

Abstract

Quality assurance activities such as testing, verification and validation, fault tolerance, and fault prediction are a central concern of software engineering. When a company lacks the budget or time to test an entire application, a project manager can use fault prediction algorithms to identify the parts of the system that are more defect-prone. Software engineering offers many prediction approaches, including test effort, security, and cost prediction. Because most of them lack a stable model, this paper studies software fault prediction using several machine learning techniques: decision trees, decision tables, random forest, neural networks, Naïve Bayes, and classifiers from artificial immune systems (AISs) such as the artificial immune recognition system, CLONALG, and Immunos. The experiments use four public NASA datasets that differ in size and in the number of defective instances. Method-level metrics and two feature selection approaches, principal component analysis and correlation-based feature selection, are used to determine which configuration performs best. According to this study, random forest provides the best prediction performance for large data sets, and Naïve Bayes is a reliable algorithm for small data sets, even when one of the feature selection techniques is applied. Among the AIS classifiers, Immunos99 performs well when a feature selection technique is applied, and AIRSParallel performs better without any feature selection. Performance is evaluated with three metrics: area under the receiver operating characteristic curve, probability of detection, and probability of false alarm. Taken together, these three metrics can give reliable prediction criteria.
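The three evaluation metrics named in the abstract can be illustrated with a small sketch. The abstract does not spell out their formulas, so the definitions below are an assumption based on common usage in defect prediction: probability of detection PD = TP / (TP + FN), probability of false alarm PF = FP / (FP + TN), and AUC as the probability that a randomly chosen defective module receives a higher score than a randomly chosen clean one. The function names and the toy data are hypothetical and are not taken from the paper or its NASA datasets.

```python
def pd_pf(actual, predicted):
    """Return (PD, PF) for binary labels where 1 = defective and 0 = clean."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # defective, flagged
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # defective, missed
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # clean, flagged
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # clean, not flagged
    pd_value = tp / (tp + fn) if (tp + fn) else 0.0
    pf_value = fp / (fp + tn) if (fp + tn) else 0.0
    return pd_value, pf_value


def auc(actual, scores):
    """AUC via pairwise comparison of scores; ties count as half a win."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one defective and one clean module")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 defective and 5 clean modules with classifier scores.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

pd_value, pf_value = pd_pf(actual, predicted)
print(f"PD={pd_value:.3f} PF={pf_value:.3f} AUC={auc(actual, scores):.3f}")
```

A good predictor in this framing has PD close to 1 and PF close to 0. AUC summarizes the trade-off across all thresholds, which is one reason the three metrics are usually reported together.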


      Published In

      Vietnam Journal of Computer Science  Volume 1, Issue 2
      May 2014
      66 pages
      ISSN: 2196-8888
      EISSN: 2196-8896

      Publisher

      Springer-Verlag

      Berlin, Heidelberg

      Author Tags

      1. AIRSParallel
      2. Artificial immune system
      3. CSCA
      4. Machine learning
      5. Random forest
      6. Software fault prediction

      Qualifiers

      • Article


      Cited By

      • (2023) An Artificial Neural Network Model based on Binary Particle Swarm Optimization for enhancing the efficiency of Software Defect Prediction. Proceedings of the 2023 6th International Conference on Software Engineering and Information Management, pp. 92-100. DOI: 10.1145/3584871.3584885. Online publication date: 31-Jan-2023
      • (2023) Software defects prediction by metaheuristics tuned extreme gradient boosting and analysis based on Shapley Additive Explanations. Applied Soft Computing 146:C. DOI: 10.1016/j.asoc.2023.110659. Online publication date: 17-Oct-2023
      • (2022) Empirical Investigation of role of Meta-learning approaches for the Improvement of Software Development Process via Software Fault Prediction. Proceedings of the 26th International Conference on Evaluation and Assessment in Software Engineering, pp. 413-420. DOI: 10.1145/3530019.3531333. Online publication date: 13-Jun-2022
      • (2022) Cross project defect prediction: a comprehensive survey with its SWOT analysis. Innovations in Systems and Software Engineering 18:2, pp. 263-281. DOI: 10.1007/s11334-020-00380-5. Online publication date: 1-Jun-2022
      • (2021) An Empirical Assessment and Validation of Redundancy Metrics Using Defect Density as Reliability Indicator. Scientific Programming 2021. DOI: 10.1155/2021/8325417. Online publication date: 1-Jan-2021
      • (2021) A review on fault detection and diagnosis techniques: basics and beyond. Artificial Intelligence Review 54:5, pp. 3639-3664. DOI: 10.1007/s10462-020-09934-2. Online publication date: 1-Jun-2021
      • (2020) Empirical comparison and evaluation of Artificial Immune Systems in inter-release software fault prediction. Applied Soft Computing 96:C. DOI: 10.1016/j.asoc.2020.106686. Online publication date: 1-Nov-2020
      • (2019) Threshold-based empirical validation of object-oriented metrics on different severity levels. International Journal of Intelligent Engineering Informatics 7:2-3, pp. 231-262. DOI: 10.5555/3337636.3337642. Online publication date: 1-Jan-2019
      • (2019) Predicting and Classifying Software Faults. Proceedings of the 7th International Conference on Computer and Communications Management, pp. 143-147. DOI: 10.1145/3348445.3348453. Online publication date: 27-Jul-2019
      • (2018) An Extensive Survey on Intrusion Detection- Past, Present, Future. Proceedings of the Fourth International Conference on Engineering & MIS 2018, pp. 1-9. DOI: 10.1145/3234698.3234743. Online publication date: 19-Jun-2018
