
An Experience in the Evaluation of Fault Prediction

  • Conference paper
  • In: Product-Focused Software Process Improvement (PROFES 2023)

Abstract

Background. ROC (Receiver Operating Characteristic) curves are widely used to represent the performance (i.e., degree of correctness) of fault proneness models. AUC, the Area Under the ROC Curve, is a popular performance metric that summarizes the goodness of the predictions represented by a ROC curve in a single number. Alternative techniques have been proposed for evaluating the performance represented by a ROC curve, among them RRA (Ratio of Relevant Areas) and \(\phi \) (also known as the Matthews Correlation Coefficient).
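For reference, the two metrics mentioned above have standard definitions that are not specific to this paper: \(\phi \) is computed from the confusion matrix of a classifier, while AUC is the area under the curve of true positive rate (TPR) plotted against false positive rate (FPR):

\[ \phi = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}, \qquad AUC = \int_0^1 TPR \, d(FPR), \]

where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.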

Objectives. In this paper, we aim to evaluate AUC as a performance metric, also in comparison with alternative proposals.

Method. We carry out an empirical study by replicating a previously published fault prediction study and measuring the performance of the obtained faultiness models using AUC, RRA, and a recently proposed way of relating a specific kind of ROC curve to \(\phi \), based on iso-\(\phi \) ROC curves, i.e., ROC curves with constant \(\phi \). We take into account prevalence, i.e., the proportion of faulty modules in the dataset on which predictions are made.
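To see why prevalence matters, and why iso-\(\phi \) ROC curves are well defined, note that \(\phi \) can be rewritten as a function of a ROC point (FPR, TPR) and the prevalence \(\rho \); this is a standard algebraic consequence of the confusion-matrix definition of \(\phi \), and the paper's own formulation may differ in notation:

\[ \phi = \frac{(TPR - FPR)\,\sqrt{\rho (1-\rho )}}{\sqrt{PP\,(1-PP)}}, \qquad PP = \rho \, TPR + (1-\rho )\, FPR, \]

where PP is the proportion of modules predicted faulty. Fixing \(\phi \) and \(\rho \) thus defines a curve in ROC space (an iso-\(\phi \) curve), and the same ROC point yields different values of \(\phi \) at different prevalences, which helps explain why AUC and \(\phi \) can diverge on imbalanced datasets.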

Results. AUC appears to provide indications that are concordant with \(\phi \) for fairly balanced datasets, while it is much more optimistic than \(\phi \) for highly imbalanced datasets. RRA’s indications appear to be moderately affected by the degree of balance in a dataset. In addition, RRA appears to agree with \(\phi \).
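As a purely illustrative sketch (not taken from the replicated study), the following Python fragment shows how prevalence, AUC, and \(\phi \) can be computed for a set of fault-proneness predictions using scikit-learn; the labels and scores below are made up, and RRA is omitted because, to our knowledge, it has no off-the-shelf library implementation.

```python
# Illustrative example with made-up data (hypothetical model output).
from sklearn.metrics import roc_auc_score, matthews_corrcoef

# Ground truth: 1 = faulty module, 0 = non-faulty module.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
# Fault-proneness probabilities predicted by some hypothetical model.
y_score = [0.10, 0.20, 0.15, 0.30, 0.35, 0.60, 0.40, 0.55, 0.80, 0.70]
# Binary classification obtained with a 0.5 threshold.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

prevalence = sum(y_true) / len(y_true)   # proportion of faulty modules
auc = roc_auc_score(y_true, y_score)     # threshold-independent, uses scores
phi = matthews_corrcoef(y_true, y_pred)  # threshold-dependent, uses labels

print(f"prevalence = {prevalence:.2f}, AUC = {auc:.3f}, phi = {phi:.3f}")
```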

Conclusions. Based on the collected evidence, AUC does not seem to be suitable for evaluating the performance of fault proneness models when used with imbalanced datasets. In these cases, RRA can be a better choice. In any case, more research is needed to generalize these conclusions.

Partly supported by Fondo di Ricerca d’Ateneo dell’Università degli Studi dell’Insubria.



Author information

Corresponding author: Luigi Lavazza.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Lavazza, L., Morasca, S., Rotoloni, G. (2024). An Experience in the Evaluation of Fault Prediction. In: Kadgien, R., Jedlitschka, A., Janes, A., Lenarduzzi, V., Li, X. (eds) Product-Focused Software Process Improvement. PROFES 2023. Lecture Notes in Computer Science, vol 14483. Springer, Cham. https://doi.org/10.1007/978-3-031-49266-2_22


  • DOI: https://doi.org/10.1007/978-3-031-49266-2_22


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-49265-5

  • Online ISBN: 978-3-031-49266-2

