DOI: 10.1145/3661167.3661195

Improving classifier-based effort-aware software defect prediction by reducing ranking errors

Published: 18 June 2024

Abstract

Context: Software defect prediction uses historical data to direct software quality assurance resources toward potentially problematic components. Effort-aware (EA) defect prediction takes cost-effectiveness into account, prioritizing the components most likely to be defective; in other words, it is a ranking problem. However, existing classification-based ranking strategies give limited consideration to ranking errors. Objective: To improve the performance of classifier-based EA ranking methods by focusing on ranking errors. Method: We propose a ranking score calculation strategy called EA-Z, which sets a lower bound on predicted scores to avoid near-zero ranking errors. We investigate four primary EA ranking strategies with 16 classification learners, and conduct experiments comparing EA-Z with the four existing strategies. Results: Experimental results from 72 data sets show that EA-Z is the best ranking score calculation strategy in terms of Recall@20% and Popt when considering all 16 learners. Among individual learners, the imbalanced ensemble learners UBag-svm and UBst-rf achieve the best performance with EA-Z. Conclusion: Our study demonstrates the effectiveness of reducing ranking errors in classifier-based effort-aware defect prediction. We recommend using EA-Z with imbalanced ensemble learning.
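The abstract describes EA-Z only at a high level. The sketch below shows one plausible reading of a lower-bounded, effort-normalized ranking score, together with common formulations of Recall@20% and Popt from the effort-aware literature. The function names, the lower bound z = 0.05, the score form max(p, z) / effort, and the Popt construction are illustrative assumptions, not the paper's actual definitions.

```python
import numpy as np

def ea_z_score(p_defect, effort, z=0.05):
    # Hypothetical reading of EA-Z: clamp the classifier's predicted
    # defect probability at a lower bound z so near-zero predictions
    # cannot collapse the score, then normalize by inspection effort.
    p = np.maximum(np.asarray(p_defect, dtype=float), z)
    return p / np.asarray(effort, dtype=float)

def _alberg_curve(labels, effort, order):
    # x: cumulative fraction of total effort spent; y: cumulative
    # fraction of defective modules found, for a given inspection order.
    e = np.asarray(effort, dtype=float)[order]
    d = np.asarray(labels, dtype=float)[order]
    x = np.concatenate(([0.0], np.cumsum(e) / e.sum()))
    y = np.concatenate(([0.0], np.cumsum(d) / max(d.sum(), 1.0)))
    return x, y

def recall_at_effort(labels, effort, scores, budget=0.20):
    # Recall@20%: defects found when inspecting modules in descending
    # score order until 20% of the total effort is spent.
    order = np.argsort(-np.asarray(scores, dtype=float))
    x, y = _alberg_curve(labels, effort, order)
    return float(np.interp(budget, x, y))

def p_opt(labels, effort, scores):
    # Popt, in one common formulation: 1 minus the normalized area
    # between the optimal inspection curve and the model's curve.
    labels = np.asarray(labels, dtype=float)
    effort = np.asarray(effort, dtype=float)

    def area(order):
        x, y = _alberg_curve(labels, effort, order)
        return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))

    model = area(np.argsort(-np.asarray(scores, dtype=float)))
    density = labels / effort              # defect density per module
    optimal = area(np.argsort(-density))   # densest modules first
    worst = area(np.argsort(density))      # densest modules last
    return 1.0 - (optimal - model) / (optimal - worst)
```

For example, given predicted probabilities p, per-module effort loc (e.g., lines of code), and true labels y, `scores = ea_z_score(p, loc)` followed by `recall_at_effort(y, loc, scores)` and `p_opt(y, loc, scores)` evaluates the ranking under a 20% effort budget.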



Published In

EASE '24: Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering, June 2024, 728 pages. ISBN: 9798400717017. DOI: 10.1145/3661167.

Publisher: Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. Effort-aware
    2. Ranking error
    3. Ranking strategy
    4. Software defect prediction

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    EASE 2024

    Acceptance Rates

    Overall Acceptance Rate 71 of 232 submissions, 31%

