DOI: 10.1145/1370788.1370792
research-article

Comparing negative binomial and recursive partitioning models for fault prediction

Published: 12 May 2008

Abstract

Two different software fault prediction models have been used to predict the N% of the files of a large software system that are likely to contain the largest numbers of faults. We used the same predictor variables in a negative binomial regression model and a recursive partitioning model, and compared their effectiveness on three large industrial software systems. The negative binomial model identified files that contain 76 to 93 percent of the faults, and recursive partitioning identified files that contain 68 to 85 percent.
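The evaluation measure described in the abstract (the share of all faults that fall in the N% of files a model ranks as most fault-prone) can be sketched as below. This is a minimal illustration and not the authors' implementation: the function name `faults_captured`, the choice of N = 20, and the ten-file data set are all hypothetical, and the predicted counts stand in for the output of whichever model (negative binomial regression or recursive partitioning) is being assessed.

```python
import math


def faults_captured(predicted, actual, top_percent):
    """Return the percentage of actual faults found in the top_percent of
    files when files are ranked by predicted fault count (descending)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    total = sum(actual)
    if total == 0:
        return 0.0
    # Number of files to select; round up so at least one file is chosen.
    k = max(1, math.ceil(len(predicted) * top_percent / 100))
    # Rank file indices by predicted fault count, highest first.
    ranked = sorted(range(len(predicted)),
                    key=lambda i: predicted[i], reverse=True)
    captured = sum(actual[i] for i in ranked[:k])
    return 100.0 * captured / total


# Hypothetical data for ten files (assumed for illustration, not from the paper).
predicted = [5.2, 0.1, 3.3, 0.0, 0.4, 1.1, 0.2, 7.9, 0.3, 0.6]
actual = [4, 0, 2, 0, 1, 0, 0, 9, 0, 1]

# Percentage of faults contained in the 20% of files ranked highest.
print(faults_captured(predicted, actual, 20))
```

Computing this percentage for each model's predictions on the same files and the same N gives one figure per model, which is the kind of side-by-side comparison the abstract reports.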

    Published In

    PROMISE '08: Proceedings of the 4th international workshop on Predictor models in software engineering
    May 2008
    108 pages
    ISBN:9781605580364
    DOI:10.1145/1370788

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. empirical study
    2. fault prediction
    3. negative binomial
    4. recursive partition
    5. software testing

    Qualifiers

    • Research-article

    Conference

    ICSE '08
    Acceptance Rates

    PROMISE '08 Paper Acceptance Rate 13 of 16 submissions, 81%;
    Overall Acceptance Rate 98 of 213 submissions, 46%

    Cited By

    • (2019) A Deep Introduction to AI Based Software Defect Prediction (SDP) and its Current Challenges. TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON), pp. 284-290. DOI: 10.1109/TENCON.2019.8929661. Online publication date: Oct-2019
    • (2016) Automated parameter optimization of classification techniques for defect prediction models. Proceedings of the 38th International Conference on Software Engineering, pp. 321-332. DOI: 10.1145/2884781.2884857. Online publication date: 14-May-2016
    • (2012) A Systematic Literature Review on Fault Prediction Performance in Software Engineering. IEEE Transactions on Software Engineering 38:6, pp. 1276-1304. DOI: 10.1109/TSE.2011.103. Online publication date: 1-Nov-2012
    • (2012) Finding focused itemsets from software defect data. 2012 15th International Multitopic Conference (INMIC), pp. 418-423. DOI: 10.1109/INMIC.2012.6511437. Online publication date: Dec-2012
    • (2012) On the use of calling structure information to improve fault prediction. Empirical Software Engineering 17:4-5, pp. 390-423. DOI: 10.1007/s10664-011-9165-9. Online publication date: 1-Aug-2012
    • (2012) A learning-to-rank algorithm for constructing defect prediction models. Proceedings of the 13th international conference on Intelligent Data Engineering and Automated Learning, pp. 167-175. DOI: 10.1007/978-3-642-32639-4_21. Online publication date: 29-Aug-2012
    • (2009) Does calling structure information improve the accuracy of fault prediction? Proceedings of the 2009 6th IEEE International Working Conference on Mining Software Repositories, pp. 61-70. DOI: 10.1109/MSR.2009.5069481. Online publication date: 16-May-2009
