DOI: 10.1145/1868328.1868336

Replication of defect prediction studies: problems, pitfalls and recommendations

Published: 12 September 2010

Abstract

Background: The main goal of the PROMISE repository is to enable reproducible, and thus verifiable or refutable, research. Over time, many data sets have become available, especially for defect prediction problems.
Aims: In this study, we investigate problems and pitfalls that may occur during replication. This information can inform future replication studies and serve as a guideline for researchers reporting novel results.
Method: We replicate two recent defect prediction studies that compare different data sets and learning algorithms, and we report missing information and problems encountered.
Results: Even with access to the original data sets, replicating previous studies may not lead to exactly the same results. The choice of evaluation procedures, performance measures, and presentation has a large influence on reproducibility. Additionally, we show that trivial and random models can be used to identify overly optimistic evaluation measures.
Conclusions: The best way to conduct easily reproducible studies is to share all associated artifacts, e.g., the scripts and programs used. When this is not an option, our results can be used to simplify the replication task for other researchers.
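
A minimal sketch (not from the paper) of the "trivial and random models" check: on a synthetic, imbalanced data set resembling typical defect data, a baseline that predicts every module as defective reaches perfect recall, while a majority-class baseline reaches high accuracy. Any evaluation measure on which such baselines already score well is a candidate for being overly optimistic. The sketch assumes Python with scikit-learn; all data and names are invented for illustration.

    # Illustration only: trivial and random baselines as a sanity check
    # for overly optimistic evaluation measures on imbalanced defect data.
    import numpy as np
    from sklearn.dummy import DummyClassifier
    from sklearn.metrics import accuracy_score, recall_score, matthews_corrcoef

    rng = np.random.default_rng(0)
    n = 1000
    X = rng.normal(size=(n, 5))             # synthetic module metrics
    y = (rng.random(n) < 0.1).astype(int)   # ~10% defective modules (typical imbalance)

    baselines = {
        "trivial: all defective": DummyClassifier(strategy="constant", constant=1),
        "trivial: majority class": DummyClassifier(strategy="most_frequent"),
        "random (stratified)": DummyClassifier(strategy="stratified", random_state=0),
    }

    for name, clf in baselines.items():
        pred = clf.fit(X, y).predict(X)
        print(f"{name:25s} accuracy={accuracy_score(y, pred):.2f}  "
              f"recall={recall_score(y, pred, zero_division=0):.2f}  "
              f"MCC={matthews_corrcoef(y, pred):.2f}")

A model that barely outperforms such baselines under a given measure has not demonstrated much; measures that cannot separate real models from these baselines are the overly optimistic ones the abstract refers to.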



Published In

PROMISE '10: Proceedings of the 6th International Conference on Predictive Models in Software Engineering
September 2010
195 pages
ISBN: 9781450304047
DOI: 10.1145/1868328
  • General Chair: Tim Menzies
  • Program Chair: Gunes Koru
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. defect prediction model
  2. replication

Qualifiers

  • Research-article

Conference

PROMISE '10

Acceptance Rates

PROMISE '10 Paper Acceptance Rate: 19 of 53 submissions (36%)
Overall Acceptance Rate: 98 of 213 submissions (46%)


Cited By

  • Smell-Aware Bug Classification. IEEE Access, 12, 14061-14082 (2024). DOI: 10.1109/ACCESS.2023.3335175
  • Outlier Mining Techniques for Software Defect Prediction. In Software Quality: Higher Software Quality through Zero Waste Development, 41-60 (13 May 2023). DOI: 10.1007/978-3-031-31488-9_3
  • An Empirical Study of Model-Agnostic Techniques for Defect Prediction Models. IEEE Transactions on Software Engineering, 48(1), 166-185 (1 January 2022). DOI: 10.1109/TSE.2020.2982385
  • The Impact of Duplicate Changes on Just-in-Time Defect Prediction. IEEE Transactions on Reliability, 71(3), 1294-1308 (September 2022). DOI: 10.1109/TR.2021.3061618
  • The Impact of Parameters Optimization in Software Prediction Models. In 2022 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 217-224 (August 2022). DOI: 10.1109/SEAA56994.2022.00041
  • Interpretability application of the Just-in-Time software defect prediction model. Journal of Systems and Software, 188(C) (1 June 2022). DOI: 10.1016/j.jss.2022.111245
  • Classifying crowdsourced mobile test reports with image features. Journal of Systems and Software, 184(C) (1 February 2022). DOI: 10.1016/j.jss.2021.111121
  • ST-TLF. Information and Software Technology, 149(C) (1 September 2022). DOI: 10.1016/j.infsof.2022.106939
  • Investigating replication challenges through multiple replications of an experiment. Information and Software Technology, 147(C) (1 July 2022). DOI: 10.1016/j.infsof.2022.106870
  • A Large Scale Study of Long-Time Contributor Prediction for GitHub Projects. IEEE Transactions on Software Engineering, 47(6), 1277-1298 (1 June 2021). DOI: 10.1109/TSE.2019.2918536
