Validity and reliability of evaluation procedures in comparative studies of effort prediction models

Published: 01 February 2012

Abstract

In previous studies we have reported our findings and concerns regarding the reliability and validity of the evaluation procedures used in comparative studies of competing effort prediction models. In particular, we have raised concerns about the use of accuracy statistics to rank and select models, a concern strengthened by the observed lack of consistent findings across studies. This study offers further insight into the causes of conclusion instability by elaborating on our previous findings concerning the reliability and validity of the evaluation procedures. We show that model selection based on the accuracy statistics MMRE, MMER, MBRE, and MIBRE contributes to conclusion instability as well as to the selection of inferior models. We argue, and show, that the evaluation procedure must include an assessment of whether the functional form of the prediction model makes sense, in order to better prevent the selection of inferior models.
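For readers unfamiliar with the four accuracy statistics named in the abstract, the sketch below computes them from paired actual and predicted effort values, using their standard definitions from the effort-estimation literature (MRE relative to the actual, MER relative to the estimate, BRE relative to the smaller of the two, and its inverted variant relative to the larger). The function names and toy data are illustrative, not taken from the paper.

```python
# Illustrative sketch of the four accuracy statistics (standard definitions;
# not code from the paper). Each statistic averages a per-project relative
# error over all projects.

def mmre(actual, predicted):
    # Mean Magnitude of Relative Error: |y - yhat| / y
    return sum(abs(y - p) / y for y, p in zip(actual, predicted)) / len(actual)

def mmer(actual, predicted):
    # Mean Magnitude of Error Relative to the estimate: |y - yhat| / yhat
    return sum(abs(y - p) / p for y, p in zip(actual, predicted)) / len(actual)

def mbre(actual, predicted):
    # Mean Balanced Relative Error: |y - yhat| / min(y, yhat)
    return sum(abs(y - p) / min(y, p) for y, p in zip(actual, predicted)) / len(actual)

def mibre(actual, predicted):
    # Mean Inverted Balanced Relative Error: |y - yhat| / max(y, yhat)
    return sum(abs(y - p) / max(y, p) for y, p in zip(actual, predicted)) / len(actual)

# Toy example: one project overestimated, one underestimated.
actual = [100.0, 200.0]
predicted = [150.0, 100.0]
print(mmre(actual, predicted))   # 0.5   ((50/100 + 100/200) / 2)
print(mmer(actual, predicted))   # ~0.667 ((50/150 + 100/100) / 2)
```

Note how MMRE and MMER rank the same predictions differently (0.5 vs. roughly 0.667 here): the statistics penalize over- and underestimation asymmetrically, which is one reason model rankings based on them can be unstable.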



Published In

Empirical Software Engineering, Volume 17, Issue 1-2, February 2012, 127 pages

Publisher

Kluwer Academic Publishers

United States


Author Tags

  1. Comparative studies
  2. Effort prediction
  3. Evaluation criteria
  4. MMRE
  5. Mean magnitude of relative error
  6. Software cost estimation

Qualifiers

  • Article

Cited By

  • (2025) Ensembling Harmony Search Algorithm with case-based reasoning for software development effort estimation. Cluster Computing 28(2). doi:10.1007/s10586-024-04858-w
  • (2024) TSoptEE: two-stage optimization technique for software development effort estimation. Cluster Computing 27(7):8889-8908. doi:10.1007/s10586-024-04418-2
  • (2023) SEGRESS: Software Engineering Guidelines for REporting Secondary Studies. IEEE Transactions on Software Engineering 49(3):1273-1298. doi:10.1109/TSE.2022.3174092
  • (2020) Data-driven benchmarking in software development effort estimation. Journal of Software: Evolution and Process 32(9). doi:10.1002/smr.2258
  • (2019) Software Development Effort Estimation Using Regression Fuzzy Models. Computational Intelligence and Neuroscience 2019. doi:10.1155/2019/8367214
  • (2019) A systematic literature review of software effort prediction using machine learning methods. Journal of Software: Evolution and Process 31(10). doi:10.1002/smr.2211
  • (2018) Duplex output software effort estimation model with self-guided interpretation. Information and Software Technology 94:1-13. doi:10.1016/j.infsof.2017.09.010
  • (2018) The state-of-the-art in software development effort estimation. Journal of Software: Evolution and Process 30(12). doi:10.1002/smr.1983
  • (2017) Using bad learners to find good configurations. Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, 257-267. doi:10.1145/3106237.3106238
  • (2017) On the Evaluation of Effort Estimation Models. Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering, 41-50. doi:10.1145/3084226.3084260
