DOI: 10.1145/2884781.2884830
research-article
Open access

Multi-objective software effort estimation

Published: 14 May 2016
    Abstract

    We introduce a bi-objective effort estimation algorithm that combines Confidence Interval Analysis and assessment of Mean Absolute Error. We evaluate our proposed algorithm against three alternative formulations, baseline comparators, and current state-of-the-art effort estimators, applied to five real-world datasets from the PROMISE repository involving 724 software projects in total. The results reveal that our algorithm outperforms the baseline, the state-of-the-art estimators, and all three alternative formulations, statistically significantly (p < 0.001) and with large effect size (Â₁₂ ≥ 0.9), on all five datasets. We also provide evidence that our algorithm establishes a new state of the art, which lies within currently claimed industrial human-expert-based thresholds, thereby demonstrating that our findings have actionable conclusions for practicing software engineers.
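The two quantities the abstract reports on, Mean Absolute Error and the Â₁₂ effect size, can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of both measures and is not taken from the paper; the function names and the sample numbers in the usage note are hypothetical. Note that Â₁₂ as written here measures how often values from the first sample exceed values from the second, so when comparing error values (where lower is better) the orientation of the arguments has to be chosen accordingly.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |actual - predicted| over all projects (standard MAE definition)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def vargha_delaney_a12(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Estimated probability that a value drawn from xs exceeds one drawn from ys.

    Ties count as half a win. A result of 0.5 means no difference between
    the two samples; values near 0 or 1 indicate a large effect.
    """
    if len(xs) == 0 or len(ys) == 0:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    ties = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * ties) / (len(xs) * len(ys))
```

As a usage illustration with made-up effort values, `mean_absolute_error([100, 200, 300], [110, 190, 330])` averages the absolute errors 10, 10 and 30, and `vargha_delaney_a12([1, 2], [1, 2])` returns 0.5 because the two samples are identical.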

    Cited By

    • (2024) Learning from Very Little Data: On the Value of Landscape Analysis for Predicting Software Project Health. ACM Transactions on Software Engineering and Methodology 33:3 (1-22). DOI: 10.1145/3630252. Online publication date: 14-Mar-2024
    • (2024) Fine-SE: Integrating Semantic Features and Expert Features for Software Effort Estimation. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-12). DOI: 10.1145/3597503.3623349. Online publication date: 20-May-2024
    • (2024) Agile Effort Estimation: Have We Solved the Problem Yet? Insights From the Replication of the GPT2SP Study. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (1034-1041). DOI: 10.1109/SANER60148.2024.00111. Online publication date: 12-Mar-2024

    Information

    Published In

    ICSE '16: Proceedings of the 38th International Conference on Software Engineering
    May 2016
    1235 pages
    ISBN:9781450339001
    DOI:10.1145/2884781
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. confidence interval
    2. estimates uncertainty
    3. multi-objective evolutionary algorithm
    4. software effort estimation

    Qualifiers

    • Research-article

    Conference

    ICSE '16
    Acceptance Rates

    Overall Acceptance Rate 276 of 1,856 submissions, 15%

    Article Metrics

    • Downloads (Last 12 months): 168
    • Downloads (Last 6 weeks): 16
    Reflects downloads up to 10 Aug 2024

    Citations

    • (2024) On The Effectiveness of One-Class Support Vector Machine in Different Defect Prediction Scenarios. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (535-545). DOI: 10.1109/SANER60148.2024.00061. Online publication date: 12-Mar-2024
    • (2024) Computational intelligence for estimating software development effort: a systematic mapping study. Iran Journal of Computer Science. DOI: 10.1007/s42044-024-00178-9. Online publication date: 9-Apr-2024
    • (2024) Micro Frontend Based Performance Improvement and Prediction for Microservices Using Machine Learning. Journal of Grid Computing 22:2. DOI: 10.1007/s10723-024-09760-8. Online publication date: 16-Apr-2024
    • (2024) Search-based Automatic Repair for Fairness and Accuracy in Decision-making Software. Empirical Software Engineering 29:1. DOI: 10.1007/s10664-023-10419-3. Online publication date: 3-Jan-2024
    • (2024) Software Effort Estimation Using Deep Learning: A Gentle Review. Artificial Intelligence and Sustainable Computing (351-364). DOI: 10.1007/978-981-97-0327-2_26. Online publication date: 24-Apr-2024
    • (2024) Regression test prioritization leveraging source code similarity with tree kernels. Journal of Software: Evolution and Process. DOI: 10.1002/smr.2653. Online publication date: 15-Feb-2024
    • (2023) Dynamic Prediction of Delays in Software Projects using Delay Patterns and Bayesian Modeling. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (1012-1023). DOI: 10.1145/3611643.3616328. Online publication date: 30-Nov-2023