Re-estimating software effort using prior phase efforts and data mining techniques

Published: 01 September 2018
    Abstract

    Software effort estimation plays an important role in software project management. An accurate estimate helps reduce cost overruns and the risk of eventual project failure. Unfortunately, many existing estimation techniques rely on the total project effort, which is often determined from the project life cycle. As the project moves on, the course of action deviates from what was originally planned, despite close monitoring and control. This calls for re-estimating software effort so as to improve project operating costs and budgeting. Recent research explores phase-level estimation, which uses known information from prior development phases to predict the effort of the next phase with different learning techniques. This study investigates the influence of preprocessing prior-phase data on the learning techniques used to re-estimate the effort of the next phase. The proposed re-estimation approach preprocesses prior-phase effort by means of statistical techniques to select a set of input features for learning, which in turn are used to generate the estimation models. These models are then used to re-estimate next-phase effort through four processing steps, namely data transformation, outlier detection, feature selection, and learning. An empirical study is conducted on 440 estimation models generated from combinations of 5 data transformation, 5 outlier detection, 5 feature selection, and 5 learning techniques. The experimental results show that suitable preprocessing is significantly useful for building learning models that boost re-estimation accuracy. However, no single learning technique outperforms the others over all phases. The proposed re-estimation approach yields more accurate estimates than the proportion-based estimation approach. It is envisioned that the proposed re-estimation approach can help researchers and project managers re-estimate software effort so as to finish the project on time and within the allotted budget.
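
The four processing steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say which specific techniques the study evaluates, so every choice below is an assumption made for illustration only: a natural-log transformation, a median-absolute-deviation outlier filter, Pearson-correlation feature selection, and a one-variable least-squares learner. All function names and the toy effort data are hypothetical.

```python
import math
from statistics import mean, median


def log_transform(rows):
    """Step 1, data transformation: natural log of every effort value (assumed choice)."""
    return [[math.log(v) for v in row] for row in rows]


def drop_outliers(rows, target_idx, k=3.0):
    """Step 2, outlier detection: drop rows whose target is more than k MADs from the median."""
    targets = [row[target_idx] for row in rows]
    med = median(targets)
    mad = median([abs(t - med) for t in targets])
    if mad == 0:
        return rows  # no spread to judge against, so keep everything
    return [row for row in rows if abs(row[target_idx] - med) <= k * mad]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_feature(rows, target_idx):
    """Step 3, feature selection: the prior-phase column most correlated with the target."""
    ys = [row[target_idx] for row in rows]
    return max(range(target_idx),
               key=lambda j: abs(pearson([row[j] for row in rows], ys)))


def fit_line(xs, ys):
    """Step 4, learning: ordinary least squares on one feature (assumes xs is not constant)."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def reestimate(history, prior_efforts):
    """Predict the next phase's effort from the efforts of the phases completed so far."""
    target_idx = len(prior_efforts)  # column index of the phase to predict
    rows = log_transform([row[: target_idx + 1] for row in history])
    rows = drop_outliers(rows, target_idx)
    j = select_feature(rows, target_idx)
    slope, intercept = fit_line([r[j] for r in rows], [r[target_idx] for r in rows])
    # Apply the same transformation to the new input, then invert it on the output.
    return math.exp(intercept + slope * math.log(prior_efforts[j]))


# Made-up per-phase efforts (person-hours) for five finished projects.
history = [
    [12, 20, 40],
    [30, 35, 70],
    [8, 50, 100],
    [25, 80, 160],
    [40, 15, 30],
]

# A running project has finished two phases; re-estimate the third.
estimate = reestimate(history, [10, 50])
```

In this toy data the third-phase effort is exactly twice the second-phase effort, so the sketch selects the second phase as its input feature and the estimate comes out close to 100 person-hours. Swapping any one step for a different technique while keeping the others fixed is the kind of combination the study compares.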

    Cited By

    • (2021) Software Testing Effort Estimation and Related Problems. ACM Computing Surveys 54:3 (1-38). DOI: 10.1145/3442694. Online publication date: 17-Apr-2021

    Published In

    Innovations in Systems and Software Engineering  Volume 14, Issue 3
    September 2018
    87 pages

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Author Tags

    1. Data transformation
    2. Feature selection
    3. Learning
    4. Outlier detection
    5. Prior phase effort
    6. Re-estimating software effort

    Qualifiers

    • Article
