DOI: 10.1145/2639490.2639510
Research article · Open access

The potential benefit of relevance vector machine to software effort estimation

Published: 17 September 2014

Abstract

Software effort estimation (SEE) with predictive models faces three key challenges: (1) to support decision-making, software managers should have access not only to the effort estimate given by the predictive model, but also to how confident the model is in estimating a given project and how likely other effort values are to be the real effort required to develop that project; (2) SEE data are likely to contain noise, because humans participate in data collection, and this noise can hinder predictions if not catered for; and (3) data collection is expensive, and guidelines on when new data need to be collected would help reduce its cost. Even though SEE has been studied for decades and many predictors have been proposed, few methods address these issues. In this work, we show that the relevance vector machine (RVM) is a promising predictive method for addressing these three challenges. More specifically, it explicitly handles noise, provides probabilistic predictions of effort, and can be used to identify when the required effort of new projects should be collected so that they can serve as training examples. With that in mind, this work provides a first step in exploiting the RVM's potential for SEE by validating both its point predictions and its prediction intervals. It then explains in detail future directions in which RVMs can be further exploited to address the challenges above. Our systematic experiments show that the RVM is very competitive with state-of-the-art SEE approaches, usually ranking first or second in 7 of the 11 data sets in terms of mean absolute error. We also demonstrate how the RVM can be used to judge the amount of noise present in the data. In summary, the RVM is a very promising predictor for SEE and should be further exploited.
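As a rough illustration of the kind of probabilistic output described above, the sketch below is a minimal, hypothetical example (not the authors' implementation or data): it approximates an RVM by fitting scikit-learn's ARDRegression, a sparse Bayesian linear model closely related to the RVM, on RBF kernel features of synthetic project data, and reads off point predictions, approximate 95% prediction intervals, and an estimated noise level.

```python
# Minimal illustrative sketch (synthetic data, not the paper's setup):
# an RVM-style sparse Bayesian regressor giving point predictions,
# prediction intervals, and an estimate of the noise in the targets.
import numpy as np
from sklearn.linear_model import ARDRegression
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)

# Hypothetical SEE-style data: project features X, required effort y.
X_train = rng.uniform(0, 10, size=(40, 3))
y_train = 50 + 20 * X_train[:, 0] + rng.normal(0, 10, size=40)  # noisy effort
X_test = rng.uniform(0, 10, size=(5, 3))

# RVM-style design matrix: one RBF basis function centred on each training project.
gamma = 0.1
Phi_train = rbf_kernel(X_train, X_train, gamma=gamma)
Phi_test = rbf_kernel(X_test, X_train, gamma=gamma)

# Sparse Bayesian linear regression over the kernel basis (the ARD prior prunes
# irrelevant basis functions, analogous to the RVM's relevance vectors).
model = ARDRegression()
model.fit(Phi_train, y_train)

# Point predictions with predictive standard deviations (Gaussian predictive density).
mean, std = model.predict(Phi_test, return_std=True)

# Approximate 95% prediction intervals from the Gaussian predictive distribution.
lower, upper = mean - 1.96 * std, mean + 1.96 * std
for m, lo, hi in zip(mean, lower, upper):
    print(f"estimated effort: {m:7.1f}   95% PI: [{lo:7.1f}, {hi:7.1f}]")

# The learned noise precision gives a handle on how noisy the effort data are.
print(f"estimated noise std: {1.0 / np.sqrt(model.alpha_):.1f}")
```

Wide intervals or a large estimated noise standard deviation would signal exactly the kind of uncertainty and data noise that, as argued above, managers should see alongside a point estimate.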




Published In

PROMISE '14: Proceedings of the 10th International Conference on Predictive Models in Software Engineering
September 2014
98 pages
ISBN:9781450328982
DOI:10.1145/2639490
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 September 2014


Author Tags

  1. data collection guidance
  2. effort noise
  3. machine learning
  4. prediction interval
  5. relevance vector machine
  6. software effort estimation

Qualifiers

  • Research-article


Conference

PROMISE '14

Acceptance Rates

PROMISE '14 paper acceptance rate: 9 of 21 submissions (43%)
Overall acceptance rate: 98 of 213 submissions (46%)


Article Metrics

  • Downloads (last 12 months): 60
  • Downloads (last 6 weeks): 14

Reflects downloads up to 25 Jan 2025

Cited By

  • Software Cost and Effort Estimation: Current Approaches and Future Trends. IEEE Access, 11:99268-99288, 2023. DOI: 10.1109/ACCESS.2023.3312716
  • Artificial Intelligence in Software Project Management. In Optimising the Software Development Process with Artificial Intelligence, pages 19-65, 20 Jul 2023. DOI: 10.1007/978-981-19-9948-2_2
  • Multi-Objective Software Effort Estimation: A Replication Study. IEEE Transactions on Software Engineering, 48(8):3185-3205, 1 Aug 2022. DOI: 10.1109/TSE.2021.3083360
  • Software Project Management Using Machine Learning Technique - A Review. Applied Sciences, 11(11):5183, 2 Jun 2021. DOI: 10.3390/app11115183
  • Influence of Outliers on Estimation Accuracy of Software Development Effort. IEICE Transactions on Information and Systems, E104.D(1):91-105, 1 Jan 2021. DOI: 10.1587/transinf.2020MPP0005
  • Comparative study of random search hyper-parameter tuning for software effort estimation. In Proceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering, pages 21-29, 19 Aug 2021. DOI: 10.1145/3475960.3475986
  • Hyper-Parameter Tuning of Classification and Regression Trees for Software Effort Estimation. In Trends and Applications in Information Systems and Technologies, pages 589-598, 29 Mar 2021. DOI: 10.1007/978-3-030-72660-7_56
  • Evaluating hyper-parameter tuning using random search in support vector machines for software effort estimation. In Proceedings of the 16th ACM International Conference on Predictive Models and Data Analytics in Software Engineering, pages 31-40, 8 Nov 2020. DOI: 10.1145/3416508.3417121
  • Software Project Management Using Machine Learning Technique - A Review. In 2020 8th International Conference on Information Technology and Multimedia (ICIMU), pages 363-370, 24 Aug 2020. DOI: 10.1109/ICIMU49871.2020.9243543
  • Software Effort Interval Prediction via Bayesian Inference and Synthetic Bootstrap Resampling. ACM Transactions on Software Engineering and Methodology, 28(1):1-46, 9 Jan 2019. DOI: 10.1145/3295700
