DOI: 10.1145/3416508.3417118

An exploratory study on applicability of cross project defect prediction approaches to cross-company effort estimation

Published: 08 November 2020

Abstract

BACKGROUND: Research on software effort estimation has been active for decades, especially on the development of effort estimation models. Such models require a dataset collected from completed projects that are similar to the project to be estimated. This similarity suffers from dataset shift, which has made cross-company software effort estimation (CCSEE) an attractive research topic. A recent study on the dataset shift problem examined the applicability and effectiveness of cross-project defect prediction (CPDP) approaches, but the number of approaches it examined was too small to support a conclusion. AIMS: To investigate which characteristics make CPDP approaches applicable and effective for the dataset shift problem in effort estimation. METHOD: We first reviewed the characteristics of 24 CPDP approaches to identify applicable ones. Next, we investigated their effect on effort estimation performance across ten dataset configurations. RESULTS: 16 of the 24 CPDP approaches implemented in the CrossPare framework were found to be applicable to CCSEE. However, only one approach improved effort estimation performance; most of the others degraded it and were thus harmful. CONCLUSIONS: Most of the CPDP approaches we examined were unhelpful for CCSEE.
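To make concrete what "applying a CPDP approach to CCSEE" involves, the sketch below adapts one representative family of CPDP techniques, a nearest-neighbour relevancy filter in the style of Turhan et al., to cross-company effort data. It is a minimal illustration under stated assumptions, not the paper's implementation: the synthetic data, the choice of k = 10, and the plain least-squares estimator are all placeholders for a real CCSEE pipeline.

```python
import numpy as np

def nn_filter(cc_X, tgt_X, k=10):
    """Relevancy filter in the spirit of CPDP NN-filtering:
    keep the union of the k nearest cross-company projects
    (Euclidean distance on z-scored features) for each target project."""
    mu = cc_X.mean(axis=0)
    sigma = cc_X.std(axis=0) + 1e-12
    cc_z = (cc_X - mu) / sigma
    tgt_z = (tgt_X - mu) / sigma
    keep = set()
    for t in tgt_z:
        d = np.linalg.norm(cc_z - t, axis=1)
        keep.update(np.argsort(d)[:k].tolist())
    return sorted(keep)

def fit_predict(train_X, train_y, test_X):
    """Ordinary least squares with an intercept, via numpy."""
    A = np.c_[np.ones(len(train_X)), train_X]
    coef, *_ = np.linalg.lstsq(A, train_y, rcond=None)
    return np.c_[np.ones(len(test_X)), test_X] @ coef

# Synthetic stand-ins: a cross-company pool and a (shifted) target company.
rng = np.random.default_rng(0)
cc_X = rng.normal(size=(200, 4))                  # cross-company features
cc_y = 3.0 * cc_X[:, 0] + rng.normal(size=200)    # cross-company effort
tgt_X = rng.normal(loc=0.5, size=(20, 4))         # target features (dataset shift)
tgt_y = 3.0 * tgt_X[:, 0] + rng.normal(size=20)

mae_all = np.mean(np.abs(fit_predict(cc_X, cc_y, tgt_X) - tgt_y))
idx = nn_filter(cc_X, tgt_X, k=10)
mae_filtered = np.mean(np.abs(fit_predict(cc_X[idx], cc_y[idx], tgt_X) - tgt_y))
print(f"MAE, all cross-company data:    {mae_all:.3f}")
print(f"MAE, NN-filtered subset (k=10): {mae_filtered:.3f}")
```

Whether such transferred filters actually help is exactly the empirical question here; per the results above, only one of the sixteen applicable approaches improved estimation performance.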


Cited By

  • (2022) An extended study on applicability and performance of homogeneous cross-project defect prediction approaches under homogeneous cross-company effort estimation situation. Empirical Software Engineering 27(2). DOI: 10.1007/s10664-021-10103-4. Online publication date: 1 March 2022.


      Published In

      PROMISE 2020: Proceedings of the 16th ACM International Conference on Predictive Models and Data Analytics in Software Engineering
      November 2020
      80 pages
      ISBN:9781450381277
      DOI:10.1145/3416508

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. cross-company effort estimation
      2. cross-project defect prediction
      3. empirical evaluation

      Qualifiers

      • Research-article

Conference

PROMISE '20

      Acceptance Rates

      Overall Acceptance Rate 98 of 213 submissions, 46%
