DOI: 10.1145/3106237.3106257

Revisiting unsupervised learning for defect prediction

Published: 21 August 2017

Abstract

Collecting quality data from software projects can be time-consuming and expensive. Hence, some researchers explore "unsupervised" approaches to quality prediction that do not require labelled data. An alternative technique is to use "supervised" approaches that learn models from project data labelled with, say, "defective" or "not-defective". Most researchers use these supervised models since, it is argued, they can exploit more knowledge of the projects.
At FSE-16, Yang et al. reported startling results where unsupervised defect predictors outperformed supervised predictors for effort-aware just-in-time defect prediction. If confirmed, these results would lead to a dramatic simplification of a seemingly complex task (data mining) that is widely explored in the software engineering literature.
This paper repeats and refutes those results as follows. (1) There is much variability in the efficacy of the Yang et al. predictors, so even with their approach, some supervised data is required to prune weaker predictors away. (2) Their findings were grouped across N projects. When we repeat their analysis on a project-by-project basis, supervised predictors are seen to work better.
Even though this paper rejects the specific conclusions of Yang et al., we still endorse their general goal. In our experiments, supervised predictors did not perform outstandingly better than unsupervised ones for effort-aware just-in-time defect prediction. Hence, there may indeed be some combination of unsupervised learners that achieves performance comparable to supervised ones. We therefore encourage others to work in this promising area.
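
To make the two pipelines concrete, here is a minimal sketch, on synthetic data, of the kind of comparison discussed above: a Yang et al.-style unsupervised predictor that simply ranks code changes by one change metric (no labels needed) against a supervised logistic-regression model whose predictions are ranked by risk per unit effort. The metric names (churn, nf), the risk-per-effort ranking rule, and the recall-at-20%-effort score are illustrative assumptions, not the exact setup of either study.

```python
# Sketch only: contrasts an unsupervised metric ranking with a supervised,
# effort-aware ranking for just-in-time defect prediction on synthetic changes.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
churn = rng.lognormal(mean=3.0, sigma=1.0, size=n)   # lines changed per commit (effort proxy)
nf = rng.poisson(lam=2, size=n) + 1                  # files touched per commit
# Synthetic "defective" labels, loosely tied to churn and files touched.
buggy = (rng.random(n) < 1.0 / (1.0 + np.exp(-(0.002 * churn + 0.3 * nf - 2.0)))).astype(int)

def recall_at_20pct_effort(order, effort, labels):
    """Fraction of defective changes found after inspecting 20% of total effort."""
    effort, labels = effort[order], labels[order]
    cutoff = np.searchsorted(np.cumsum(effort), 0.2 * effort.sum())
    return labels[:cutoff].sum() / max(labels.sum(), 1)

# Unsupervised (Yang et al. style): inspect changes in ascending order of one
# metric, i.e. descending order of 1/metric; no defect labels are used.
unsup_order = np.argsort(churn)

# Supervised: learn P(defective) from labelled changes, then rank by predicted
# risk divided by effort (one common effort-aware ranking choice).
X = np.column_stack([churn, nf])
risk = LogisticRegression(max_iter=1000).fit(X, buggy).predict_proba(X)[:, 1]
sup_order = np.argsort(-(risk / churn))

print("unsupervised recall@20% effort:", round(recall_at_20pct_effort(unsup_order, churn, buggy), 3))
print("supervised   recall@20% effort:", round(recall_at_20pct_effort(sup_order, churn, buggy), 3))
```

A real replication would train and test on separate, time-ordered data and report the paper's actual evaluation measures (e.g., Popt and recall at 20% effort); the sketch only shows the shape of the two pipelines being compared.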

References

[1]
Amritanshu Agrawal and Tim Menzies. 2017. "Better Data" is Better than "Better Data Miners" (Benefits of Tuning SMOTE for Defect Prediction). CoRR abs/1705.03697 (2017). http://arxiv.org/abs/1705.03697
[2]
Fumio Akiyama. 1971. An example of software system debugging. In IFIP Congress (1), Vol. 71. 353–359.
[3]
Erik Arisholm and Lionel C Briand. 2006. Predicting fault-prone components in a Java legacy system. In Proceedings of the 2006 ACM/IEEE International Symposium on Empirical Software Engineering. ACM, 8–17.
[4]
Yoav Benjamini and Yosef Hochberg. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological) (1995), 289–300.
[5]
Shyam R Chidamber and Chris F Kemerer. 1994. A metrics suite for object oriented design. IEEE Transactions on Software Engineering 20, 6 (1994), 476–493.
[6]
Marco D’Ambros, Michele Lanza, and Romain Robbes. 2010. An extensive comparison of bug prediction approaches. In 2010 7th IEEE Working Conference on Mining Software Repositories. IEEE, 31–41.
[7]
Wei Fu, Tim Menzies, and Xipeng Shen. 2016. Tuning for software analytics: Is it really necessary? Information and Software Technology 76 (2016), 135–146.
[8]
Wei Fu, Vivek Nair, and Tim Menzies. 2016. Why is differential evolution better than grid search for tuning defect predictors? arXiv preprint arXiv:1609.02613 (2016).
[9]
Takafumi Fukushima, Yasutaka Kamei, Shane McIntosh, Kazuhiro Yamashita, and Naoyasu Ubayashi. 2014. An empirical study of just-in-time defect prediction using cross-project models. In Proceedings of the 11th Working Conference on Mining Software Repositories. ACM, 172–181.
[10]
Todd L Graves, Alan F Karr, James S Marron, and Harvey Siy. 2000. Predicting fault incidence using software change history. IEEE Transactions on Software Engineering 26, 7 (2000), 653–661.
[11]
Philip J Guo, Thomas Zimmermann, Nachiappan Nagappan, and Brendan Murphy. 2010. Characterizing and predicting which bugs get fixed: an empirical study of Microsoft Windows. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, Vol. 1. ACM, 495–504.
[12]
Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H Witten. 2009. The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter 11, 1 (2009), 10–18.
[13]
Tracy Hall, Sarah Beecham, David Bowes, David Gray, and Steve Counsell. 2012. A systematic literature review on fault prediction performance in software engineering. IEEE Transactions on Software Engineering 38, 6 (2012), 1276–1304.
[14]
Maurice Howard Halstead. 1977. Elements of software science. Vol. 7. Elsevier, New York.
[15]
Maggie Hamill and Katerina Goseva-Popstojanova. 2009. Common trends in software fault and failure data. IEEE Transactions on Software Engineering 35, 4 (2009), 484–496.
[16]
Ahmed E Hassan. 2009. Predicting faults using the complexity of code changes. In Proceedings of the 31st International Conference on Software Engineering. IEEE, 78–88.
[17]
Qiao Huang, Xin Xia, and David Lo. 2017. Supervised vs unsupervised models: a holistic look at effort-aware just-in-time defect prediction. In 2017 IEEE International Conference on Software Maintenance and Evolution. IEEE.
[18]
Tian Jiang, Lin Tan, and Sunghun Kim. 2013. Personalized defect prediction. In Proceedings of the 28th IEEE/ACM International Conference on Automated Software Engineering. IEEE, 279–289.
[19]
Xiao-Yuan Jing, Shi Ying, Zhi-Wu Zhang, Shan-Shan Wu, and Jin Liu. 2014. Dictionary learning based software defect prediction. In Proceedings of the 36th International Conference on Software Engineering. ACM, 414–423.
[20]
Dennis Kafura and Geereddy R. Reddy. 1987. The use of software complexity metrics in software maintenance. IEEE Transactions on Software Engineering 3 (1987), 335–343.
[21]
Yasutaka Kamei, Emad Shihab, Bram Adams, Ahmed E Hassan, Audris Mockus, Anand Sinha, and Naoyasu Ubayashi. 2013. A large-scale empirical study of just-in-time quality assurance. IEEE Transactions on Software Engineering 39, 6 (2013), 757–773.
[22]
Taghi M Khoshgoftaar and Edward B Allen. 2001. Modeling software quality with classification trees. Recent Advances in Reliability and Quality Engineering 2 (2001), 247.
[23]
Taghi M Khoshgoftaar and Naeem Seliya. 2003. Software quality classification modeling using the SPRINT decision tree algorithm. International Journal on Artificial Intelligence Tools 12, 03 (2003), 207–225.
[24]
Taghi M Khoshgoftaar, Xiaojing Yuan, and Edward B Allen. 2000. Balancing misclassification rates in classification-tree models of software quality. Empirical Software Engineering 5, 4 (2000), 313–330.
[25]
Sunghun Kim, E James Whitehead Jr, and Yi Zhang. 2008. Classifying software changes: Clean or buggy? IEEE Transactions on Software Engineering 34, 2 (2008), 181–196.
[26]
Sunghun Kim, Thomas Zimmermann, E James Whitehead Jr, and Andreas Zeller. 2007. Predicting faults from cached history. In Proceedings of the 29th International Conference on Software Engineering. IEEE, 489–498.
[27]
Ekrem Kocaguneli, Tim Menzies, Ayse Bener, and Jacky W Keung. 2012. Exploiting the essential assumptions of analogy-based effort estimation. IEEE Transactions on Software Engineering 38, 2 (2012), 425–438.
[28]
A Güneş Koru, Dongsong Zhang, Khaled El Emam, and Hongfang Liu. 2009. An investigation into the functional form of the size-defect relationship for software modules. IEEE Transactions on Software Engineering 35, 2 (2009), 293–304.
[29]
Taek Lee, Jaechang Nam, DongGyun Han, Sunghun Kim, and Hoh Peter In. 2011. Micro interaction metrics for defect prediction. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. ACM, 311–321.
[30]
Stefan Lessmann, Bart Baesens, Christophe Mues, and Swantje Pietsch. 2008. Benchmarking classification models for software defect prediction: A proposed framework and novel findings. IEEE Transactions on Software Engineering 34, 4 (2008), 485–496.
[31]
Jinping Liu, Yuming Zhou, Yibiao Yang, Hongmin Lu, and Baowen Xu. 2017. Code churn: A neglected metric in effort-aware just-in-time defect prediction. In 2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. IEEE.
[32]
Shinsuke Matsumoto, Yasutaka Kamei, Akito Monden, Ken-ichi Matsumoto, and Masahide Nakamura. 2010. An analysis of developer metrics for fault prediction. In Proceedings of the 6th International Conference on Predictive Models in Software Engineering. ACM, 18.
[33]
Thomas J McCabe. 1976. A complexity measure. IEEE Transactions on Software Engineering 4 (1976), 308–320.
[34]
Thilo Mende and Rainer Koschke. 2010. Effort-aware defect prediction models. In 2010 14th European Conference on Software Maintenance and Reengineering. IEEE, 107–116.
[35]
Tim Menzies, Jeremy Greenwald, and Art Frank. 2007. Data mining static code attributes to learn defect predictors. IEEE Transactions on Software Engineering 33, 1 (2007).
[36]
Tim Menzies, Zach Milton, Burak Turhan, Bojan Cukic, Yue Jiang, and Ayşe Bener. 2010. Defect prediction from static code features: current results, limitations, new approaches. Automated Software Engineering 17, 4 (2010), 375–407.
[37]
Ayse Tosun Misirli, Ayse Bener, and Resat Kale. 2011. AI-based software defect predictors: Applications and benefits in a case study. AI Magazine 32, 2 (2011), 57–68.
[38]
Audris Mockus and David M Weiss. 2000. Predicting risk of software changes. Bell Labs Technical Journal 5, 2 (2000), 169–180.
[39]
Akito Monden, Takuma Hayashi, Shoji Shinoda, Kumiko Shirai, Junichi Yoshida, Mike Barker, and Kenichi Matsumoto. 2013. Assessing the cost effectiveness of fault prediction in acceptance testing. IEEE Transactions on Software Engineering 39, 10 (2013), 1345–1357.
[40]
Raimund Moser, Witold Pedrycz, and Giancarlo Succi. 2008. A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. In Proceedings of the 30th International Conference on Software Engineering. ACM, 181–190.
[41]
Nachiappan Nagappan and Thomas Ball. 2005. Use of relative code churn measures to predict system defect density. In Proceedings of the 27th International Conference on Software Engineering. IEEE, 284–292.
[42]
Nachiappan Nagappan, Thomas Ball, and Andreas Zeller. 2006. Mining metrics to predict component failures. In Proceedings of the 28th International Conference on Software Engineering. ACM, 452–461.
[43]
Jaechang Nam, Sinno Jialin Pan, and Sunghun Kim. 2013. Transfer defect learning. In Proceedings of the 2013 International Conference on Software Engineering. IEEE, 382–391.
[44]
Thomas J Ostrand, Elaine J Weyuker, and Robert M Bell. 2004. Where the bugs are. In ACM SIGSOFT Software Engineering Notes, Vol. 29. ACM, 86–96.
[45]
Thomas J Ostrand, Elaine J Weyuker, and Robert M Bell. 2005. Predicting the location and number of faults in large software systems. IEEE Transactions on Software Engineering 31, 4 (2005), 340–355.
[46]
Corina S Pasareanu, Peter C Mehlitz, David H Bushnell, Karen Gundy-Burlet, Michael Lowry, Suzette Person, and Mark Pape. 2008. Combining unit-level symbolic execution and system-level concrete execution for testing NASA software. In Proceedings of the 2008 International Symposium on Software Testing and Analysis. ACM, 15–26.
[47]
Foyzur Rahman, Sameer Khatri, Earl T Barr, and Premkumar Devanbu. 2014. Comparing static bug finders and statistical prediction. In Proceedings of the 36th International Conference on Software Engineering. ACM, 424–434.
[48]
Jeanine Romano, Jeffrey D Kromrey, Jesse Coraggio, Jeff Skowronek, and Linda Devine. 2006. Exploring methods for evaluating group differences on the NSSE and other surveys: Are the t-test and Cohen's d indices the most appropriate choices? In Annual Meeting of the Southern Association for Institutional Research.
[49]
Forrest Shull, Ioana Rus, and Victor Basili. 2001. Improving software inspections by using reading techniques. In Proceedings of the 23rd International Conference on Software Engineering. IEEE, 726–727.
[50]
Burak Turhan, Tim Menzies, Ayşe B Bener, and Justin Di Stefano. 2009. On the relative value of cross-company and within-company data for defect prediction. Empirical Software Engineering 14, 5 (2009), 540–578.
[51]
Song Wang, Taiyue Liu, and Lin Tan. 2016. Automatically learning semantic features for defect prediction. In Proceedings of the 38th International Conference on Software Engineering. ACM, 297–308.
[52]
Maurice V Wilkes. 1985. Memoirs of a Computer Pioneer. Cambridge, Mass., London (1985).
[53]
David H Wolpert. 2002. The supervised learning no-free-lunch theorems. In Soft Computing and Industry. Springer, 25–42.
[54]
Yibiao Yang, Yuming Zhou, Jinping Liu, Yangyang Zhao, Hongmin Lu, Lei Xu, Baowen Xu, and Hareton Leung. 2016. Effort-aware just-in-time defect prediction: simple unsupervised models could be better than supervised models. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering. ACM, 157–168.
[55]
Zuoning Yin, Ding Yuan, Yuanyuan Zhou, Shankar Pasupathy, and Lakshmi Bairavasundaram. 2011. How do fixes become bugs? In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. ACM, 26–36.

Published In

ESEC/FSE 2017: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering
August 2017
1073 pages
ISBN:9781450351058
DOI:10.1145/3106237

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data analytics for software engineering
  2. defect prediction
  3. empirical studies
  4. software repository mining

Qualifiers

  • Research-article

Conference

ESEC/FSE'17

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%


Cited By

  • (2024) Landmark-Based Domain Adaptation and Selective Pseudo-Labeling for Heterogeneous Defect Prediction. Electronics 13:2 (456). DOI: 10.3390/electronics13020456. Online publication date: 22-Jan-2024
  • (2024) A code change‐oriented approach to just‐in‐time defect prediction with multiple input semantic fusion. Expert Systems. DOI: 10.1111/exsy.13702. Online publication date: 27-Aug-2024
  • (2024) Just-in-Time Software Defect Prediction Techniques: A Survey. 2024 15th International Conference on Information and Communication Systems (ICICS), 1-6. DOI: 10.1109/ICICS63486.2024.10638276. Online publication date: 13-Aug-2024
  • (2024) An empirical study of just-in-time-defect prediction using various machine learning techniques. International Journal of Computers and Applications, 1-10. DOI: 10.1080/1206212X.2024.2328489. Online publication date: 20-Mar-2024
  • (2024) JITGNN. Journal of Systems and Software 210:C. DOI: 10.1016/j.jss.2024.111984. Online publication date: 1-Apr-2024
  • (2024) Commit-time defect prediction using one-class classification. Journal of Systems and Software 208:C. DOI: 10.1016/j.jss.2023.111914. Online publication date: 1-Feb-2024
  • (2024) On the relative value of clustering techniques for Unsupervised Effort-Aware Defect Prediction. Expert Systems with Applications 245 (123041). DOI: 10.1016/j.eswa.2023.123041. Online publication date: Jul-2024
  • (2024) An empirical study of data sampling techniques for just-in-time software defect prediction. Automated Software Engineering 31:2. DOI: 10.1007/s10515-024-00455-8. Online publication date: 22-Jun-2024
  • (2024) Exploring the impact of data preprocessing techniques on composite classifier algorithms in cross-project defect prediction. Automated Software Engineering 31:2. DOI: 10.1007/s10515-024-00454-9. Online publication date: 6-Jun-2024
  • (2024) Mitigating the impact of mislabeled data on deep predictive models: an empirical study of learning with noise approaches in software engineering tasks. Automated Software Engineering 31:1. DOI: 10.1007/s10515-024-00435-y. Online publication date: 4-Apr-2024
