Characterizing and predicting which bugs get fixed: an empirical study of Microsoft Windows

Authors:

Thomas Zimmermann,

Nachiappan Nagappan,

Brendan Murphy

ICSE '10: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1

Pages 495 - 504

https://doi.org/10.1145/1806799.1806871

Published: 01 May 2010

Abstract

We performed an empirical study to characterize factors that affect which bugs get fixed in Windows Vista and Windows 7, focusing on factors related to bug report edits and relationships between people involved in handling the bug. We found that bugs reported by people with better reputations were more likely to get fixed, as were bugs handled by people on the same team and working in geographical proximity. We reinforce these quantitative results with survey feedback from 358 Microsoft employees who were involved in Windows bugs. Survey respondents also mentioned additional qualitative influences on bug fixing, such as the importance of seniority and interpersonal skills of the bug reporter.

Informed by these findings, we built a statistical model to predict the probability that a new bug will be fixed (to the best of our knowledge, the first such model). We trained it on Windows Vista bugs and achieved a precision of 68% and a recall of 64% when predicting Windows 7 bug fixes. Engineers could use such a model to prioritize bugs during triage, to estimate developer workloads, and to decide which bugs should be closed or migrated to future product versions.
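For readers unfamiliar with the two evaluation measures quoted in the abstract, the following is a minimal sketch of how precision and recall are computed for a binary "will this bug be fixed?" prediction. It is not the authors' model or data: the function name `precision_recall` and the eight example labels are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def precision_recall(actual, predicted):
    """Return (precision, recall) for boolean 'fixed' labels.

    actual    -- True if the bug was really fixed
    predicted -- True if the model predicted it would be fixed
    """
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)       # predicted fixed, was fixed
    false_pos = sum(1 for a, p in pairs if not a and p)  # predicted fixed, was not
    false_neg = sum(1 for a, p in pairs if a and not p)  # predicted not fixed, was fixed

    # Precision: of the bugs predicted as fixed, how many really were fixed.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: of the bugs that really were fixed, how many the model found.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Hypothetical labels for eight bugs (illustration only, not study data).
actual = [True, True, True, False, False, True, False, True]
predicted = [True, False, True, True, False, True, False, False]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Read this way, a precision of 68% means that roughly two out of three bugs the model flags as "will be fixed" actually are fixed, and a recall of 64% means the model flags roughly two out of three of the bugs that end up fixed.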

Published In

ICSE '10: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1

May 2010

627 pages

ISBN:9781605587196

DOI:10.1145/1806799

General Chairs: Jeff Kramer (Imperial College, London, UK), Judith Bishop (Microsoft Research, Redmond)

Program Chairs: Prem Devanbu (University of California at Davis), Sebastian Uchitel (University of Buenos Aires, Argentina and Imperial College London, UK)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ICSE '10

Sponsor:

SIGSOFT

ICSE '10: 32nd International Conference on Software Engineering

May 1 - 8, 2010

Cape Town, South Africa

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
