DOI: 10.5555/2820518.2820522
research-article

Code ownership and software quality: a replication study

Published: 16 May 2015

Abstract

In a traditional sense, ownership determines rights and duties with regard to an object, for example a property. The owner of source code is usually the person who wrote it. However, larger code artifacts, such as files, are usually composed by multiple engineers who contribute to the entity over time through a series of changes. Frequently, the person with the highest contribution, e.g. the largest number of code changes, is defined as the code owner and takes responsibility for it. Thus, code ownership relates to the knowledge engineers have about code. A lack of responsibility for, and knowledge about, code can reduce code quality. In an earlier study, Bird et al. [1] showed that Windows binaries that lacked clear code ownership were more likely to be defect prone. However, recommendations for large artifacts such as binaries are usually not actionable. For example, changing the concept of binaries and refactoring them to ensure strong ownership would violate system architecture principles. A recent replication study by Foucault et al. [2] on open source software could not replicate the original results, which led to doubts about the general concept of ownership impacting code quality. In this paper, we replicate and extend the previous two ownership studies [1, 2] and reflect on their findings. Further, we define several new ownership metrics to investigate the dependency between ownership and code quality at file and directory level for four major Microsoft products. The results confirm the original finding by Bird et al. [1] that code ownership correlates with code quality. Using new and refined code ownership metrics, we were able to classify source files that contained at least one bug with a median precision of 0.74 and a median recall of 0.38. At directory level, we achieve a precision of 0.76 and a recall of 0.60.
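
The change-count notion of ownership mentioned in the abstract can be illustrated with a minimal sketch. It is not the metric definition used in the paper: the function name `ownership`, the alphabetical tie-breaking rule, and the sample change log are all assumptions made for illustration.

```python
from collections import Counter


def ownership(changes):
    """Return (owner, share) for one file.

    `changes` holds one author name per recorded change to the file.
    The owner is the author with the most changes; `share` is the
    fraction of all changes made by that author.
    """
    if not changes:
        raise ValueError("file has no recorded changes")
    counts = Counter(changes)
    # Highest change count wins; ties are broken alphabetically
    # (an assumption, chosen only to keep the result deterministic).
    owner, top = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return owner, top / len(changes)


# Hypothetical change log: file path -> authors of successive changes.
history = {
    "parser.c": ["alice", "alice", "bob", "alice"],
    "lexer.c": ["bob", "carol"],
}

result = {path: ownership(authors) for path, authors in history.items()}
for path, (owner, share) in result.items():
    print(f"{path}: owner={owner}, share={share:.2f}")
```

In this toy data, `parser.c` has a clear owner, while `lexer.c` is split evenly between two contributors, which is the kind of weak ownership the studies relate to defect proneness.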

References

[1] C. Bird, N. Nagappan, B. Murphy, H. Gall and P. Devanbu, "Don't Touch My Code!: Examining the Effects of Ownership on Software Quality," in Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, 2011.
[2] M. Foucault, J.-R. Falleri and X. Blanc, "Code ownership in open-source software," in Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering, New York, 2014.
[3] C. Bird, N. Nagappan, P. Devanbu, H. Gall and B. Murphy, "Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista," in Proceedings of the 31st International Conference on Software Engineering, 2009.
[4] N. Nagappan, B. Murphy and V. Basili, "The Influence of Organizational Structure on Software Quality: An Empirical Case Study," in Proceedings of the 30th International Conference on Software Engineering, 2008.
[5] K. Herzig and N. Nagappan, "The Impact of Test Ownership and Team Structure on the Reliability and Effectiveness of Quality Test Runs," in Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 2014.
[6] F. Rahman and P. Devanbu, "Ownership, Experience and Defects: A Fine-grained Study of Authorship," in Proceedings of the 33rd International Conference on Software Engineering, 2011.
[7] A. Meneely and L. Williams, "Secure Open Source Collaboration: An Empirical Study of Linus' Law," in Proceedings of the 16th ACM Conference on Computer and Communications Security, 2009.
[8] J. Czerwonka, N. Nagappan, W. Schulte and B. Murphy, "CODEMINE: Building a Software Development Data Analytics Platform for Microsoft," IEEE Software, pp. 64–71, 2013.
[9] L. Breiman, "Random Forests," Machine Learning, vol. 45, pp. 5–32, 2001.
[10] M. Kuhn, "caret: Classification and Regression Training," 2011.
[11] T. D. LaToza, G. Venolia and R. DeLine, "Maintaining mental models: a study of developer work habits," in International Conference on Software Engineering, New York, 2006.
[12] J. Cohen, P. Cohen, S. West and L. Aiken, Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences, Routledge, 2002.
[13] A. Mockus, "Succession: Measuring transfer of code and developer productivity," in Proceedings of the 31st International Conference on Software Engineering, Vancouver, 2009.
[14] M. E. Nordberg III, "Managing Code Ownership," IEEE Software, pp. 26–33, 2003.
[15] D. Kawrykow and M. P. Robillard, "Non-essential Changes in Version Histories," in Proceedings of the 33rd International Conference on Software Engineering, 2011.
[16] K. Herzig and A. Zeller, "The Impact of Tangled Code Changes," in Proceedings of the 10th Working Conference on Mining Software Repositories, 2013.
[17] J. Maxwell, "The value of a realist understanding of causality for qualitative research," in Qualitative Inquiry and the Politics of Evidence, Walnut Creek, Left Coast Press, 2008, pp. 163–181.
[18] D. Posnett, V. Filkov and P. Devanbu, "Ecological inference in empirical software engineering," in Proceedings of the 2011 26th IEEE/ACM International Conference on Automated Software Engineering, Washington, DC, USA, 2011.
[19] E. Weyuker, T. Ostrand and R. Bell, "Do too many cooks spoil the broth? Using the number of developers to enhance defect prediction models," Empirical Software Engineering, vol. 13, no. 5, pp. 539–559, 2008.
[20] N. Fenton and M. Neil, "A critique of software defect prediction models," IEEE Transactions on Software Engineering, vol. 25, pp. 675–689, Sep. 1999.
[21] C. Catal and B. Diri, "Review: A Systematic Review of Software Fault Prediction Studies," Expert Syst. Appl., vol. 36, pp. 7346–7354, May 2009.
[22] D. Radjenović, M. Heričko, R. Torkar and A. Živkovič, "Software fault prediction metrics: A systematic literature review," Information and Software Technology, vol. 55, pp. 1397–1418, 2013.
[23] T. J. Ostrand, E. J. Weyuker and R. M. Bell, "Where the bugs are," in Proceedings of the 2004 ACM SIGSOFT International Symposium on Software Testing and Analysis, 2004.
[24] R. Moser, W. Pedrycz and G. Succi, "A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction," in Proceedings of the 30th International Conference on Software Engineering, 2008.
[25] E. Shihab, A. Hassan, B. Adams and Z. M. Jiang, "An industrial study on the risk of software changes," in Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering, Cary, North Carolina, 2012.
[26] M. Pinzger, N. Nagappan and B. Murphy, "Can developer-module networks predict failures?," in Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2008.
[27] A. E. Hassan, "Predicting faults using the complexity of code changes," in Proceedings of the 31st International Conference on Software Engineering, 2009.
[28] T. Zimmermann and N. Nagappan, "Predicting defects using network analysis on dependency graphs," in Proceedings of the 30th International Conference on Software Engineering, 2008.
[29] K. Herzig, S. Just, A. Rau and A. Zeller, "Predicting Defects Using Change Genealogies," in Proceedings of the 2013 IEEE 24th International Symposium on Software Reliability Engineering, 2013.
[30] N. Nagappan, L. Williams, M. Vouk and J. Osborne, "Early Estimation of Software Quality Using In-process Testing Metrics: A Controlled Case Study," SIGSOFT Softw. Eng. Notes, vol. 30, pp. 1–7, May 2005.
[31] K. Herzig, "Using Pre-Release Test Failures to Build Early Post-Release Defect Prediction Models," in Proceedings of the 25th International Symposium on Software Reliability Engineering, 2014.
[32] I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2005.



Published In

MSR '15: Proceedings of the 12th Working Conference on Mining Software Repositories
May 2015
542 pages
ISBN: 9780769555942

Publisher

IEEE Press

Author Tags

  1. code ownership
  2. empirical software engineering
  3. software quality

Qualifiers

  • Research-article

Conference

ICSE '15

Cited By

  • (2020) The Evolving Nature of Developers' Contributions in Open Source Projects. Proceedings of the 14th Brazilian Symposium on Software Components, Architectures, and Reuse, pp. 131-140. DOI: 10.1145/3425269.3425284. Online publication date: 19-Oct-2020
  • (2019) git2net. Proceedings of the 16th International Conference on Mining Software Repositories, pp. 433-444. DOI: 10.1109/MSR.2019.00070. Online publication date: 26-May-2019
  • (2019) Empirical study in using version histories for change risk classification. Proceedings of the 16th International Conference on Mining Software Repositories, pp. 58-62. DOI: 10.1109/MSR.2019.00018. Online publication date: 26-May-2019
  • (2019) Attitudes, beliefs, and development data concerning agile software development practices. Proceedings of the 41st International Conference on Software Engineering: Software Engineering Education and Training, pp. 158-169. DOI: 10.1109/ICSE-SEET.2019.00025. Online publication date: 27-May-2019
  • (2018) Mining file histories: should we consider branches? Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pp. 202-213. DOI: 10.1145/3238147.3238169. Online publication date: 3-Sep-2018
  • (2018) Roles and impacts of hands-on software architects in five industrial case studies. Proceedings of the 40th International Conference on Software Engineering, pp. 117-127. DOI: 10.1145/3180155.3180234. Online publication date: 27-May-2018
  • (2017) Revisiting Assert Use in GitHub Projects. Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering, pp. 298-307. DOI: 10.1145/3084226.3084259. Online publication date: 15-Jun-2017
  • (2016) Advantages and Disadvantages of using Shared code from the Developers Perspective. Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, pp. 1-6. DOI: 10.1145/2961111.2962624. Online publication date: 8-Sep-2016
  • (2016) Revisiting code ownership and its relationship with software quality in the scope of modern code review. Proceedings of the 38th International Conference on Software Engineering, pp. 1039-1050. DOI: 10.1145/2884781.2884852. Online publication date: 14-May-2016
