DOI: 10.1145/3377811.3380920
research-article
Open access

Scaling open source communities: an empirical study of the Linux kernel

Published: 01 October 2020

Abstract

Large-scale open source communities, such as the Linux kernel, have gone through decades of development, substantially growing in scale and complexity. In the traditional workflow, maintainers serve as "gatekeepers" for the subsystems that they maintain. As the number of patches and authors significantly increases, maintainers come under considerable pressure, which may hinder the operation and even the sustainability of the community. A few subsystems have begun to use new workflows to address these issues. However, it is unclear to what extent these new workflows are successful, or how to apply them. Therefore, we conduct an empirical study on the multiple-committer model (MCM) that has provoked extensive discussion in the Linux kernel community. We explore the effect of the model on the i915 subsystem with respect to four dimensions: pressure, latency, complexity, and quality assurance. We find that after this model was adopted, the burden on the i915 maintainers was significantly reduced. Also, the model scales well to allow more committers. After analyzing the online documents and interviewing the maintainers of i915, we propose that overloaded subsystems which have trustworthy candidate committers are suitable for adopting the model. We further suggest that the success of the model is closely related to a series of measures for risk mitigation: sufficient precommit testing, a strict review process, and the use of tools to simplify work and reduce errors. We employ a network analysis approach to locate candidate committers for the target subsystems and validate this approach and contextual success factors through email interviews with their maintainers. To the best of our knowledge, this is the first study focusing on how to scale open source communities. We expect that our study will help the rapidly growing Linux kernel and other similar communities to adapt to changes and remain sustainable.
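
The abstract names a network analysis approach for locating candidate committers but does not describe it. Purely as an illustration, and not as the authors' method, the sketch below assumes a hypothetical undirected collaboration graph in which an edge links two developers who have worked on each other's patches, and extracts a dense core of that graph by greedy peeling (repeatedly dropping the lowest-degree node). The function name `densest_subgraph`, the developer names, and the edge semantics are all assumptions made for this example.

```python
from collections import defaultdict


def densest_subgraph(edges):
    """Greedy peeling heuristic for a dense core of an undirected graph.

    Repeatedly removes a minimum-degree node and remembers the intermediate
    node set with the highest density (edges divided by nodes).
    Node labels must be mutually comparable (e.g. all strings), because
    ties on degree are broken by label to keep the result deterministic.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    nodes = set(adj)
    if not nodes:
        return set(), 0.0

    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # number of edges
    best_nodes, best_density = set(nodes), m / len(nodes)

    while len(nodes) > 1:
        # Pick the node with the fewest remaining neighbours.
        u = min(nodes, key=lambda n: (len(adj[n]), n))
        for v in adj[u]:
            adj[v].discard(u)
        m -= len(adj[u])
        del adj[u]
        nodes.remove(u)

        density = m / len(nodes)
        if density > best_density:
            best_nodes, best_density = set(nodes), density

    return best_nodes, best_density


# Hypothetical data: four developers who all collaborate with one another,
# plus two loosely attached contributors.
edges = [
    ("alice", "bob"), ("alice", "carol"), ("alice", "dave"),
    ("bob", "carol"), ("bob", "dave"), ("carol", "dave"),
    ("alice", "erin"), ("erin", "frank"),
]
core, density = densest_subgraph(edges)
print(sorted(core), density)
```

In this toy graph the peeling keeps the four mutually connected developers as the dense core. Greedy peeling is a heuristic rather than an exact maximum-density computation, and whether a dense core corresponds to trustworthy candidate committers would still need validation by the subsystem's maintainers, as the abstract describes.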

Published In

ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering
June 2020
1640 pages
ISBN:9781450371216
DOI:10.1145/3377811
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

  • KIISE: Korean Institute of Information Scientists and Engineers
  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Linux kernel
  2. maintainer
  3. multiple committers
  4. open source communities
  5. scalability
  6. sustainability
  7. workload

Qualifiers

  • Research-article

Funding Sources

  • the National Natural Science Foundation of China Grants
  • the National key R&D Program of China Grant

Conference

ICSE '20

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Article Metrics

  • Downloads (Last 12 months): 327
  • Downloads (Last 6 weeks): 30
Reflects downloads up to 20 Feb 2025

Cited By

  • (2024) How to Gain Commit Rights in Modern Top Open Source Communities? Proceedings of the ACM on Software Engineering 1:FSE (1727-1749). DOI: 10.1145/3660784. Online publication date: 12-Jul-2024
  • (2024) A Survey on the Densest Subgraph Problem and its Variants. ACM Computing Surveys 56:8 (1-40). DOI: 10.1145/3653298. Online publication date: 30-Apr-2024
  • (2024) How Are Paid and Volunteer Open Source Developers Different? A Study of the Rust Project. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-13). DOI: 10.1145/3597503.3639197. Online publication date: 20-May-2024
  • (2024) Can instability variations warn developers when open-source projects boost? Empirical Software Engineering 29:4. DOI: 10.1007/s10664-024-10482-4. Online publication date: 14-Jun-2024
  • (2024) Open Source Ecosystem in New Era: Pattern and Trend. China’s e-Science Blue Book 2023 (215-234). DOI: 10.1007/978-981-99-8270-7_11. Online publication date: 24-Mar-2024
  • (2023) The Design and Practice of the OpenKylin Build and Management System. 2023 IEEE 3rd International Conference on Software Engineering and Artificial Intelligence (SEAI) (6-10). DOI: 10.1109/SEAI59139.2023.10217673. Online publication date: 16-Jun-2023
  • (2023) Constructing Temporal Networks of OSS Programming Language Ecosystems. 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (663-667). DOI: 10.1109/SANER56733.2023.00067. Online publication date: Mar-2023
  • (2023) Identifying Emergent Leadership in Open Source Software Projects Based on Communication Styles. 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (73-84). DOI: 10.1109/SANER56733.2023.00017. Online publication date: Mar-2023
  • (2023) Do I Belong? Modeling Sense of Virtual Community Among Linux Kernel Contributors. Proceedings of the 45th International Conference on Software Engineering (319-331). DOI: 10.1109/ICSE48619.2023.00038. Online publication date: 14-May-2023
  • (2023) A Trustworthiness Fuzzy Evaluation Model for Open Source Community. Artificial Intelligence Logic and Applications (61-74). DOI: 10.1007/978-981-99-7869-4_5. Online publication date: 15-Nov-2023
