DOI: 10.1145/3540250.3549132
Research article, Open access

Code, quality, and process metrics in graduated and retired ASFI projects

Published: 09 November 2022

Abstract

Recent work on open source sustainability shows that successful trajectories of projects in the Apache Software Foundation Incubator (ASFI) can be predicted early on, using a set of socio-technical measures. Because OSS projects are socio-technical systems centered around code artifacts, we hypothesize that sustainable projects may exhibit different code and process patterns than unsustainable ones, and that those patterns can grow more apparent as projects evolve over time. Here we studied the code and coding processes of over 200 ASFI projects and found that graduated ASFI projects have different patterns of code quality and complexity than retired ones. The same holds for the coding processes: for example, the prevalence of feature commits and bug-fixing commits is correlated with project graduation success. We also find that minor and major contributors (those contributing less than 5% and at least 95% of commits, respectively) are associated with graduation outcomes, implying that developers who contribute fewer commits are also important for a project's success. This study provides evidence that OSS projects, especially nascent ones, can benefit from introspection and instrumentation using multidimensional modeling of the whole system, including code, processes, and code quality measures, and of how these are interconnected over time.
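
To make the commit-based measures mentioned above concrete, the following is a minimal Python sketch, not the paper's actual pipeline, of how a project's contributor split and bug-fixing commit ratio could be computed from its commit log. The commit records, the keyword heuristic for flagging bug-fixing commits, and all function names are illustrative assumptions; only the <5% and >=95% contribution thresholds are taken from the abstract.

```python
from collections import Counter

# Hypothetical commit log: (author, message) pairs; a real study would
# extract these from the project's version-control history.
commits = [("alice", f"Add feature #{i}") for i in range(23)] + [
    ("bob", "Fix off-by-one bug in scheduler"),
]

# Keyword heuristic (an assumption, not the paper's classifier) for
# flagging bug-fixing commits from their messages.
BUG_KEYWORDS = ("fix", "bug", "defect", "error", "fault")


def commit_shares(commits):
    """Each author's fraction of all commits in the project."""
    counts = Counter(author for author, _ in commits)
    total = sum(counts.values())
    return {author: n / total for author, n in counts.items()}


def contributor_split(commits, minor_lt=0.05, major_ge=0.95):
    """Split authors into minor (<5% of commits) and major (>=95%),
    using the thresholds quoted in the abstract."""
    shares = commit_shares(commits)
    minor = [a for a, s in shares.items() if s < minor_lt]
    major = [a for a, s in shares.items() if s >= major_ge]
    return minor, major


def bug_fix_ratio(commits):
    """Fraction of commits whose message mentions a bug-fix keyword."""
    flagged = sum(
        any(kw in message.lower() for kw in BUG_KEYWORDS)
        for _, message in commits
    )
    return flagged / len(commits)


if __name__ == "__main__":
    minor, major = contributor_split(commits)
    print("minor contributors:", minor)   # ['bob']   (1/24, about 4.2% of commits)
    print("major contributors:", major)   # ['alice'] (23/24, about 95.8%)
    print("bug-fix commit ratio:", bug_fix_ratio(commits))  # 1/24, about 0.042
```

In the study, measures like these are tracked per project over time and related to graduation versus retirement outcomes with statistical models; the sketch only shows the raw counting step.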



    Published In

    ESEC/FSE 2022: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2022, 1822 pages
ISBN: 9781450394130
DOI: 10.1145/3540250
    This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. Code Quality
    2. Open Source Sustainability

    Conference

    ESEC/FSE '22

    Acceptance Rates

    Overall Acceptance Rate 112 of 543 submissions, 21%

