DOI: 10.1145/3468264.3468563
research-article
Open access

Sustainability forecasting for Apache incubator projects

Published: 18 August 2021

Abstract

Although OSS development is very popular, more than 80% of OSS projects ultimately fail. Identifying the factors associated with OSS success can help in devising interventions when a project takes a downturn. OSS success has been studied from a variety of angles, more recently in empirical studies of large numbers of diverse projects, using proxies for sustainability, e.g., internal metrics related to productivity and external ones related to community popularity. The internal socio-technical structure of projects, especially its dynamics, has also been shown to be important. This points to another angle on evaluating software success: the perspective of self-sustaining and self-governing communities.
To uncover the dynamics of how a project at a nascent development stage gradually evolves into a sustainable one, here we apply a socio-technical network modeling perspective to a dataset of sustainability-labeled Apache Software Foundation Incubator (ASFI) projects. To identify and validate the determinants of sustainability, we undertake a mix of quantitative and qualitative studies of ASFI projects’ socio-technical network trajectories. We develop interpretable models that can forecast whether a project will become sustainable with 93+% accuracy, within 8 months of incubation start. Based on the interpretable models, we describe a strategy for real-time monitoring and for suggesting actions, which projects can use to correct their sustainability trajectories.
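
To make the idea of a "socio-technical network trajectory" concrete, the following is a minimal sketch and not the paper's actual pipeline. It assumes a hypothetical input format of `(month_index, sender, recipient)` tuples, for example derived from mailing-list replies, and computes a few simple per-month network statistics. The function name, the input format, and the chosen statistics are all illustrative assumptions; the abstract does not specify which features the models use.

```python
from collections import defaultdict


def monthly_network_features(events):
    """Summarize a communication network month by month.

    `events` is an iterable of (month_index, sender, recipient) tuples.
    This input format is a hypothetical one chosen for illustration.
    Returns {month: {"nodes": int, "edges": int, "mean_degree": float}}.
    """
    edges_by_month = defaultdict(set)
    for month, sender, recipient in events:
        if sender != recipient:  # ignore self-replies
            # Treat each pair as one undirected edge, deduplicated per month.
            edges_by_month[month].add(frozenset((sender, recipient)))

    features = {}
    for month in sorted(edges_by_month):
        edges = edges_by_month[month]
        nodes = set().union(*edges)  # everyone who appears in any edge
        features[month] = {
            "nodes": len(nodes),
            "edges": len(edges),
            # Each undirected edge contributes to the degree of two nodes.
            "mean_degree": 2 * len(edges) / len(nodes),
        }
    return features


# Toy data (invented for this sketch): two months of reply events.
toy_events = [
    (1, "alice", "bob"),
    (1, "bob", "alice"),
    (1, "carol", "alice"),
    (2, "alice", "bob"),
    (2, "bob", "carol"),
    (2, "carol", "dave"),
    (2, "dave", "alice"),
]
trajectory = monthly_network_features(toy_events)
```

The resulting per-month dictionaries form a small time series. In a setting like the one the abstract describes, a sequence of this kind over the first months of incubation could serve as input to a forecasting model, though the actual features and model are defined in the paper itself.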

Published In

ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
August 2021
1690 pages
ISBN: 9781450385626
DOI: 10.1145/3468264
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Apache Incubator
  2. OSS Sustainability
  3. Sociotechnical System

Qualifiers

  • Research-article

Conference

ESEC/FSE '21
Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 223
  • Downloads (last 6 weeks): 21
Reflects downloads up to 10 Oct 2024.

Cited By

  • (2024) From Models to Practice: Enhancing OSS Project Sustainability with Evidence-Based Advice. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, 457-461. DOI: 10.1145/3663529.3663777. Online publication date: 10-Jul-2024
  • (2024) Do We Run How We Say We Run? Formalization and Practice of Governance in OSS Communities. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-26. DOI: 10.1145/3613904.3641980. Online publication date: 11-May-2024
  • (2024) Understanding Newcomers’ Onboarding Process in Deep Learning Projects. IEEE Transactions on Software Engineering 50:3, 443-460. DOI: 10.1109/TSE.2024.3353297. Online publication date: 1-Mar-2024
  • (2024) Engineering Formality and Software Risk in Debian Python Packages. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 1005-1010. DOI: 10.1109/SANER60148.2024.00108. Online publication date: 12-Mar-2024
  • (2024) Sustainability Forecasting for Deep Learning Packages. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 981-992. DOI: 10.1109/SANER60148.2024.00106. Online publication date: 12-Mar-2024
  • (2024) Free open source communities sustainability: Does it make a difference in software quality? Empirical Software Engineering 29:5. DOI: 10.1007/s10664-024-10529-6. Online publication date: 23-Jul-2024
  • (2024) Can instability variations warn developers when open-source projects boost? Empirical Software Engineering 29:4. DOI: 10.1007/s10664-024-10482-4. Online publication date: 14-Jun-2024
  • (2023) A Grounded Theory of Cross-Community SECOs: Feedback Diversity Versus Synchronization. IEEE Transactions on Software Engineering 49:10, 4731-4750. DOI: 10.1109/TSE.2023.3313875. Online publication date: 18-Sep-2023
  • (2023) GitHub OSS Governance File Dataset. 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR), 630-634. DOI: 10.1109/MSR59073.2023.00089. Online publication date: May-2023
  • (2022) Integration and Deployment of Cloud-Based Assistance System in Pharaon Large Scale Pilots—Experiences and Lessons Learned. Electronics 11:9, 1496. DOI: 10.3390/electronics11091496. Online publication date: 6-May-2022
