
A longitudinal study on the temporal validity of software samples

Published: 17 April 2024

Abstract

Context

In Empirical Software Engineering, it is crucial to work with representative samples that reflect the current state of the software industry. An important consideration, especially in a rapidly changing field like software development, is that a sample collected years ago must still represent the same population today for the results to generalize. However, it is seldom the case that a software sample built several years ago accurately depicts the current state of the development industry. Nevertheless, many recent studies rely on rather old datasets (seven or more years of age) to conduct their investigations.

Objective

To analyze the evolution of a population of open-source projects, determine the likelihood of detecting significant differences over time, and study the activity history of the projects.

Method

We performed a longitudinal study with 72 monthly snapshots of quality projects from GitHub, covering the period between July 1st, 2017 and June 1st, 2023. We recorded monthly values of seven repository metrics (contributors, commits, closed pull-requests, merged pull-requests, closed issues, stars, and forks), encompassing data from a total of 1,991 repositories.
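For illustration, snapshot counts for several metrics of this kind can be reconstructed retrospectively through the GitHub Search API, whose responses carry a total_count field, so a count at a given date needs no pagination. The following Python sketch is only an assumption-laden illustration, not the authors' pipeline: it presumes a personal access token in the GITHUB_TOKEN environment variable, and apache/kafka is merely a hypothetical example repository.

import os
import requests

API = "https://api.github.com/search/issues"
HEADERS = {
    "Accept": "application/vnd.github+json",
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
}

def count(query: str) -> int:
    # The Search API reports the number of matches in total_count,
    # so per_page=1 is enough to read a snapshot count.
    resp = requests.get(API, headers=HEADERS, params={"q": query, "per_page": 1})
    resp.raise_for_status()
    return resp.json()["total_count"]

snapshot = "2023-06-01"
repo = "apache/kafka"  # hypothetical example repository
closed_issues = count(f"repo:{repo} is:issue is:closed closed:<={snapshot}")
merged_prs = count(f"repo:{repo} is:pr is:merged merged:<={snapshot}")
print(f"{repo} @ {snapshot}: {closed_issues} closed issues, {merged_prs} merged PRs")

Metrics such as stars and forks are not date-qualifiable through this API, which is one reason a longitudinal design like this one must record values month by month rather than reconstruct them afterwards.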

Results

We observed significant changes in all the evaluated metrics, most showing negligible to small effect sizes; notably, merged pull-requests registered medium effect sizes. The evolution was not uniform across metrics, but after five years it was unlikely (probability below 25%) that a sample of projects remained representative for any of the analyzed metrics.
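The abstract does not name the effect-size statistic, but a common choice in empirical software engineering, consistent with the negligible/small/medium labels above, is Cliff's delta. Below is a minimal sketch under that assumption, using the conventional magnitude thresholds and invented metric values.

from itertools import product

def cliffs_delta(xs, ys):
    # delta = P(x > y) - P(x < y), estimated over all pairs.
    gt = sum(x > y for x, y in product(xs, ys))
    lt = sum(x < y for x, y in product(xs, ys))
    return (gt - lt) / (len(xs) * len(ys))

def magnitude(delta):
    # Conventional thresholds for |delta| (assumed here, not quoted
    # from the paper).
    d = abs(delta)
    if d < 0.147:
        return "negligible"
    if d < 0.33:
        return "small"
    if d < 0.474:
        return "medium"
    return "large"

# Hypothetical monthly commit counts for the same projects at two
# snapshots taken several years apart.
commits_2018 = [12, 40, 7, 55, 23, 9, 31]
commits_2023 = [3, 18, 0, 42, 11, 2, 15]
d = cliffs_delta(commits_2023, commits_2018)
print(f"delta = {d:.3f} ({magnitude(d)})")

Under these thresholds, a |delta| below 0.147 reads as negligible and one above 0.474 as large, so a medium result like the one reported for merged pull-requests signals a substantial distributional shift between snapshots.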

Conclusion

Although the temporal validity of a sample depends on the specific data being studied, employing datasets created several years ago does not appear to be a sound strategy if the aim is to produce results that can be extrapolated to the current state of the population.



Published In

Information and Software Technology, Volume 168, Issue C (April 2024), 131 pages

Publisher

Butterworth-Heinemann, United States


Author Tags

  1. Software samples
  2. Temporal validity
  3. Longitudinal study
  4. Sample evolution

Qualifiers

  • Research-article
