Abstract
From its very inception, the study of software architecture has recognized architectural decay as a regularly occurring phenomenon in long-lived systems. Architectural decay is caused by repeated, sometimes careless changes to a system during its lifespan. Despite decay’s prevalence, there is a relative dearth of empirical data regarding the nature of architectural changes that may lead to decay, and of developers’ understanding of those changes. In this paper, we take a step toward addressing that scarcity by introducing an architecture recovery framework, ARCADE, for conducting large-scale replicable empirical studies of architectural change across different versions of a software system. ARCADE includes two novel architectural change metrics, which are the key to enabling large-scale empirical studies of architectural change. We utilize ARCADE to conduct an empirical study of changes found in software architectures spanning several hundred versions of 23 open-source systems. Our study reveals several new findings regarding the frequency of architectural changes in software systems, the common points of departure in a system’s architecture during the system’s maintenance and evolution, the difference between system-level and component-level architectural change, and the suitability of a system’s implementation-level structure as a proxy for its architecture.
Notes
1 The current version of ARCADE (ARCADE 2015) also analyzes and quantifies different symptoms of architectural decay for a given system. However, these features are currently under evaluation and are outside the scope of this paper.
References
Agnew B, Hofmeister C, Purtilo J (1994) Planning for change: a reconfiguration language for distributed systems. Distrib Syst Eng 1(5):313
Amazon (2015) Amazon command line interface. https://aws.amazon.com/cli/
Apache (2014a) Apache portable runtime versioning. http://apr.apache.org/versioning.html
Apache (2014b) Hadoop releases. http://hadoop.apache.org/releases.html#News
Apache (2014c) Lucene wiki. http://en.wikipedia.org/wiki/Lucene
Apache (2015a) Apache ant. http://ant.apache.org/
Apache (2015b) Apache maven. http://maven.apache.org/
ARCADE (2015) arcade:start [USC SoftArch Wiki]. http://softarch.usc.edu/wiki/doku.php?id=arcade:start
Bitbucket (2015) Bitbucket. https://bitbucket.org
Blei DM (2012) Probabilistic topic models. Commun ACM 55(4):77–84
Bouwers E, Correia JP, van Deursen A, Visser J (2011a) Quantifying the analyzability of software architectures. In: 9th working IEEE/IFIP conference on software architecture (WICSA), 2011. IEEE, pp 83–92
Bouwers E, van Deursen A, Visser J (2011b) Dependency profiles for software architecture evaluations. In: 27th IEEE international conference on software maintenance (ICSM), 2011. IEEE, pp 540–543
Bouwers E, van Deursen A, Visser J (2013) Evaluating usefulness of software metrics: an industrial experience report. In: ICSE. IEEE Press, pp 921–930
Chatzigeorgiou A, Manakos A (2010) Investigating the evolution of bad smells in object-oriented code. In: 17th international conference on the quality of information and communications technology (QUATIC), 2010. IEEE, pp 106–115
D’Ambros M, Gall H, Lanza M, Pinzger M (2008) Analysing software repositories to understand software evolution. In: Software evolution. Springer, pp 37–67
Ducasse S, Pollet D (2009) Software architecture reconstruction: a process-oriented taxonomy. IEEE Trans Softw Eng 35(4):573–591
Eick SG, Graves TL, Karr AF, Marron JS, Mockus A (2001) Does code decay? Assessing the evidence from change management data. IEEE Trans Softw Eng 27(1):1–12
Garcia J, Popescu D, Mattmann C, Medvidovic N, Cai Y (2011) Enhancing architectural recovery using concerns. In: Proceedings of the 2011 26th IEEE/ACM international conference on automated software engineering. IEEE Computer Society, pp 552–555
Garcia J, Krka I, Medvidovic N, Douglas C (2012) A framework for obtaining the ground-truth in architectural recovery. In: Joint working IEEE/IFIP conference on software architecture (WICSA) and European conference on software architecture (ECSA), 2012. IEEE, pp 292–296
Garcia J, Ivkovic I, Medvidovic N (2013a) A comparative analysis of software architecture recovery techniques. In: IEEE/ACM 28th international conference on automated software engineering (ASE), 2013. IEEE, pp 486–496
Garcia J, Krka I, Mattmann C, Medvidovic N (2013b) Obtaining ground-truth software architectures. In: Proceedings of the 2013 international conference on software engineering. IEEE Press, pp 901–910
Ghezzi G, Gall HC (2013) Replicating mining studies with sofas. In: Proceedings of the 10th working conference on mining software repositories. IEEE Press, pp 363–372
Git (2014) Git log. http://git-scm.com/docs/git-log
Git (2015) Github. https://github.com
Godfrey MW, Tu Q (2000) Evolution in open source software: a case study. In: Proceedings of the international conference on software maintenance, 2000. IEEE, pp 131–142
Google (2015a) Google cloud platform. https://cloud.google.com
Google (2015b) Guava. https://code.google.com/p/guava-libraries/
Holt R, Pak JY (1996) GASE: visualizing software evolution-in-the-large. In: Proceedings of the 3rd working conference on reverse engineering, 1996. IEEE, pp 163–167
Kim M, Sazawal V, Notkin D, Murphy G (2005) An empirical study of code clone genealogies. In: ACM SIGSOFT software engineering notes, vol 30. ACM, pp 187–196
Koschke R (2005) What architects should know about reverse engineering and reengineering. In: Null. IEEE, pp 4–10
Koschke R (2009) Architecture reconstruction. In: Software engineering. Springer, pp 140–173
Kruchten PB (1995) The 4+1 view model of architecture. IEEE Softw 12(6):42–50
Langhammer M, Shahbazian A, Medvidovic N, Reussner R (2016) Automated extraction of rich software models from limited system information. In: Proceedings of the 13th working IEEE/IFIP conference on software architecture (WICSA). IEEE
Le DM, Behnamghader P, Garcia J, Link D, Shahbazian A, Medvidovic N (2015) An empirical study of architectural change in open-source software systems. In: Proceedings of the 12th working conference on mining software repositories (MSR)
Le DM, Carrillo C, Capilla R, Medvidovic N (2016) Relating architectural decay and sustainability of software systems. In: Proceedings of the 13th working IEEE/IFIP conference on software architecture (WICSA). IEEE
Lehman MM (1980) Programs, life cycles, and laws of software evolution. Proc IEEE
Lutellier T, Chollack D, Garcia J, Tan L, Rayside D, Medvidovic N, Kroeger R (2015) Comparing software architecture recovery techniques using accurate dependencies. In: Proceedings of the 37th international conference on software engineering (ICSE 2015). Software Engineering in Practice Track
Mahajan S, Li B, Behnamghader P, Halfond WG (2016) Using visual symptoms for debugging presentation failures in web applications. In: Proceedings of the 9th IEEE international conference on software testing, verification, and validation (ICST)
Maqbool O, Babri H et al (2007) Hierarchical clustering for software architecture recovery. IEEE Trans Softw Eng 33(11):759–780
McCallum A (2002) MALLET: a machine learning for language toolkit
Medvidovic N (1996) ADLs and dynamic architecture changes. In: Joint proceedings of the second international software architecture workshop (ISAW-2) and international workshop on multiple perspectives in software development (Viewpoints ’96) on SIGSOFT ’96 workshops. ACM, pp 24–27
Mengué O (2014) Svn graph branches. https://code.google.com/p/svn-graph-branches/
Munkres J (1957) Algorithms for the assignment and transportation problems. J Soc Ind Appl Math 5(1):32–38
Murgia A, Concas G, Pinna S, Tonelli R, Turnu I (2009) Empirical study of software quality evolution in open source projects using agile practices. In: Proceedings of the 1st international symposium on emerging trends in software metrics, 2009. Lulu.com
Nakamura T, Basili VR (2005) Metrics of software architecture changes based on structural distance. In: 11th IEEE international symposium on software metrics, 2005. IEEE, pp 24–24
Oreizy P, Medvidovic N, Taylor RN (1998) Architecture-based runtime software evolution. In: Proceedings of the 20th international conference on Software engineering. IEEE Computer Society, pp 177–186
Perry DE, Wolf AL (1992) Foundations for the study of software architecture. ACM SIGSOFT Softw Eng Notes 17(4):40–52
PMD (2015) Pmd documentation. http://pmd.sourceforge.net
Robles G (2010) Replicating msr: a study of the potential replicability of papers published in the mining software repositories proceedings. In: 7th IEEE working conference on mining software repositories (MSR), 2010. IEEE, pp 171–180
Shahbazian A, Edwards G, Medvidovic N (2016) An end-to-end domain specific modeling and analysis platform. In: IEEE/ACM 38th IEEE international conference on software engineering (ICSE), 2016. IEEE
Shirali S, Vasudeva HL (2005) Metric spaces. Springer Science & Business Media
Struts (2014) Struts wiki. http://en.wikipedia.org/wiki/Apache_Struts
Taylor R, Medvidovic N, Dashofy E (2009) Software architecture: foundations, theory, and practice
Tu Q, Godfrey MW (2002) An integrated approach for studying architectural evolution. In: Proceedings of 10th international workshop on program comprehension, 2002. IEEE, pp 127–136
Tzerpos V, Holt RC (1999) MoJo: a distance metric for software clusterings. In: Proceedings of 6th working conference on reverse engineering, 1999. IEEE, pp 187–193
Tzerpos V, Holt RC (2000) ACDC: an algorithm for comprehension-driven clustering. In: WCRE. IEEE, p 258
Van Deursen A, Hofmeister C, Koschke R, Moonen L, Riva C (2004) Symphony: view-driven software architecture reconstruction. In: Proceedings of the 4th working IEEE/IFIP conference on software architecture, 2004 (WICSA 2004). IEEE, pp 122–132
Wen Z, Tzerpos V (2004) An effectiveness measure for software clustering algorithms. In: Proceedings of the 12th IEEE international workshop on program comprehension, 2004. IEEE, pp 194–203
Wettel R, Lanza M (2008) Visual exploration of large-scale system evolution. In: 15th working conference on reverse engineering, 2008 (WCRE’08). IEEE, pp 219–228
Xing EP, Jordan MI, Russell S, Ng AY (2002) Distance metric learning with application to clustering with side-information. In: Advances in neural information processing systems, pp 505–512
Additional information
Communicated by: Romain Robbes, Martin Pinzger and Yasutaka Kamei
Pooyan Behnamghader and Duc Minh Le contributed equally to this work.
Cite this article
Behnamghader, P., Le, D.M., Garcia, J. et al. A large-scale study of architectural evolution in open-source software systems. Empir Software Eng 22, 1146–1193 (2017). https://doi.org/10.1007/s10664-016-9466-0
DOI: https://doi.org/10.1007/s10664-016-9466-0