research-article

Is cloned code older than non-cloned code?

Author:

Jens Krinke

IWSC '11: Proceedings of the 5th International Workshop on Software Clones

Pages 28 - 33

https://doi.org/10.1145/1985404.1985410

Published: 23 May 2011


Abstract

It is still debated whether cloned code causes increased maintenance effort. If cloned code is more stable than non-cloned code, i.e., it is changed less often, it will require less maintenance effort. The more stable cloned code is, the longer it will have gone unchanged, so its stability can be estimated through the code's age. This paper presents a study on the average age of cloned code. For three large open source systems, the age of every line of source code is computed as the date of the last change in that line. In addition, every line is categorized by whether it belongs to cloned code as detected by a clone detector. The study shows that on average, cloned code is older than non-cloned code. Moreover, if a file has cloned code, the average age of the cloned code of the file is lower (that is, its last change lies further in the past) than the average age of the non-cloned code in the same file. The results support the previous findings that cloned code is more stable than non-cloned code.
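The per-file comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's tooling: it assumes each source line has already been annotated with its last-change date (for example from a repository's blame output) and with a clone/non-clone flag from some clone detector. The function name `average_last_change`, the input shape, and the sample data are all hypothetical.

```python
from datetime import date
from statistics import mean


def average_last_change(lines):
    """Return (cloned_avg, non_cloned_avg) as dates, or None for an empty group.

    `lines` is a sequence of (last_change_date, is_cloned) pairs, one per
    source line. Both the input shape and this helper are hypothetical.
    """
    lines = list(lines)
    cloned = [d.toordinal() for d, is_cloned in lines if is_cloned]
    non_cloned = [d.toordinal() for d, is_cloned in lines if not is_cloned]

    def avg(ordinals):
        # Average the dates via their ordinal day numbers.
        return date.fromordinal(round(mean(ordinals))) if ordinals else None

    return avg(cloned), avg(non_cloned)


# Made-up example data for one file: two cloned lines, one non-cloned line.
sample = [
    (date(2005, 1, 1), True),
    (date(2005, 1, 3), True),
    (date(2009, 6, 1), False),
]
cloned_avg, non_cloned_avg = average_last_change(sample)
# An earlier average last-change date means the code is older.
cloned_is_older = cloned_avg < non_cloned_avg
```

Because age is expressed as a last-change date, "older" corresponds to a smaller (earlier) average, which is why the comparison uses `<`.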


Index Terms

Is cloned code older than non-cloned code?
1. Social and professional topics
  1. Professional topics
    1. Management of computing and information systems
      1. Project and people management
2. Software and its engineering
  1. Software creation and management
    1. Software development process management
  2. Software notations and tools
    1. Software configuration management and version control systems
    2. Software libraries and repositories


Reviews

Reviewer: Brian D. Goodman

One method of reducing software development risk is to reuse proven code: typically, source code that a developer or team has used before, that is formally an asset, or that is trusted to perform in a stable manner when copied and pasted. This is a common practice since there is a lot of duplicate capability from system to system. Krinke's study looks at this phenomenon and tries to understand whether cloned code complicates software over time. If you are part of a development team and have ever wondered if code cloning is risky, this paper is worth reading.

Krinke presents the approaches of empirical studies that have come before him, and differentiates his approach by leveraging the source code repositories of three open-source projects (ArgoUML, JBoss, and jEdit). Most source code repositories, such as the concurrent versions system (CVS) and Subversion, track changes line by line; the premise is that if cloned code changes less often than noncloned code, it would seem that the changes in the overall system are occurring in the newly authored code. Krinke's initial findings show that "cloned code is usually older than non-cloned code" and "the cloned code in a file is usually [81 percent] older than the non-cloned code in the same file." In the projects evaluated, changes over time were not focused on the cloned code, and thus it appears to be more stable overall.

The study's findings are interesting because it is often thought that introducing new code or code not authored by the acting development team ("not invented here") introduces risk. This may still be true, but it is not reflected in changes tracked at the code level.

There are, however, some challenges with this research. For example, how clone detection is implemented can radically affect what is recorded (false positives or false negatives). Formatting can often be recorded as a change even though it has no effect on what the code does. The inclusion of test code would likely increase the amount of cloned code detected since it is often auto-generated. More interesting is Krinke's proposal that developers avoid modifying cloned code when it's being leveraged. One of the reasons the code was lifted in the first place was that it did something well known to the exploiter. If the developers change it, the trust that it should operate the same way may be compromised. This behavior may affect the results of this study.

For additional details on these issues and reviews of related work, be sure to read the whole paper. Otherwise, be on the lookout for a follow-up study that addresses some of the potential shortcomings. As we better understand what happens in practice, more faith and clarity can be brought to bear in future executions.

Online Computing Reviews Service


Information & Contributors

Information

Published In

IWSC '11: Proceedings of the 5th International Workshop on Software Clones

May 2011

92 pages

ISBN:9781450305884

DOI:10.1145/1985404

Program Chairs:
James R. Cordy, Queen's University, Canada
Katsuro Inoue, Osaka University, Japan
Stanislaw Jarzabek, National University of Singapore, Singapore
Rainer Koschke, University of Bremen, Germany

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article

Conference

ICSE11

Sponsor:

SIGSOFT

ICSE11: International Conference on Software Engineering

May 23, 2011

Waikiki, Honolulu, HI, USA
