An empirical study of code clone genealogies

M Kim, V Sazawal, D Notkin, G Murphy - Proceedings of the 10th …, 2005 - dl.acm.org
Abstract
It has been broadly assumed that code clones are inherently bad and that eliminating clones by refactoring would solve the problems of code clones. To investigate the validity of this assumption, we developed a formal definition of clone evolution and built a clone genealogy tool that automatically extracts the history of code clones from a source code repository. Using our tool we extracted clone genealogy information for two Java open source projects and analyzed their evolution.
Our study contradicts some conventional wisdom about clones. In particular, refactoring may not always improve software with respect to clones for two reasons. First, many code clones exist in the system for only a short time; extensive refactoring of such short-lived clones may not be worthwhile if they are likely to diverge from one another very soon. Second, many clones, especially long-lived clones that have changed consistently with other elements in the same group, are not easily refactorable due to programming language limitations. These insights show that refactoring will not help in dealing with some types of clones and open up opportunities for complementary clone maintenance tools that target these other classes of clones.