Clone detection using abstract syntax suffix trees

R Koschke, R Falke, P Frenzel
2006 13th Working Conference on Reverse Engineering, 2006
Reusing software through copying and pasting is a continuous plague in software development despite the fact that it creates serious maintenance problems. Various techniques have been proposed to find duplicated, redundant code (also known as software clones). A recent study has compared these techniques and shown that token-based clone detection based on suffix trees is extremely fast but yields clone candidates that are often not syntactic units. Current techniques based on abstract syntax trees, on the other hand, find syntactic clones but are considerably less efficient. This paper describes how we can make use of suffix trees to find clones in abstract syntax trees. This new approach is able to find syntactic clones in linear time and space. The paper reports the results of several large case studies in which we empirically compare the new technique to other techniques using the Bellon benchmark for clone detectors.
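
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes one plausible reading: serialize the abstract syntax tree in preorder, look for repeated runs in that sequence, and keep only runs that cover a complete subtree (a syntactic unit). Python's standard ast module stands in for the parser, and a naive sorted list of suffixes stands in for the suffix tree, so this sketch is not linear in time or space. The names serialize, find_clones, and SAMPLE are hypothetical.

```python
import ast


def serialize(tree):
    """Return the AST in preorder as (type name, subtree size) pairs."""
    out = []

    def visit(node):
        idx = len(out)
        out.append(None)  # placeholder until the subtree size is known
        for child in ast.iter_child_nodes(node):
            visit(child)
        out[idx] = (type(node).__name__, len(out) - idx)

    visit(tree)
    return out


def find_clones(source, min_nodes=5):
    """Return (pos_a, pos_b, size, type name) for repeated complete subtrees."""
    symbols = serialize(ast.parse(source))
    # Assumption: a sorted list of suffixes replaces the suffix tree here.
    # Sorting full suffixes is far from linear; it only keeps the sketch short.
    order = sorted(range(len(symbols)), key=lambda i: symbols[i:])
    clones = []
    for a, b in zip(order, order[1:]):
        # Length of the common prefix of two neighbouring suffixes.
        lcp = 0
        while (a + lcp < len(symbols) and b + lcp < len(symbols)
               and symbols[a + lcp] == symbols[b + lcp]):
            lcp += 1
        size = symbols[a][1]
        # Keep the match only if it spans one whole subtree of useful size.
        if lcp >= size >= min_nodes:
            i, j = sorted((a, b))
            clones.append((i, j, size, symbols[i][0]))
    return clones


# Two functions with the same structure but different identifiers.
SAMPLE = """
def f(a, b):
    total = a + b
    return total * 2

def g(x, y):
    s = x + y
    return s * 2
"""

if __name__ == "__main__":
    for i, j, size, kind in find_clones(SAMPLE):
        print(kind, "of", size, "nodes at preorder positions", i, "and", j)
```

Because only node types and subtree sizes are compared, identifiers and literal values are ignored, so renamed copies still match. Only neighbouring suffixes are compared, so with three or more copies the sketch reports a chain of pairs rather than every pair, and nested subtrees of a reported clone are reported as well.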