A comparison of source code plagiarism detection engines
T Lancaster, F Culwin - Computer Science Education, 2004 - Taylor & Francis
Automated techniques for finding plagiarism in student source code submissions have been in use for over 20 years, and many engines and services are available. This paper reviews the literature on the major modern detection engines and compares them on the metrics and techniques they deploy. The most common and effective techniques generally involve tokenising student submissions and then searching pairs of submissions for long common substrings, an example of what is defined to be a paired structural metric. Computing academics are recommended to use one of the two Web-based detection engines, MOSS and JPlag. Whilst detection is well established, there are still areas where further research would be useful, particularly where visual support of the investigation process is possible.
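
The general idea of tokenising two submissions and then looking for a long common run of tokens can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the algorithm of MOSS, JPlag, or any other engine discussed in the paper: the keyword set, the token classes (ID, NUM), and the function names tokenise and longest_common_run are all hypothetical choices made for the example, and the matching step is a plain dynamic-programming longest common substring over token lists.

```python
import re

# Hypothetical, deliberately tiny keyword set; a real engine would use a
# full lexer for the target language.
KEYWORDS = {"if", "else", "for", "while", "return", "def", "in"}

# Identifiers/keywords, integer literals, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenise(source):
    """Map source text to a list of abstract tokens.

    Identifiers collapse to "ID" and integer literals to "NUM", so that
    renaming variables or changing constants does not alter the stream.
    """
    tokens = []
    for lexeme in TOKEN_RE.findall(source):
        if lexeme in KEYWORDS:
            tokens.append(lexeme)
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append("ID")
        elif lexeme.isdigit():
            tokens.append("NUM")
        else:
            tokens.append(lexeme)
    return tokens


def longest_common_run(a, b):
    """Length of the longest contiguous token run shared by lists a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at the previous tokens.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    first = "total = 0\nfor x in items: total = total + x"
    second = "s = 0\nfor v in data: s = s + v"
    run = longest_common_run(tokenise(first), tokenise(second))
    print(run, "of", len(tokenise(first)), "tokens in common")
```

Because the two sample submissions differ only in identifier names, their token streams are identical and the common run spans the whole stream. A paired metric of this kind is computed for each pair of submissions, so the quadratic cost of the comparison grows quickly with class size; this sketch makes no attempt to address that.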