DOI: 10.1145/2998551.2998558
research-article

Evaluating plagiarism detection software for introductory programming assignments

Published: 04 July 2016

Abstract

Plagiarism is an issue that all educators have had to deal with. Large numbers of students and assignments have led to the development of automated systems that detect code similarities in order to identify submissions that may have been plagiarised. These systems are of great value to assessors because they allow submissions to be processed automatically. However, they also have drawbacks. In this study we explore and analyse the differences between various systems and compare their performance with manual checking. We first consider the methods students use when committing plagiarism. We then examine more closely the systems that can aid plagiarism detection, from their characteristics to how they work. In the process, we determine how these systems compare with our own system and how suitable they are for identifying submissions that may have been plagiarised in our introductory C++ course.

    Information

    Published In

    CSERC '16: Proceedings of the Computer Science Education Research Conference 2016
    July 2016
    52 pages
    ISBN: 9781450344920
    DOI: 10.1145/2998551
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Automatic detection
    2. Code plagiarism
    3. Introductory programming

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    CSERC '16

    Acceptance Rates

    CSERC '16 Paper Acceptance Rate 5 of 14 submissions, 36%;
    Overall Acceptance Rate 24 of 60 submissions, 40%

    Cited By

    • (2024) The work of art in the age of artificial intelligibility. AI & SOCIETY. DOI: 10.1007/s00146-023-01845-4. Online publication date: 28-Mar-2024
    • (2023) Codeflex 2.0. Internet of Behaviors Implementation in Organizational Contexts, pages 40-67. DOI: 10.4018/978-1-6684-9039-6.ch003. Online publication date: 30-Jun-2023
    • (2022) Evaluation of Different Plagiarism Detection Methods: A Fuzzy MCDM Perspective. Applied Sciences, 12:9 (4580). DOI: 10.3390/app12094580. Online publication date: 30-Apr-2022
    • (2022) Measuring Plagiarism in Introductory Programming Course Assignments. 2022 8th International Conference on Information Technology Trends (ITT), pages 80-87. DOI: 10.1109/ITT56123.2022.9863961. Online publication date: 25-May-2022
    • (2022) Students’ perception of academic dishonesty in programming courses. Journal of Further and Higher Education, 47:1 (72-88). DOI: 10.1080/0309877X.2022.2093630. Online publication date: 12-Jul-2022
    • (2021) Source Code Plagiarism Detection in an Educational Context: A Literature Mapping. 2021 IEEE Frontiers in Education Conference (FIE), pages 1-9. DOI: 10.1109/FIE49875.2021.9637155. Online publication date: 13-Oct-2021
    • (2020) Choosing Code Segments to Exclude from Code Similarity Detection. Proceedings of the Working Group Reports on Innovation and Technology in Computer Science Education, pages 1-19. DOI: 10.1145/3437800.3439201. Online publication date: 17-Jun-2020
    • (2020) Preprocessing for Source Code Similarity Detection in Introductory Programming. Proceedings of the 20th Koli Calling International Conference on Computing Education Research, pages 1-10. DOI: 10.1145/3428029.3428065. Online publication date: 19-Nov-2020
    • (2019) Plagiarism in Programming Assessments. ACM Transactions on Computing Education, 20:1 (1-28). DOI: 10.1145/3371156. Online publication date: 9-Dec-2019
    • (2019) Similarity Detection Techniques for Academic Source Code Plagiarism and Collusion: A Review. 2019 IEEE International Conference on Engineering, Technology and Education (TALE), pages 1-8. DOI: 10.1109/TALE48000.2019.9225953. Online publication date: Dec-2019
