DOI: 10.1145/3321408.3322862
poster

Measuring code similarity using word mover's distance for programming course

Published: 17 May 2019

Abstract

Teachers increasingly ask students to submit their assignments online, in face-to-face courses as well as online ones. Plagiarism is also becoming more serious because resources are so easy to find on the Internet, especially in computer programming courses. This paper aims to develop a robust automated technique for detecting code plagiarism in programming courses. After analyzing and summarizing the state of the art in code plagiarism detection, we develop a more robust detection technique by combining word2vec with the Word Mover's Distance (WMD) similarity metric. We consider the different plagiarism methods students use when they submit program source code. We then collect more than 20 thousand code submissions from our introductory C++ programming course for non-major students and manually check whether each one is plagiarized. In the process, we examine how our proposed method compares with two other main algorithms and how suitable each is for different plagiarism characteristics. The results obtained on the dataset indicate that our approach is well suited to detecting different types of code plagiarism. We conclude that incorporating the WMD similarity metric is crucial for improved effectiveness and adaptability.
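
The abstract does not describe the implementation, so the following is only a minimal sketch of the general idea of pairing token embeddings with a WMD-style distance, not the authors' method. It makes several assumptions: the 2-D embedding table is hand-picked (in practice the vectors would be learned by word2vec from tokenized submissions), the tokenizer that maps identifiers to `ID` and numbers to `NUM` is a hypothetical normalization step, and the distance is the relaxed WMD lower bound (each token's weight moves entirely to its nearest token in the other snippet) rather than the exact WMD, which would need an optimal-transport solver.

```python
import math
import re
from collections import Counter

# Hypothetical 2-D token embeddings, hand-picked for illustration only.
# Real vectors would come from word2vec trained on tokenized submissions.
TOY_EMBEDDINGS = {
    "int": (0.0, 0.0),
    "long": (0.1, 0.0),
    "for": (1.0, 0.0),
    "while": (1.0, 0.2),
    "ID": (0.0, 1.0),
    "NUM": (0.5, 1.0),
    "+": (1.0, 1.0),
    "=": (0.5, 0.5),
}
KEYWORDS = {"int", "long", "for", "while"}
OPERATORS = {"+", "="}


def tokenize(code):
    """Split code into tokens; identifiers become ID, numbers become NUM."""
    tokens = []
    for tok in re.findall(r"[A-Za-z_]\w*|\d+|[+=]", code):
        if tok in KEYWORDS or tok in OPERATORS:
            tokens.append(tok)
        elif tok.isdigit():
            tokens.append("NUM")
        else:
            tokens.append("ID")
    return tokens


def nbow(tokens):
    """Normalized bag of words: token -> share of the snippet's tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}


def relaxed_wmd(code_a, code_b, emb=TOY_EMBEDDINGS):
    """Relaxed WMD: a lower bound on the exact word mover's distance."""
    a, b = nbow(tokenize(code_a)), nbow(tokenize(code_b))
    if not a or not b:
        raise ValueError("both snippets must contain at least one token")

    def one_sided(src, dst):
        # Each source token sends all of its weight to the closest target token.
        return sum(w * min(math.dist(emb[s], emb[t]) for t in dst)
                   for s, w in src.items())

    # The tighter of the two one-sided bounds.
    return max(one_sided(a, b), one_sided(b, a))


original = "int total = a + 1"
renamed = "int sum = x + 2"       # identifiers and literal changed
retyped = "long total = a + 1"    # data type changed
unrelated = "while for"

for name, other in [("renamed", renamed), ("retyped", retyped),
                    ("unrelated", unrelated)]:
    print(name, relaxed_wmd(original, other))
```

Under these toy assumptions, the renamed snippet is at distance 0.0 from the original because normalization makes the token bags identical, the retyped snippet is at a small nonzero distance because `int` and `long` were placed close together, and the unrelated snippet is much farther away. A lower distance means higher similarity; any threshold for flagging a pair would have to be chosen on labeled data.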


Cited By

  • (2021) Source Code Plagiarism Detection in an Educational Context: A Literature Mapping. 2021 IEEE Frontiers in Education Conference (FIE), 1-9. DOI: 10.1109/FIE49875.2021.9637155. Online publication date: 13-Oct-2021

      Published In

      ACM Other conferences
      ACM TURC '19: Proceedings of the ACM Turing Celebration Conference - China
      May 2019
      963 pages
      ISBN:9781450371582
      DOI:10.1145/3321408
      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. automatic detection
      2. computer science education
      3. plagiarism

      Qualifiers

      • Poster

      Conference

      ACM TURC 2019
