research-article

Open access

Style Anomalies Can Suggest Cheating in CS1 Programs

Authors:

Benjamin Denzler,

Mariam Salloum

ITiCSE 2024: Proceedings of the 2024 on Innovation and Technology in Computer Science Education V. 1

Pages 381 - 387

https://doi.org/10.1145/3649217.3653626

Published: 03 July 2024

Abstract

Student cheating on at-home programming assignments is a well-known problem. A key contributor is externally obtained solutions from websites, contractors, and, more recently, generative AI. In our experience, such externally obtained solutions often use coding styles that depart from a class's style, which we call "style anomalies," such as using untaught or advanced constructs like pointers or ternary operators, or having different indenting or brace usage from the class style. We developed a tool to auto-count style anomalies. For six labs across four terms in 2021-2022, with 50 sampled students per lab, we found that 18% of submissions on average had unusually high style anomaly counts. Importantly, 8% of submissions on average had a high style anomaly count but were not flagged by a similarity checker, meaning 8% of submissions are suspicious but might have been missed if using similarity checking alone. We repeated a similar analysis for Spring 2023, when generative AI (ChatGPT) was gaining popularity, and the numbers rose to 26% and 18%, respectively. Detailed investigations by instructors led to a majority (but not all) of high-style-anomaly submissions being deemed cheating. Even for high-similarity submissions, counting style anomalies can help instructors focus investigations on the most likely cheating cases and can strengthen cases sent to student conduct offices. With the rise of externally obtained solutions from websites, contractors, and generative AI, counting style anomalies may become an increasingly important complement to similarity checking; in fact, it is now the primary cheat-detection tool in our CS1 at a large state university, with similarity checking secondary.
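
The abstract does not describe how the tool counts anomalies, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a C++ course whose class style uses space indentation and same-line opening braces and has not taught pointers or ternary operators. The pattern names, the regular expressions, the `count_style_anomalies` and `is_suspicious` helpers, and the threshold of 3 are all hypothetical choices made for illustration.

```python
import re

# Hypothetical anomaly rules for an assumed C++ CS1 class style.
# These are illustrative assumptions, not the rules used by the paper's tool.
ANOMALY_PATTERNS = {
    # Ternary operator: a '?' followed later on the same line by ':'.
    "ternary_operator": re.compile(r"\?[^:;\n]*:"),
    # Pointer declaration such as "int* p" or "char *s".
    "pointer_declaration": re.compile(
        r"\b(?:int|char|double|float|bool|void)\s*\*+\s*\w+"
    ),
    # Opening brace alone on a line (class style assumed to be same-line braces).
    "brace_on_own_line": re.compile(r"^\s*\{\s*$"),
    # Tab indentation (class style assumed to use spaces).
    "tab_indent": re.compile(r"^\t+"),
}


def count_style_anomalies(source: str) -> dict[str, int]:
    """Count the lines of `source` that match each anomaly pattern."""
    counts = {name: 0 for name in ANOMALY_PATTERNS}
    for line in source.splitlines():
        # Naive comment stripping: drops everything after "//".
        # This would also truncate a string literal containing "//".
        code = line.split("//", 1)[0]
        for name, pattern in ANOMALY_PATTERNS.items():
            if pattern.search(code):
                counts[name] += 1
    return counts


def is_suspicious(counts: dict[str, int], threshold: int = 3) -> bool:
    """Flag a submission whose total anomaly count reaches the threshold.

    The threshold is an arbitrary placeholder; a real course would need to
    calibrate it against typical submissions for each lab.
    """
    return sum(counts.values()) >= threshold
```

A high count from a sketch like this is only a prompt for closer review. As the abstract notes, instructor investigation found that a majority, but not all, of high-anomaly submissions were cheating.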

Index Terms

1. Social and professional topics
  1. Professional topics
    1. Computing education
      1. Computing education programs
        Computer science education
        CS1

Published In

ITiCSE 2024: Proceedings of the 2024 on Innovation and Technology in Computer Science Education V. 1

July 2024

776 pages

ISBN: 9798400706004

DOI: 10.1145/3649217

General Chairs:
Mattia Monga, University of Milan, Italy
Violetta Lonati, University of Milan, Italy
Erik Barendsen, Radboud University, The Netherlands

Program Chairs:
Judithe Sheard, Monash University, Australia
James Paterson, Glasgow Caledonian University, Scotland

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGCSE: ACM Special Interest Group on Computer Science Education

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

ITiCSE 2024

Sponsor:

SIGCSE

ITiCSE 2024: Innovation and Technology in Computer Science Education

July 8 - 10, 2024

Milan, Italy

Acceptance Rates

Overall Acceptance Rate 552 of 1,613 submissions, 34%
