research-article

Open access

Automated clustering and program repair for introductory programming assignments

Authors:

Sumit Gulwani, Ivan Radiček, Florian Zuleger

PLDI 2018: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation

Pages 465 - 480

https://doi.org/10.1145/3192366.3192387

Published: 11 June 2018

Abstract

Providing feedback on programming assignments is a tedious task for the instructor, and even impossible in large Massive Open Online Courses (MOOCs) with thousands of students. Previous research has suggested that program repair techniques can be used to generate feedback in programming education. In this paper, we present a novel, fully automated program repair algorithm for introductory programming assignments. The key idea of the technique, which enables automation and scalability, is to use the existing correct student solutions to repair the incorrect attempts. We evaluate the approach in two experiments: (I) We evaluate the number, size and quality of the generated repairs on 4,293 incorrect student attempts from an existing MOOC. We find that our approach can repair 97% of student attempts, while 81% of those are small repairs of good quality. (II) We conduct a preliminary user study on performance and repair usefulness in an interactive teaching setting. We obtain promising initial results (an average usefulness grade of 3.4 on a scale from 1 to 5), and conclude that our approach can be used in an interactive setting.
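The key idea stated in the abstract, reusing existing correct solutions to repair an incorrect attempt, can be illustrated with a deliberately simplified sketch. This is not the paper's algorithm: it assumes plain line-level text similarity from Python's standard `difflib` module in place of whatever clustering and matching the paper defines, and the function names `closest_correct` and `suggest_repair` as well as the sample programs are hypothetical.

```python
import difflib


def closest_correct(attempt, correct_solutions):
    """Return the correct solution whose lines are most similar to the attempt.

    Assumption: textual similarity stands in for the paper's matching step.
    """
    def similarity(candidate):
        return difflib.SequenceMatcher(
            None, attempt.splitlines(), candidate.splitlines()
        ).ratio()

    return max(correct_solutions, key=similarity)


def suggest_repair(attempt, correct_solutions):
    """List the (operation, old_lines, new_lines) edits that turn the attempt
    into its nearest correct solution."""
    target = closest_correct(attempt, correct_solutions)
    old, new = attempt.splitlines(), target.splitlines()
    matcher = difflib.SequenceMatcher(None, old, new)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only the lines that must change
    ]


# Hypothetical data: two correct solutions and one buggy attempt.
correct_solutions = [
    "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s",
    "def total(xs):\n    return sum(xs)",
]
attempt = "def total(xs):\n    s = 1\n    for x in xs:\n        s += x\n    return s"

repairs = suggest_repair(attempt, correct_solutions)
for operation, old_lines, new_lines in repairs:
    print(operation, old_lines, "->", new_lines)
```

In this toy setting the loop-based correct solution is the closest match, so the suggested repair is a single-line change to the initial value of `s`. Picking the nearest correct solution is what keeps the suggested repair small in this sketch.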




Index Terms

Automated clustering and program repair for introductory programming assignments
1. Applied computing
  1. Education
    1. Computer-assisted instruction
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging


Published In


PLDI 2018: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation

June 2018

825 pages

ISBN:9781450356985

DOI:10.1145/3192366

General Chair: Jeffrey S. Foster, University of Maryland at College Park, USA
Program Chair: Dan Grossman, University of Washington, USA

ACM SIGPLAN Notices Volume 53, Issue 4
PLDI '18
April 2018
834 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/3296979
Editor: Matthew Fluet, Rochester Institute of Technology

Copyright © 2018 Owner/Author.

This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

PLDI '18

Sponsor:

SIGPLAN

PLDI '18: ACM SIGPLAN Conference on Programming Language Design and Implementation

June 18 - 22, 2018

Philadelphia, PA, USA

Acceptance Rates

Overall Acceptance Rate 406 of 2,067 submissions, 20%
