research-article

A feasibility study of using automated program repair for introductory programming assignments

Authors:

Jooyong Yi,

Umair Z. Ahmed,

Amey Karkare,

Shin Hwei Tan,

Abhik Roychoudhury

ESEC/FSE 2017: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering

Pages 740 - 751

https://doi.org/10.1145/3106237.3106262

Published: 21 August 2017

Abstract

Although intelligent tutoring systems for programming (ITSP) education have long attracted interest, their widespread use has been hindered by the difficulty of generating personalized feedback automatically. Meanwhile, automated program repair (APR) is an emerging technology that automatically fixes software bugs, and it has been shown that APR can fix bugs in large real-world software. In this paper, we study the feasibility of marrying intelligent programming tutoring and APR. We perform our feasibility study with four state-of-the-art APR tools (GenProg, AE, Angelix, and Prophet) and 661 programs written by students taking an introductory programming course. We found that when APR tools are used out of the box, only about 30% of the programs in our dataset are repaired. This low repair rate is largely due to student programs often being significantly incorrect; in contrast, professional software to which APR has been successfully applied typically fails only a small portion of its tests. To bridge this gap, we adopt in APR a new repair policy akin to the hint generation policy employed in existing ITSP. This new repair policy admits partial repairs that address only some of the failing tests, which results in an 84% improvement in repair rate. We also performed a user study with 263 novice students and 37 graders, and identified an understudied problem: while novice students do not seem to know how to make effective use of generated repairs as hints, graders do seem to benefit from repairs.
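The difference between the conventional repair policy and the partial repair policy described in the abstract can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the implementation used in the study: the `TestResults` type and both function names are hypothetical, and the requirement that previously passing tests keep passing is an assumption that the abstract does not state.

```python
from typing import Dict

# Hypothetical representation: test name -> True if the program passes it.
TestResults = Dict[str, bool]


def is_complete_repair(after: TestResults) -> bool:
    """Conventional APR policy: the candidate must pass every test."""
    return all(after.values())


def is_partial_repair(before: TestResults, after: TestResults) -> bool:
    """Relaxed policy: the candidate fixes at least one failing test.

    Assumption (not stated in the abstract): tests that passed before
    must keep passing, so a partial repair never introduces a regression.
    """
    no_regression = all(after[t] for t, ok in before.items() if ok)
    newly_fixed = any(after[t] for t, ok in before.items() if not ok)
    return no_regression and newly_fixed


# A student program failing three of four tests, and one candidate patch.
before = {"t1": True, "t2": False, "t3": False, "t4": False}
candidate = {"t1": True, "t2": True, "t3": False, "t4": False}

print(is_complete_repair(candidate))         # False
print(is_partial_repair(before, candidate))  # True
```

Under the conventional policy the candidate above is rejected because two tests still fail, whereas the relaxed policy accepts it as a hint toward a full fix. That is the kind of case the abstract points to when it notes that student programs often fail many tests.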

Index Terms

A feasibility study of using automated program repair for introductory programming assignments
1. Applied computing
  1. Education
    1. Computer-assisted instruction
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        1. Software testing and debugging



Published In

ESEC/FSE 2017: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering

August 2017

1073 pages

ISBN:9781450351058

DOI:10.1145/3106237

General Chairs:
Eric Bodden (Paderborn University, Germany / Fraunhofer IEM, Germany),
Wilhelm Schäfer (Paderborn University, Germany)

Program Chairs:
Arie van Deursen (Delft University of Technology, Netherlands),
Andrea Zisman (Open University, UK)

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States


Badges

Artifacts Evaluated & Functional

Qualifiers

Research-article

Conference

ESEC/FSE'17

Sponsor:

SIGSOFT

ESEC/FSE'17: Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering

September 4 - 8, 2017

Paderborn, Germany

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 88
Total Downloads: 931
Downloads (Last 12 months): 78
Downloads (Last 6 weeks): 4

Reflects downloads up to 05 Mar 2025

Citations

Cited By

Paiva J, Leal J, Figueira Á (2025). Incremental Repair Feedback on Automated Assessment of Programming Assignments. Electronics 14:4 (819). Online publication date: 19-Feb-2025.
https://doi.org/10.3390/electronics14040819
Ahmed U, Sahai S, Leong B, Karkare A, Stone J, Yuen T, Shoop L, Rebelsky S, Prather J (2025). Feasibility Study of Augmenting Teaching Assistants with AI for CS1 Programming Feedback. Proceedings of the 56th ACM Technical Symposium on Computer Science Education V. 1 (11-17). Online publication date: 12-Feb-2025.
https://dl.acm.org/doi/10.1145/3641554.3701972
Zhang J, Wang C, Li A, Wang W, Li T, Liu Y, Filkov V, Ray B, Zhou M (2024). VulAdvisor: Natural Language Suggestion Generation for Software Vulnerability Repair. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (1932-1944). Online publication date: 27-Oct-2024.
https://dl.acm.org/doi/10.1145/3691620.3695555
Liu F, Liu Z, Zhao Q, Jiang J, Zhang L, Sun Z, Li G, Li Z, Ma Y, Filkov V, Ray B, Zhou M (2024). FastFixer: An Efficient and Effective Approach for Repairing Programming Assignments. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (669-680). Online publication date: 27-Oct-2024.
https://dl.acm.org/doi/10.1145/3691620.3695062
Xin Q, Wu H, Tang J, Liu X, Reiss S, Xuan J (2024). Detecting, Creating, Repairing, and Understanding Indivisible Multi-Hunk Bugs. Proceedings of the ACM on Software Engineering 1:FSE (2747-2770). Online publication date: 12-Jul-2024.
https://dl.acm.org/doi/10.1145/3660828
Yang B, Tian H, Pian W, Yu H, Wang H, Klein J, Bissyandé T, Jin S, Christakis M, Pradel M (2024). CREF: An LLM-Based Conversational Software Repair Framework for Programming Tutors. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (882-894). Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3650212.3680328
Xie L, Li C, Pei Y, Zhang T, Pan M, Christakis M, Pradel M (2024). BRAFAR: Bidirectional Refactoring, Alignment, Fault Localization, and Repair for Programming Assignments. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (856-868). Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3650212.3680326
Zhang J, Cambronero J, Gulwani S, Le V, Piskac R, Soares G, Verbruggen G (2024). PyDex: Repairing Bugs in Introductory Python Assignments using LLMs. Proceedings of the ACM on Programming Languages 8:OOPSLA1 (1100-1124). Online publication date: 29-Apr-2024.
https://dl.acm.org/doi/10.1145/3649850
Beyer D, Grunske L, Kettl M, Lingsch-Rosenfeld M, Raselimo M, Spinellis D, Constantinou E, Bacchelli A (2024). P3: A Dataset of Partial Program Fixes. Proceedings of the 21st International Conference on Mining Software Repositories (123-127). Online publication date: 15-Apr-2024.
https://dl.acm.org/doi/10.1145/3643991.3644889
S Kumar S, Adam Lones M, Maarek M, Zantout H (2024). Investigating the Proficiency of Large Language Models in Formative Feedback Generation for Student Programmers. Proceedings of the 1st International Workshop on Large Language Models for Code (88-93). Online publication date: 20-Apr-2024.
https://dl.acm.org/doi/10.1145/3643795.3648380