DOI: 10.1145/2939672.2939696
Research article, KDD Conference Proceedings

Question Independent Grading using Machine Learning: The Case of Computer Program Grading

Published: 13 August 2016

Abstract

Learning supervised models to grade open-ended responses is an expensive process. A model has to be trained separately for every prompt/question, which in turn requires expert-graded samples. In automatic programming evaluation specifically, the focus of this work, this issue is amplified: models have to be trained not only for every question but also for every language the question is offered in. Moreover, the availability of experts, and the time they take to create a labeled set of programs for each question, is a major bottleneck in scaling such a system. We address this issue by presenting a method to grade computer programs which requires no manually assigned labeled samples for grading responses to a new, unseen question. We extend our previous work [25], wherein we introduced a grammar of features to learn question-specific models. In this work, we propose a method to transform those features into a set of features that maintain their structural relation with the labels across questions. Using these features, we learn one supervised model across questions for a given language, which can then be applied to an ungraded response to an unseen question. We show that our method rivals the performance of both question-specific models and the consensus among human experts, while substantially outperforming extant ways of evaluating code. We demonstrate the system's value by deploying it to grade programs in a high-stakes assessment. The learning from this work is transferable to other grading tasks such as math question grading, and it also provides a new variation on the supervised learning approach.
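
To make the idea concrete, the following is a minimal, hypothetical sketch of such a pipeline, written against scikit-learn [19]. It is not the authors' implementation: the particular normalization scheme (rescaling a response's question-specific features against their distribution over a pool of correct solutions to the same question), the feature dimensionality, and all data below are assumptions made purely for illustration.

    import numpy as np
    from sklearn.linear_model import Ridge

    def question_independent_features(responses, correct_pool):
        # Hypothetical transformation: rescale question-specific feature
        # values by their mean/spread over correct solutions to the same
        # question, so the transformed features bear a comparable relation
        # to the grade regardless of which question produced them.
        mu = correct_pool.mean(axis=0)
        sigma = correct_pool.std(axis=0) + 1e-9   # guard against zero spread
        return (responses - mu) / sigma

    # Train ONE grading model across many questions of a single language.
    rng = np.random.default_rng(0)
    X_parts, y_parts = [], []
    for _ in range(5):                            # 5 synthetic training questions
        correct_pool = rng.normal(size=(30, 8))   # features of correct solutions
        responses = rng.normal(size=(50, 8))      # features of graded responses
        grades = rng.integers(1, 6, size=50)      # expert grades on a 1-5 scale
        X_parts.append(question_independent_features(responses, correct_pool))
        y_parts.append(grades)
    model = Ridge().fit(np.vstack(X_parts), np.concatenate(y_parts))

    # An unseen question needs only a pool of correct solutions (e.g.
    # responses passing all test cases), not freshly graded samples.
    new_pool = rng.normal(size=(30, 8))
    new_response = rng.normal(size=(1, 8))
    print(model.predict(question_independent_features(new_response, new_pool)))

Under this reading, labeling effort is incurred once, on the pooled training questions; grading a response to a new question requires only computing its features and the correct-solution statistics, consistent with the abstract's claim of needing no manually graded samples for unseen questions.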

Supplementary Material

MP4 File (kdd2016_singh_program_grading_01-acm.mp4)

References

[1] Automata. Aspiring Minds. http://www.aspiringminds.com/technology/automata.
[2] E-rater. ETS. http://www.ets.org/research/topics/as_nlp/writing_quality/.
[3] IntelliMetric. Vantage Learning. http://www.vantagelearning.com/products/intellimetric/.
[4] SpeechRater. ETS. https://www.ets.org/research/topics/as_nlp/speech/.
[5] Svar. Aspiring Minds. http://www.aspiringminds.com/technology/svar.
[6] V. Aggarwal, S. Srikant, and V. Shashidhar. Principles for using machine learning in the assessment of open response items: Programming assessment as a case study. In NIPS Workshop on Data Driven Education, 2013.
[7] J. Baxter. A Bayesian/information theoretic model of learning to learn via multiple task sampling. Machine Learning, 28(1):7--39, 1997.
[8] J. Bernstein, A. Van Moere, and J. Cheng. Validating automated speaking tests. Language Testing, 2010.
[9] M. Birenbaum and K. K. Tatsuoka. Open-ended versus multiple-choice response formats: it does make a difference for diagnostic purposes. Applied Psychological Measurement, 11(4):385--395, 1987.
[10] H. M. Breland. The direct assessment of writing skill: A measurement review. ETS Research Report Series, 1983(2):i--23, 1983.
[11] J. Burstein, L. Braden-Harder, M. Chodorow, S. Hua, B. Kaplan, K. Kukich, C. Lu, J. Nolan, D. Rock, and S. Wolff. Computer analysis of essay content for automated score prediction: A prototype automated scoring system for GMAT analytical writing assessment essays. ETS Research Report Series, 1998(1):i--67, 1998.
[12] C.-C. Chang and C.-J. Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2(3):27, 2011.
[13] H. Daumé III and D. Marcu. Domain adaptation for statistical classifiers. Journal of Artificial Intelligence Research, pages 101--126, 2006.
[14] E. L. Glassman, J. Scott, R. Singh, P. J. Guo, and R. C. Miller. OverCode: Visualizing variation in student solutions to programming problems at scale. ACM Transactions on Computer-Human Interaction (TOCHI), 22(2):7, 2015.
[15] J. Huang, C. Piech, A. Nguyen, and L. Guibas. Syntactic and functional variability of a million code submissions in a machine learning MOOC. In AIED 2013 Workshops Proceedings, page 25. Citeseer, 2013.
[16] A. S. Lan, D. Vats, A. E. Waters, and R. G. Baraniuk. Mathematical language processing: Automatic grading and feedback for open response mathematical questions. In Proceedings of the Second (2015) ACM Conference on Learning @ Scale, pages 167--176. ACM, 2015.
[17] N. Meinshausen and P. Bühlmann. Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4):417--473, 2010.
[18] L. Pappano. The Year of the MOOC. The New York Times (accessed 2016-02-02).
[19] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, et al. Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research, 12:2825--2830, 2011.
[20] K. Rivers and K. R. Koedinger. Automatic generation of programming feedback: A data-driven approach. In The First Workshop on AI-supported Education for Computer Science (AIEDCS 2013), page 50, 2013.
[21] V. Shashidhar, N. Pandey, and V. Aggarwal. Automatic spontaneous speech grading: A novel feature derivation technique using the crowd. In Proceedings of the Conference of the Association for Computational Linguistics. ACL, 2015.
[22] V. Shashidhar, N. Pandey, and V. Aggarwal. Spoken English grading: Machine learning with crowd intelligence. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 2089--2097. ACM, 2015.
[23] R. Singh, S. Gulwani, and A. Solar-Lezama. Automated feedback generation for introductory programming assignments. In ACM SIGPLAN Notices, volume 48, pages 15--26. ACM, 2013.
[24] V. Southavilay, K. Yacef, P. Reimann, and R. A. Calvo. Analysis of collaborative writing processes using revision maps and probabilistic topic models. In Proceedings of the Third International Conference on Learning Analytics and Knowledge, pages 38--47. ACM, 2013.
[25] S. Srikant and V. Aggarwal. A system to grade computer programming skills using machine learning. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1887--1896. ACM, 2014.
[26] S. Thrun. Is learning the n-th thing any easier than learning the first? In Advances in Neural Information Processing Systems, pages 640--646, 1996.
[27] C. van der Vleuten, G. Norman, and E. de Graaff. Pitfalls in the pursuit of objectivity: issues of reliability. Medical Education, 25(2):110--118, 1991.


Published In

KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2016, 2176 pages
ISBN: 9781450342322
DOI: 10.1145/2939672

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. MOOC
  2. automatic grading
  3. feature engineering
  4. one-class learning
  5. question independent learning
  6. recruitment
  7. supervised learning

Qualifiers

  • Research-article

Conference

KDD '16

Acceptance Rates

KDD '16 paper acceptance rate: 66 of 1,115 submissions (6%)
Overall acceptance rate: 1,133 of 8,635 submissions (13%)



Cited By

  • (2024) Automated Grading and Feedback Tools for Programming Education: A Systematic Review. ACM Transactions on Computing Education, 24(1):1--43. DOI: 10.1145/3636515
  • (2024) Feedback Generation for Automatic Programming Assessment Utilizing AI Techniques: An Initial Analysis of Systematic Mapping Studies. Advances in Intelligent Computing Techniques and Applications, pages 257--272. DOI: 10.1007/978-3-031-59711-4_23
  • (2023) Proactive and reactive engagement of artificial intelligence methods for education: a review. Frontiers in Artificial Intelligence, 6. DOI: 10.3389/frai.2023.1151391
  • (2023) Machine Learning-Based Automated Grading and Feedback Tools for Programming: A Meta-Analysis. Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 1, pages 491--497. DOI: 10.1145/3587102.3588822
  • (2023) VizProg: Identifying Misunderstandings By Visualizing Students' Coding Progress. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pages 1--16. DOI: 10.1145/3544548.3581516
  • (2023) Autograding of Programming Skills. 2023 IEEE 8th International Conference for Convergence in Technology (I2CT), pages 1--6. DOI: 10.1109/I2CT57861.2023.10126211
  • (2023) Towards Deep Learning Models for Automatic Computer Program Grading. 2023 IEEE 10th International Conference on Data Science and Advanced Analytics (DSAA), pages 1--10. DOI: 10.1109/DSAA60987.2023.10302571
  • (2022) An automatic grading system for a high school-level computational thinking course. Proceedings of the 4th International Workshop on Software Engineering Education for the Next Generation, pages 20--27. DOI: 10.1145/3528231.3528357
  • (2022) Algorithm identification in programming assignments. Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension, pages 471--481. DOI: 10.1145/3524610.3527914
  • (2022) Automatic Grading of Student Code with Similarity Measurement. Machine Learning and Knowledge Discovery in Databases, pages 286--301. DOI: 10.1007/978-3-031-26422-1_18
