DOI: 10.1145/2939672.2939696
Research article, KDD Conference Proceedings

Question Independent Grading using Machine Learning: The Case of Computer Program Grading

Published: 13 August 2016

Abstract

Learning supervised models to grade open-ended responses is an expensive process. A model has to be trained separately for every prompt/question, which in turn requires expert-graded samples. In automatic programming evaluation specifically, the focus of this work, this issue is amplified: models have to be trained not only for every question but also for every language the question is offered in. Moreover, the availability of experts, and the time they take to create a labeled set of programs for each question, is a major bottleneck in scaling such a system. We address this issue by presenting a method to grade computer programs which requires no manually assigned labeled samples for grading responses to a new, unseen question. We extend our previous work [25], wherein we introduced a grammar of features to learn question-specific models. In this work, we propose a method to transform those features into a set of features that maintain their structural relation with the labels across questions. Using these features, we learn one supervised model across questions for a given language, which can then be applied to an ungraded response to an unseen question. We show that our method rivals the performance of both question-specific models and the consensus among human experts, while substantially outperforming extant ways of evaluating code. We demonstrate the system's value by deploying it to grade programs in a high-stakes assessment. The learning from this work is transferable to other grading tasks such as math question grading, and it also provides a new variation on the supervised learning approach.
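
To make the idea concrete, the following is a minimal, hypothetical sketch of such a pipeline, written against scikit-learn [19]. It is not the authors' implementation: the particular normalization scheme (rescaling a response's question-specific features against their distribution over a pool of correct solutions to the same question), the feature dimensionality, and all data below are assumptions made purely for illustration.

    import numpy as np
    from sklearn.linear_model import Ridge

    def question_independent_features(responses, correct_pool):
        # Hypothetical transformation: rescale question-specific feature
        # values by their mean/spread over correct solutions to the same
        # question, so the transformed features bear a comparable relation
        # to the grade regardless of which question produced them.
        mu = correct_pool.mean(axis=0)
        sigma = correct_pool.std(axis=0) + 1e-9   # guard against zero spread
        return (responses - mu) / sigma

    # Train ONE grading model across many questions of a single language.
    rng = np.random.default_rng(0)
    X_parts, y_parts = [], []
    for _ in range(5):                            # 5 synthetic training questions
        correct_pool = rng.normal(size=(30, 8))   # features of correct solutions
        responses = rng.normal(size=(50, 8))      # features of graded responses
        grades = rng.integers(1, 6, size=50)      # expert grades on a 1-5 scale
        X_parts.append(question_independent_features(responses, correct_pool))
        y_parts.append(grades)
    model = Ridge().fit(np.vstack(X_parts), np.concatenate(y_parts))

    # An unseen question needs only a pool of correct solutions (e.g.
    # responses passing all test cases), not freshly graded samples.
    new_pool = rng.normal(size=(30, 8))
    new_response = rng.normal(size=(1, 8))
    print(model.predict(question_independent_features(new_response, new_pool)))

Under this reading, labeling effort is incurred once, on the pooled training questions; grading a response to a new question requires only computing its features and the correct-solution statistics, consistent with the abstract's claim of needing no manually graded samples for unseen questions.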

Supplementary Material

MP4 File (kdd2016_singh_program_grading_01-acm.mp4)

References

[1] Automata. Aspiring Minds. http://www.aspiringminds.com/technology/automata.
[2] E-rater. ETS. http://www.ets.org/research/topics/as_nlp/writing_quality/.
[3] IntelliMetric. Vantage Learning. http://www.vantagelearning.com/products/intellimetric/.
[4] SpeechRater. ETS. https://www.ets.org/research/topics/as_nlp/speech/.
[5] Svar. Aspiring Minds. http://www.aspiringminds.com/technology/svar.
[6] V. Aggarwal, S. Srikant, and V. Shashidhar. Principles for using machine learning in the assessment of open response items: Programming assessment as a case study. In NIPS Workshop on Data Driven Education, 2013.
[7] J. Baxter. A Bayesian/information theoretic model of learning to learn via multiple task sampling. Machine Learning, 28(1):7--39, 1997.
[8] J. Bernstein, A. Van Moere, and J. Cheng. Validating automated speaking tests. Language Testing, 2010.
[9] M. Birenbaum and K. K. Tatsuoka. Open-ended versus multiple-choice response formats: it does make a difference for diagnostic purposes. Applied Psychological Measurement, 11(4):385--395, 1987.
[10] H. M. Breland. The direct assessment of writing skill: A measurement review. ETS Research Report Series, 1983(2):i--23, 1983.
[11] J. Burstein, L. Braden-Harder, M. Chodorow, S. Hua, B. Kaplan, K. Kukich, C. Lu, J. Nolan, D. Rock, and S. Wolff. Computer analysis of essay content for automated score prediction: A prototype automated scoring system for GMAT analytical writing assessment essays. ETS Research Report Series, 1998(1):i--67, 1998.
[12] C.-C. Chang and C.-J. Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2(3):27, 2011.
[13] H. Daumé III and D. Marcu. Domain adaptation for statistical classifiers. Journal of Artificial Intelligence Research, pages 101--126, 2006.
[14] E. L. Glassman, J. Scott, R. Singh, P. J. Guo, and R. C. Miller. OverCode: Visualizing variation in student solutions to programming problems at scale. ACM Transactions on Computer-Human Interaction (TOCHI), 22(2):7, 2015.
[15] J. Huang, C. Piech, A. Nguyen, and L. Guibas. Syntactic and functional variability of a million code submissions in a machine learning MOOC. In AIED 2013 Workshops Proceedings, page 25. Citeseer, 2013.
[16] A. S. Lan, D. Vats, A. E. Waters, and R. G. Baraniuk. Mathematical language processing: Automatic grading and feedback for open response mathematical questions. In Proceedings of the Second (2015) ACM Conference on Learning @ Scale, pages 167--176. ACM, 2015.
[17] N. Meinshausen and P. Bühlmann. Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4):417--473, 2010.
[18] L. Pappano. The Year of the MOOC. The New York Times (accessed 2016-02-02).
[19] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, et al. Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research, 12:2825--2830, 2011.
[20] K. Rivers and K. R. Koedinger. Automatic generation of programming feedback: A data-driven approach. In The First Workshop on AI-supported Education for Computer Science (AIEDCS 2013), page 50, 2013.
[21] V. Shashidhar, N. Pandey, and V. Aggarwal. Automatic spontaneous speech grading: A novel feature derivation technique using the crowd. In Proceedings of the Conference of the Association for Computational Linguistics. ACL, 2015.
[22] V. Shashidhar, N. Pandey, and V. Aggarwal. Spoken English grading: Machine learning with crowd intelligence. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 2089--2097. ACM, 2015.
[23] R. Singh, S. Gulwani, and A. Solar-Lezama. Automated feedback generation for introductory programming assignments. In ACM SIGPLAN Notices, volume 48, pages 15--26. ACM, 2013.
[24] V. Southavilay, K. Yacef, P. Reimann, and R. A. Calvo. Analysis of collaborative writing processes using revision maps and probabilistic topic models. In Proceedings of the Third International Conference on Learning Analytics and Knowledge, pages 38--47. ACM, 2013.
[25] S. Srikant and V. Aggarwal. A system to grade computer programming skills using machine learning. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1887--1896. ACM, 2014.
[26] S. Thrun. Is learning the n-th thing any easier than learning the first? In Advances in Neural Information Processing Systems, pages 640--646, 1996.
[27] C. van der Vleuten, G. Norman, and E. de Graaff. Pitfalls in the pursuit of objectivity: issues of reliability. Medical Education, 25(2):110--118, 1991.


Published In

KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2016, 2176 pages
ISBN: 9781450342322
DOI: 10.1145/2939672

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. MOOC
  2. automatic grading
  3. feature engineering
  4. one-class learning
  5. question independent learning
  6. recruitment
  7. supervised learning

Qualifiers

  • Research-article

Conference

KDD '16

Acceptance Rates

KDD '16 paper acceptance rate: 66 of 1,115 submissions (6%)
Overall acceptance rate: 1,133 of 8,635 submissions (13%)



Cited By

  • (2024) Automated Grading and Feedback Tools for Programming Education: A Systematic Review. ACM Transactions on Computing Education, 24(1):1--43. DOI: 10.1145/3636515
  • (2024) Feedback Generation for Automatic Programming Assessment Utilizing AI Techniques: An Initial Analysis of Systematic Mapping Studies. Advances in Intelligent Computing Techniques and Applications, pages 257--272. DOI: 10.1007/978-3-031-59711-4_23
  • (2023) Proactive and reactive engagement of artificial intelligence methods for education: a review. Frontiers in Artificial Intelligence, 6. DOI: 10.3389/frai.2023.1151391
  • (2023) Machine Learning-Based Automated Grading and Feedback Tools for Programming: A Meta-Analysis. Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 1, pages 491--497. DOI: 10.1145/3587102.3588822
  • (2023) VizProg: Identifying Misunderstandings By Visualizing Students' Coding Progress. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pages 1--16. DOI: 10.1145/3544548.3581516
  • (2023) Autograding of Programming Skills. 2023 IEEE 8th International Conference for Convergence in Technology (I2CT), pages 1--6. DOI: 10.1109/I2CT57861.2023.10126211
  • (2023) Towards Deep Learning Models for Automatic Computer Program Grading. 2023 IEEE 10th International Conference on Data Science and Advanced Analytics (DSAA), pages 1--10. DOI: 10.1109/DSAA60987.2023.10302571
  • (2022) An automatic grading system for a high school-level computational thinking course. Proceedings of the 4th International Workshop on Software Engineering Education for the Next Generation, pages 20--27. DOI: 10.1145/3528231.3528357
  • (2022) Algorithm identification in programming assignments. Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension, pages 471--481. DOI: 10.1145/3524610.3527914
  • (2022) Automatic Grading of Student Code with Similarity Measurement. Machine Learning and Knowledge Discovery in Databases, pages 286--301. DOI: 10.1007/978-3-031-26422-1_18
