Text-to-text semantic similarity for automatic short answer grading

Authors: Michael Mohler, Rada Mihalcea

EACL '09: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics

Pages 567 - 575

Published: 30 March 2009

Abstract

In this paper, we explore unsupervised techniques for the task of automatic short answer grading. We compare a number of knowledge-based and corpus-based measures of text similarity, evaluate the effect of domain and size on the corpus-based measures, and also introduce a novel technique to improve the performance of the system by integrating automatic feedback from the student answers. Overall, our system significantly and consistently outperforms other unsupervised methods for short answer grading that have been proposed in the past.
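As a rough illustration of the general idea of similarity-based grading, the sketch below scores each student answer by its cosine similarity to a reference answer. It is not the paper's system: plain word-count vectors stand in for the knowledge-based and corpus-based measures the abstract mentions, and the optional feedback step (folding the top-ranked student answers back into the reference vector) is only an assumed, simplified reading of "integrating automatic feedback from the student answers". The names `tokenize`, `cosine`, `grade`, and `feedback_top_k` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty answer shares nothing with the reference
    return dot / (norm_a * norm_b)


def grade(reference, answers, feedback_top_k=0):
    """Score each student answer by similarity to the reference answer.

    Assumption (not from the paper): when feedback_top_k > 0, the k answers
    most similar to the reference are added to the reference vector and
    every answer is scored again against the expanded vector.
    """
    ref_vec = Counter(tokenize(reference))
    ans_vecs = [Counter(tokenize(a)) for a in answers]
    scores = [cosine(ref_vec, v) for v in ans_vecs]
    if feedback_top_k > 0:
        ranked = sorted(range(len(answers)), key=lambda i: scores[i], reverse=True)
        expanded = Counter(ref_vec)  # copy so the original vector is untouched
        for i in ranked[:feedback_top_k]:
            expanded.update(ans_vecs[i])
        scores = [cosine(expanded, v) for v in ans_vecs]
    return scores


if __name__ == "__main__":
    reference = "A stack is a last in first out data structure"
    answers = [
        "A stack is last in first out",
        "A queue serves elements in arrival order",
    ]
    for answer, score in zip(answers, grade(reference, answers, feedback_top_k=1)):
        print(f"{score:.3f}  {answer}")
```

Scores fall between 0 and 1 and would still need to be mapped onto a grading scale; a real system would swap the word-count vectors for a semantic similarity measure so that paraphrases with little word overlap are not penalized.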

Index Terms

Computing methodologies

Published In

EACL '09: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics

March 2009, 905 pages

General Chair: Alex Lascarides, University of Edinburgh (UK)

Program Chairs: Claire Gardent, CNRS/LORIA Nancy (France); Joakim Nivre, Uppsala University and Växjö University (Sweden)

Publisher

Association for Computational Linguistics, United States

Qualifiers

Research-article

Acceptance Rates

EACL '09 paper acceptance rate: 100 of 360 submissions (28%)

Overall acceptance rate: 100 of 360 submissions (28%)
