Recognizing entailment in intelligent tutoring systems*

Authors:

Rodney D. Nielsen,

James H. Martin

Natural Language Engineering, Volume 15, Issue 4

Pages 479 - 501

https://doi.org/10.1017/S135132490999012X

Published: 01 October 2009

Abstract

This paper describes a new method for recognizing whether a student's response to an automated tutor's question entails that the student understands the concepts being taught. We demonstrate the need for a finer-grained analysis of answers than current tutoring systems or entailment databases support, and we describe a new representation for reference answers that addresses this need by breaking them into detailed facets and annotating each facet's entailment relationship to the student's answer more precisely. Human annotation at this detailed level still yields substantial interannotator agreement (86.2%), with a kappa statistic of 0.728. We also present our current efforts to assess student answers automatically, which involve training machine learning classifiers on features extracted from dependency parses of the reference answer and the student's response and on features derived from domain-independent lexical statistics. Our system's performance, as high as 75.5% accuracy within domain and 68.8% out of domain, is very encouraging and confirms that the approach is feasible. This work is also a significant step toward domain-independent semantic assessment of answers. No prior work in tutoring or educational assessment has attempted to build such domain-independent systems; virtually all earlier systems have required hundreds of examples of learner answers for each new question, either to train aspects of the system or to hand-craft information extraction templates.
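To make the agreement figures in the abstract concrete, the following is a minimal sketch of how a kappa statistic relates raw agreement to chance agreement. It assumes the reported statistic is Cohen's kappa for two annotators; the function names, the two-label scheme, and the toy annotations are hypothetical and are not taken from the paper's data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: probability both pick the same label independently,
    # estimated from each annotator's own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


def implied_chance_agreement(p_o, kappa):
    """Solve kappa = (p_o - p_e) / (1 - p_e) for the chance agreement p_e."""
    return (p_o - kappa) / (1.0 - kappa)


# Hypothetical toy annotations (not from the paper): two annotators, eight facets.
annotator_1 = ["understood", "understood", "understood", "not_understood",
               "not_understood", "not_understood", "understood", "not_understood"]
annotator_2 = ["understood", "understood", "not_understood", "not_understood",
               "not_understood", "understood", "understood", "not_understood"]

toy_kappa = cohens_kappa(annotator_1, annotator_2)

# Chance agreement implied by the abstract's figures, under the Cohen's kappa assumption.
implied_p_e = implied_chance_agreement(0.862, 0.728)
```

Under that assumption, the abstract's 86.2% agreement and kappa of 0.728 together imply a chance agreement of roughly 0.49, which is why the kappa value is noticeably lower than the raw agreement rate.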

Index Terms

Recognizing entailment in intelligent tutoring systems*
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
2. Hardware
  1. Power and energy
    1. Power estimation and optimization
      1. Platform power issues

Index terms have been assigned to the content through auto-classification.

Published In

Natural Language Engineering Volume 15, Issue 4

October 2009

130 pages

ISSN:1351-3249

Publisher

Cambridge University Press

United States
