DOI: 10.1145/3027385.3027399

Predicting math performance using natural language processing tools

Published: 13 March 2017

Abstract

A number of studies have demonstrated links between linguistic knowledge and performance in math. Studies examining these links in first language speakers of English have traditionally relied on correlational analyses between linguistic knowledge tests and standardized math tests. For second language (L2) speakers, the majority of studies have compared math performance between proficient and non-proficient speakers of English. In this study, we take a novel approach and examine the linguistic features of students' language while they are engaged in collaborative problem solving within an on-line math tutoring system. We transcribe the students' speech and use natural language processing tools to extract linguistic information related to text cohesion, lexical sophistication, and sentiment. Our criterion variables are individuals' pretest and posttest math performance scores. In addition to examining relations between linguistic features of student language production and math scores, we also control for a number of non-linguistic factors including gender, age, grade, school, and content focus (procedural versus conceptual). Linear mixed-effects modeling indicates that non-linguistic factors are not predictive of math scores. However, linguistic features related to cohesion, affect, and lexical proficiency explained approximately 30% of the variance (R² = .303) in the math scores.

Published In

LAK '17: Proceedings of the Seventh International Learning Analytics & Knowledge Conference
March 2017
631 pages
ISBN:9781450348706
DOI:10.1145/3027385
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. educational data mining
  2. natural language processing
  3. on-line tutoring systems
  4. predictive analytics
  5. sentiment analysis

Qualifiers

  • Research-article

Conference

LAK '17: 7th International Learning Analytics and Knowledge Conference
March 13-17, 2017
Vancouver, British Columbia, Canada

Acceptance Rates

LAK '17 Paper Acceptance Rate 36 of 114 submissions, 32%;
Overall Acceptance Rate 236 of 782 submissions, 30%

Article Metrics

  • Downloads (Last 12 months): 211
  • Downloads (Last 6 weeks): 25
Reflects downloads up to 28 Jan 2025

Cited By

  • (2024) Lexical ambiguities in statistics declared by in training and in-service teachers. Eurasia Journal of Mathematics, Science and Technology Education, 20:4 (em2422). DOI: 10.29333/ejmste/14359. Online publication date: 2024
  • (2024) Investigating the relationship between math literacy and linguistic synchrony in online mathematical discussions through large-scale data analytics. British Journal of Educational Technology, 55:5 (2226-2256). DOI: 10.1111/bjet.13444. Online publication date: 27-Feb-2024
  • (2023) Beyond Words and Numbers. Childhood Developmental Language Disorders (310-324). DOI: 10.4018/979-8-3693-1982-6.ch019. Online publication date: 24-Nov-2023
  • (2023) Are We on the Same Page? Modeling Linguistic Synchrony and Math Literacy in Mathematical Discussions. LAK23: 13th International Learning Analytics and Knowledge Conference (599-605). DOI: 10.1145/3576050.3576082. Online publication date: 13-Mar-2023
  • (2022) Do Speech-Based Collaboration Analytics Generalize Across Task Contexts? LAK22: 12th International Learning Analytics and Knowledge Conference (208-218). DOI: 10.1145/3506860.3506894. Online publication date: 21-Mar-2022
  • (2022) Math Discourse Linguistic Components (Cohesive Cues within a Math Discussion Board Discourse). Proceedings of the Ninth ACM Conference on Learning @ Scale (389-394). DOI: 10.1145/3491140.3528320. Online publication date: 1-Jun-2022
  • (2022) Research on Behavior Analysis of Real-Time Online Teaching for College Students Based on Head Gesture Recognition. IEEE Access, 10 (81476-81491). DOI: 10.1109/ACCESS.2022.3192349. Online publication date: 2022
  • (2021) Learning behaviours data in programming education: Community analysis and outcome prediction with cleaned data. Future Generation Computer Systems. DOI: 10.1016/j.future.2021.08.026. Online publication date: Oct-2021
  • (2021) Expertise Detection in Crowdsourcing Forums Using the Composition of Latent Topics and Joint Syntactic–Semantic Cues. SN Computer Science, 2:6. DOI: 10.1007/s42979-021-00832-0. Online publication date: 3-Sep-2021
  • (2021) Unpacking Contributions of Morphosyntactic Awareness and Vocabulary to Science Reading Comprehension among Linguistically Diverse Students. TESOL Quarterly, 55:3 (931-965). DOI: 10.1002/tesq.3039. Online publication date: 27-May-2021
