Simulated Learners in Educational Technology: A Systematic Literature Review and a Turing-like Test

  • Article
International Journal of Artificial Intelligence in Education

Abstract

Simulation is a powerful approach that plays a significant role in science and technology. Computational models that simulate learner interactions and data hold great promise for educational technology as well. Among other uses, simulated learners can be used for teacher training, for generating and evaluating hypotheses on human learning, for developing adaptive learning algorithms, for building virtual worlds in which students can practice collaboration skills with simulated pals, and for testing learning environments. This paper provides the first systematic literature review on simulated learners in the broad area of artificial intelligence in education and related fields, focusing on the decade 2010–2019. We analyze the trends regarding the use of simulated learners in educational technology within this decade, the purposes for which simulated learners are being used, and how the validity of the simulated learners is assessed. We find that simulated learner models tend to represent only narrow aspects of student learning. Surprisingly, we also find that almost half of the studies using simulated learners do not provide any evidence that their modeling addresses the most fundamental question in simulation design: is the model valid? This poses a threat to the reliability of results that are based on these models. Based on our findings, we propose that future research should focus on developing more complete simulated learner models. To validate these models, we suggest a standard and universal criterion, which is based on the lasting idea of Turing’s Test. We discuss the properties of this test and its potential to move the field of simulated learners forward.


References

  • Abdi, S., Khosravi, H., Sadiq, S. W., & Gasevic, D. (2019). A multivariate elo-based learner model for adaptive educational systems. In M.C. Desmarais, C. F.Lynch, A. Merceron, & R. Nkambou,(Eds.), Proceedings of the 12th International Conference on Educational Data Mining, EDM 2019, Montréal, Canada, July 2-5, 2019. International Educational Data Mining Society (IEDMS)

  • Alexandron, G., Yoo, L. Y., Ruipérez-Valiente, J. A., Lee, S., & Pritchard, D. E. (2019). Are mooc learning analytics results trustworthy? with fake learners, they might not be! International Journal of Artificial Intelligence in Education, 29, 484–506.

    Article  Google Scholar 

  • Arikan, Ç. A. (2018). The effect of mini and midi anchor tests on test equating. International Journal of Progressive Education, 14(2), 148–160.

    Article  Google Scholar 

  • Aşiret, S. & Sünbül, S.Ö. (2016). Investigating test equating methods in small samples through various factors. Educational Sciences: Theory & Practice, 16(2)

  • Badiee, F., & Kaufman, D. (2015). Design evaluation of a simulation for teacher education. Sage Open, 5(2), 2158244015592454.

    Article  Google Scholar 

  • Bartocci, E., & Lió, P. (2016). Computational modeling, formal analysis, and tools for systems biology. PLOS Computational Biology, 12(1), 1–22.

    Article  Google Scholar 

  • Bazaldua, D. A. L., Lee, Y.-S., Keller, B., & Fellers, L. (2017). Assessing the performance of classical test theory item discrimination estimators in monte carlo simulations. Asia Pacific Education Review, 18(4), 585–598.

    Article  Google Scholar 

  • Beck, J.E. (2002). Directing development effort with simulated students. In Proceedings of Intelligent Tutoring Systems, pp 851–860.

  • Bellomo, N., & Dogbe, C. (2011). On the modeling of traffic and crowds: A survey of models, speculations, and perspectives. SIAM Review, 53(3), 409–463.

    Article  MathSciNet  Google Scholar 

  • Bengs, D. & Brefeld, U. (2014). Computer-based adaptive speed tests. In J. C. Stamper, Z. A. Pardos, M. Mavrikis, & B. M. McLaren (Eds.), Proceedings of the 7th International Conference on Educational Data Mining, EDM 2014, London, UK, July 4-7, 2014, pp. 221–224 International Educational Data Mining Society (IEDMS).

  • Bergner, Y., Dröschler, S., Kortemeyer, G., Rayyan, S., Seaton, D. T., & Pritchard, D. E. (2012). Model-based collaborative filtering analysis of student response data: Machine-learning item response theory. In K. Yacef, O. R. Zaïane, A. Hershkovitz, M. Yudelson, & J. C. Stamper (Eds.) , Proceedings of the 5th International Conference on Educational Data Mining, Chania, Greece, June 19-21, 2012, pp. 95–102. www.educationaldatamining.org

  • Boel, R., & Mihaylova, L. (2006). A compositional stochastic model for real time freeway traffic simulation. Transportation Research Part B: Methodological, 40(4), 319–334.

    Article  Google Scholar 

  • Borjigin, A., Miao, C., Lim, S. F., Li, S., & Shen, Z. (2015). Teachable agents with intrinsic motivation. In C. Conati, N. Heffernan, A. Mitrovic, & M. F. Verdejo (Eds.), Artificial Intelligence in Education (pp. 34–43). Cham. Springer International Publishing.

    Chapter  Google Scholar 

  • Botelho, A. F., Adjei, S., & Heffernan, N. T. (2016). Modeling interactions across skills: A method to construct and compare models predicting the existence of skill relationships. In T. Barnes, M. Chi, & M. Feng (Eds.), Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, Raleigh, North Carolina, USA, June 29 - July 2, 2016, pp. 292–297. International Educational Data Mining Society (IEDMS)

  • Briggs, D. C., & Circi, R. (2017). Challenges to the use of artificial neural networks for diagnostic classifications with student test data. International Journal of Testing, 17(4), 302–321.

    Article  Google Scholar 

  • Bringula, R. P., Basa, R. S., Cruz, C. D., & Rodrigo, M. M. T. (2016). Effects of prior knowledge in mathematics on learner-interface interactions in a learning-by-teaching intelligent tutoring system. Journal of Educational Computing Research, 54(4), 462–482.

    Article  Google Scholar 

  • Brodland, G. W. (2015). How computational models can help unlock biological systems. Seminars in Cell & Developmental Biology, Coding and non-coding RNAs & Mammalian development, 47–48, 62–73.

    Google Scholar 

  • Brown, J. & Eskenazi, M. (2006). Using simulated students for the assessment of authentic document retrieval. In M. Ikeda, K. D. Ashley, & T.-W. Chan (Eds.), Intelligent Tutoring Systems, pp. 685–688

  • Burer, S., & Piccialli, V. (2019). Three methods for robust grading. European Journal of Operational Research, 272(1), 364–371.

    Article  MathSciNet  Google Scholar 

  • Calderón, A., Boubeta-Puig, J., & Ruiz, M. (2018). Medit4cep-gam: A model-driven approach for user-friendly gamification design, monitoring and code generation in cep-based systems. Information and Software Technology, 95, 238–264.

    Article  Google Scholar 

  • Carlson, R., Keiser, V., Matsuda, N., Koedinger, K. R., & Penstein Rosé, C. (2012). Building a conversational simstudent. In S. A. Cerri, W. J. Clancey, G. Papadourakis, & K. Panourgia (Eds.), Intelligent Tutoring Systems (pp. 563–569). Heidelberg, Springer: Berlin.

    Chapter  Google Scholar 

  • Cascante, M., Boros, L. G., Comin-Anduix, B., de Atauri, P., Centelles, J. J., & Lee, P.W.-N. (2002). Metabolic control analysis in drug discovery and disease. Nature Biotechnology, 20(3), 243–249.

    Article  Google Scholar 

  • Castellano, K. E., & Ho, A. D. (2013). Contrasting ols and quantile regression approaches to student “growth’’ percentiles. Journal of Educational and Behavioral Statistics, 38(2), 190–215.

    Article  Google Scholar 

  • Castellano, K. E., & Ho, A. D. (2015). Practical differences among aggregate-level conditional status metrics: From median student growth percentiles to value-added models. Journal of Educational and Behavioral Statistics, 40(1), 35–68.

    Article  Google Scholar 

  • Chambers, S. (2016). Regression discontinuity design: a guide for strengthening causal inference in hrd. European Journal of Training and Development

  • Champaign, J. & Cohen, R. (2010). A multiagent, ecological approach to content sequencing. In Proceedings of AAMAS, pp. 10–4

  • Chaplot, D. S., MacLellan, C., Salakhutdinov, R., & Koedinger, K. (2018). Learning cognitive models using neural networks. In C. Penstein Rosé, R. Martínez-Maldonado, H. U. Hoppe, R. Luckin, M. Mavrikis, K. Porayska-Pomsta, B. McLaren, & B. du Boulay (Eds.), Artificial Intelligence in Education, pp. 43–56, Cham Springer International Publishing.

  • Chen, Y., González-Brenes, J. P., & Tian, J. (2016). Joint discovery of skill prerequisite graphs and student models. In T. Barnes, M. Chi, & M. Feng (Eds.), Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, Raleigh, North Carolina, USA, June 29 - July 2, 2016, pp. 46–53 International Educational Data Mining Society (IEDMS)

  • Chen, Y., Wuillemin, P., & Labat, J. (2015). Discovering prerequisite structure of skills through probabilistic association rules mining. In O. C. Santos, J. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, P. Mitros, J. M. Luna, M. C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. C. Desmarais (Eds.), Proceedings of the 8th International Conference on Educational Data Mining, EDM 2015, Madrid, Spain, June 26-29, 2015, pp. 117–124. International Educational Data Mining Society (IEDMS)

  • Clement, B., Oudeyer, P., & Lopes, M. (2016). A comparison of automatic teaching strategies for heterogeneous student populations. In T. Barnes, M. Chi, & M. Feng (Eds.), Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, Raleigh, North Carolina, USA, June 29 - July 2, 2016, pp. 330–335. International Educational Data Mining Society (IEDMS)

  • Clement, B., Roy, D., Oudeyer, P., & Lopes, M. (2015). Multi-armed bandits for intelligent tutoring systems. In O. C. Santos, J. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, P. Mitros, J. M. Luna, M. C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. C. Desmarais (Eds.), Proceedings of the 8th International Conference on Educational Data Mining, EDM 2015, Madrid, Spain, June 26-29, 2015, pp. 21. International Educational Data Mining Society (IEDMS)

  • Conati, C., Fratamico, L., Kardan, S., and Roll, I. (2015). Comparing representations for learner models in interactive simulations. In International Conference on Artificial Intelligence in Education, pp. 74–83. Springer

  • Cramman, H., Gott, S., Little, J., Merrell, C., Tymms, P., & Copping, L. T. (2020). Number identification: a unique developmental pathway in mathematics? Research Papers in Education, 35(2), 117–143.

    Article  Google Scholar 

  • Crowston, K., Østerlund, C., Lee, T. K., Jackson, C., Harandi, M., Allen, S., Bahaadini, S., Coughlin, S., Katsaggelos, A. K., Larson, S. L., et al. (2019). Knowledge tracing to model learning in online citizen science projects. IEEE Transactions on Learning Technologies, 13(1), 123–134.

    Article  Google Scholar 

  • Cui, Y., Chu, M.-W., & Chen, F. (2019). Analyzing student process data in game-based assessments with bayesian knowledge tracing and dynamic bayesian networks. Journal of Educational Data Mining, 11(1), 80–100.

    Google Scholar 

  • Cui, Y., Gierl, M. J., & Chang, H.-H. (2012). Estimating classification consistency and accuracy for cognitive diagnostic assessment. Journal of Educational Measurement, 49(1), 19–38.

    Article  Google Scholar 

  • De La Torre, J. (2011). The generalized dina model framework. Psychometrika, 76(2), 179–199.

    Article  MathSciNet  Google Scholar 

  • Deale, D., & Pastore, R. (2014). Evaluation of simschool: An instructional simulation for pre-service teachers. Computers in the Schools, 31(3), 197–219.

    Article  Google Scholar 

  • Debeer, D., Janssen, R., & De Boeck, P. (2017). Modeling skipped and not-reached items using irtrees. Journal of Educational Measurement, 54(3), 333–363.

    Article  Google Scholar 

  • DeMars, C. E. (2020). Multilevel rasch modeling: Does misfit to the rasch model impact the regression model? The Journal of Experimental Education, 88(4), 605–619.

    Article  Google Scholar 

  • Desmarais, M. C. (2011). Conditions for effectively deriving a q-matrix from data with non-negative matrix factorization. best paper award. In M. Pechenizkiy, T. Calders, C. Conati, S. Ventura, C. Romero, & J. C. Stamper (Eds.), Proceedings of the 4th International Conference on Educational Data Mining, Eindhoven, The Netherlands, July 6-8, 2011, pp. 41–50 www.educationaldatamining.org

  • Desmarais, M. C. & Pelczer, I. (2010). On the faithfulness of simulated student performance data. In R. S. J. de Baker, A. Merceron, & P. I. P. Jr.(Eds.) Educational Data Mining 2010, The 3rd International Conference on Educational Data Mining, Pittsburgh, PA, USA, June 11-13, 2010. Proceedings, pp. 21–30. www.educationaldatamining.org.

  • Dickison, D., Ritter, S., Nixon, T., Harris, T. K., Towle, B., Murray, R. C., & Hausmann, R. G. M. (2010b). Predicting the effects of skill model changes on student progress. In Proceedings of Intelligent Tutoring Systems, pp. 300–302

  • Dimitrov, D. M. (2020). Modeling of item response functions under the d-scoring method. Educational and Psychological Measurement, 80(1), 126–144.

    Article  Google Scholar 

  • Ding, X. & Larson, E. C. (2019). Why deep knowledge tracing has less depth than anticipated. In M. C. Desmarais, C. F. Lynch, A. Merceron, & R. Nkambou (Eds.), Proceedings of the 12th International Conference on Educational Data Mining, EDM 2019, Montréal, Canada, July 2-5, 2019. International Educational Data Mining Society (IEDMS)

  • Dorça, F. A. (2015). Implementation and use of Simulated Students for Test and Validation of new Adaptive Educational Systems: a Practical Insight. International Journal of Artificial Intelligence in Education (IJAIED), 25319–345

  • Durán, E. B., & Amandi, A. (2011). Personalised collaborative skills for student models. Interactive Learning Environments, 19(2), 143–162.

    Article  Google Scholar 

  • Ebert, R. (2011). Remaking my voice. Ted Talk. [Accessed: 2020 06 01]

  • Erickson, G., Frost, S., Bateman, S., & McCalla, G. (2013). Using the ecological approach to create simulations of learning environments. In Artificial Intelligence in Education, pp. 411–420

  • Fancsali, S., Nixon, T., & Ritter, S. (2013a). Optimal and Worst-Case Performance of Mastery Learning Assessment with Bayesian Knowledge Tracing. In Proceedings of EDM

  • Fancsali, S. E., Nixon, T., Vuong, A., & Ritter, S. (2013b). Simulated Students, Mastery Learning, and Improved Learning Curves for Real-World Cognitive Tutors. In Proceedings of AIED Workshops

  • Faucon, L., Kidzinski, L., & Dillenbourg, P. (2016). Semi-Markov model for simulating MOOC students. In Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, pp. 358–363

  • Feigenbaum, E. A. (2003). Some challenges and grand challenges for computational intelligence. J. ACM, 50(1), 32–40.

    Article  MathSciNet  Google Scholar 

  • Feuerstahler, L., & Wilson, M. (2019). Scale alignment in between-item multidimensional rasch models. Journal of Educational Measurement, 56(2), 280–301.

    Article  Google Scholar 

  • Fitzpatrick, J., & Skorupski, W. P. (2016). Equating with miditests using irt. Journal of Educational Measurement, 53(2), 172–189.

    Article  Google Scholar 

  • Fletcher, J. (2009). Education and training technology in the military. Science, 323(5910), 72–75.

    Article  Google Scholar 

  • Folsom-Kovarik, J. T., Sukthankar, G., & Schatz, S. (2013). Tractable pomdp representations for intelligent tutoring systems. ACM Trans. Intell. Syst. Technol., 4(2)

  • Frost, S. & McCalla, G. (2013). Exploring through Simulation the Effects of Peer Impact on Learning. In Proceedings of AIED Workshops

  • Frost, S. & McCalla, G. (2015). Exploring Through Simulation an Instructional Planner for Dynamic Open-Ended Learning Environments. In C. Conati, N. Heffernan, A. Mitrovic, & M. F. Verdejo (Eds.), Proceedings of AIED, pp. 578–581

  • Gibson, D. (2013). Assessing teaching skills with a mobile simulation. Journal of Digital Learning in Teacher Education, 30(1), 4–10.

    Article  Google Scholar 

  • González-Brenes, J. P. & Huang, Y. (2015b). Using Data from Real and Simulated Learners to Evaluate Adaptive Tutoring Systems. In Proceedings of AIED Workshops

  • González-Brenes, J. P. & Mostow, J. (2012). Dynamic cognitive tracing: Towards unified discovery of student and cognitive models. In K. Yacef, O. R. Zaïane, A. Hershkovitz, M. Yudelson, & J. C. Stamper (Eds.), Proceedings of the 5th International Conference on Educational Data Mining, Chania, Greece, June 19-21, 2012, pp. 49–56. www.educationaldatamining.org

  • Govindarajan, K., Kumar, V. S., Boulanger, D., & Kinshuk (2015). Learning analytics solution for reducing learners’ course failure rate. In 2015 IEEE Seventh International Conference on Technology for Education (T4E), pp. 83–90

  • Gu, J., Cai, H., & Beck, J. E. (2014). Investigate Performance of Expected Maximization on the Knowledge Tracing Model. In Proceedings of ITS, pp. 156–161

  • Guarino, C. M., Reckase, M. D., & Wooldridge, J. M. (2015). Can value-added measures of teacher performance be trusted? Education Finance and Policy, 10(1), 117–156.

    Article  Google Scholar 

  • Guarino, C. M., Stacy, B. W., & Wooldridge, J. M. (2019). Comparing and assessing the consequences of two different approaches to measuring school effectiveness. Educational Assessment, Evaluation and Accountability, 31(4), 437–463.

    Article  Google Scholar 

  • Harel, D. (2005). A turing-like test for biological modeling. Nature biotechnology, 23, 495–6.

    Article  Google Scholar 

  • Heliövaara, S., Korhonen, T., Hostikka, S., & Ehtamo, H. (2012). Counterflow model for agent-based simulation of crowd dynamics. Building and Environment, 48, 89–100.

    Article  Google Scholar 

  • Hernando, M., Guzmán, E., & Conejo, R. (2013). Validating item response theory models in simulated environments. In Proceedings of the AIED Workshop on Simulated Learners, pp. 41–50

  • Hingston, P. (2009). A turing test for computer game bots. IEEE Transactions on Computational Intelligence and AI in Games, 1(3), 169–186.

    Article  Google Scholar 

  • Hintze, J. M., Wells, C. S., Marcotte, A. M., & Solomon, B. G. (2018). Decision-making accuracy of cbm progress-monitoring data. Journal of Psychoeducational Assessment, 36(1), 74–81.

    Article  Google Scholar 

  • III, D. W., Harpstead, E., MacLellan, C. J., Rachatasumrit, N., & Koedinger, K. R. (2019). Toward near zero-parameter prediction using a computational model of student learning. In M. C. Desmarais, C. F. Lynch, A. Merceron, & R. Nkambou (Eds.), Proceedings of the 12th International Conference on Educational Data Mining, EDM 2019, Montréal, Canada, July 2-5, 2019. International Educational Data Mining Society (IEDMS)

  • Iman, S. & Joshi, S. (2007). The e hardware verification language.Springer Science & Business Media

  • Jr., P. I. P. & Wu, S. (2011). dynamical system model of microgenetic changes in performance, efficacy, strategy use and value during vocabulary learning. In M. Pechenizkiy, T. Calders, C. Conati, S. Ventura, C. Romero, & J. C. Stamper (Eds.), Proceedings of the 4th International Conference on Educational Data Mining, Eindhoven, The Netherlands, July 6-8, 2011, pp. 277–282. www.educationaldatamining.org

  • Kaiser, J., Retelsdorf, J., & Südkamp, A., & Möller, J. (2013). Achievement and engagement: How student characteristics influence teacher judgments. Learning and Instruction, 28, 73–84.

  • Kalkan, Ö. K., Kelecioglu, H., & Basokçu, T. O. (2018). Comparison of cognitive diagnosis models under changing conditions: Dina, rdina, hodina and hordina. International Education Studies, 11(6), 119–131.

    Article  Google Scholar 

  • Kallonis, P. & Sampson, D. G. (2011). A 3d virtual classroom simulation for supporting school teachers training based on synectics - "making the strange familiar". In 2011 IEEE 11th International Conference on Advanced Learning Technologies, pp. 4–6

  • Khajah, M., Lindsey, R. V., & Mozer, M. (2016). How deep is knowledge tracing? In T. Barnes, M. Chi, & M. Feng (Eds), Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, Raleigh, North Carolina, USA, June 29 - July 2, 2016. International Educational Data Mining Society (IEDMS)

  • Khodeir, N., Wanas, N., Darwish, N., & Hegazy, N. (2014). Bayesian based adaptive question generation technique. Journal of Electrical Systems and Information Technology, 1(1), 10–16.

    Article  Google Scholar 

  • Kim, S. Y., & Lee, W.-C. (2019). Classification consistency and accuracy for mixed-format tests. Applied Measurement in Education, 32(2), 97–115.

    Article  Google Scholar 

  • Kitano, H. (2002). Computational systems biology. Nature, 420(6912), 206–210.

    Article  Google Scholar 

  • Kitchen, N. & Kuehlmann, A. (2007). Stimulus generation for constrained random simulation. In 2007 IEEE/ACM International Conference on Computer-Aided Design, pp. 258–265

  • Kitchenham, B. & Charters, S. (2007). Guidelines for performing systematic literature reviews in software engineering

  • Klingler, S., Käser, T., Solenthaler, B., & Gross, M. (2015). On the Performance Characteristics of Latent-Factor and Knowledge Tracing Models. In Proceedings of EDM, pp. 37–44

  • Klingler, S., Käser, T., Solenthaler, B., & Gross, M. (2016). Temporally coherent clustering of student data. International Educational Data Mining Society

  • Knezek, G., Hopper, S. B., Christensen, R., Tyler-Wood, T., & Gibson, D. C. (2015). Assessing pedagogical balance in a simulated classroom environment. Journal of Digital Learning in Teacher Education, 31(4), 148–159.

    Article  Google Scholar 

  • Koçak, D. (2020). The effect of chance success on equalization error in test equation based on classical test theory. International Journal of Progressive Education, 16(2)

  • Koedinger, K. R., Matsuda, N., MacLellan, C. J., & McLaughlin, E. A. (2015). Methods for Evaluating Simulated Learners: Examples from SimStudent. In Proceedings of AIED Workshops

  • Kopp, J. P., & Jones, A. T. (2020). Impact of item parameter drift on rasch scale stability in small samples over multiple administrations. Applied Measurement in Education, 33(1), 24–33.

    Article  Google Scholar 

  • KÖSE, İ. A. (2014). Assessing model data fit of unidimensional item response theory models in simulated data. Educational Research and Reviews, 9(17), 642–649.

  • Kurtz, M. D. (2018). Value-added and student growth percentile models: What drives differences in estimated classroom effects? Statistics and Public Policy, 5(1), 1–8.

    Article  Google Scholar 

  • Labutov, I. & Studer, C. (2016). Calibrated self-assessment. In T. Barnes, M. Chi, & M. Feng (Eds.), Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, Raleigh, North Carolina, USA, June 29 - July 2, 2016. International Educational Data Mining Society (IEDMS)

  • LaHuis, D. M., Bryant-Lees, K. B., Hakoyama, S., Barnes, T., & Wiemann, A. (2018). A comparison of procedures for estimating person reliability parameters in the graded response model. Journal of Educational Measurement, 55(3), 421–432.

    Article  Google Scholar 

  • Lateef, F. (2010). Simulation-based learning: Just like the real thing. Journal of emergencies, trauma, and shock, 3, 348–52.

    Article  Google Scholar 

  • Lee, C.-S., Wang, M.-H., & Huang, C.-H. (2015). Performance verification mechanism for adaptive assessment e-platform and e-navigation application. International Journal of e-Navigation and Maritime Economy, 2, 47–62.

    Article  Google Scholar 

  • Lee, G., & Lee, W.-C. (2016). Bi-factor mirt observed-score equating for mixed-format tests. Applied Measurement in Education, 29(3), 224–241.

    Article  Google Scholar 

  • Leelawong, K., & Biswas, G. (2008). Designing learning by teaching agents: The betty’s brain system. Int. J. Artif. Intell. Ed. (IJAIED), 18(3), 181–208.

    Google Scholar 

  • Lelei, D. & McCalla, G. (2018a). The role of simulation in the development of mentoring technology to support longer-term learning. In Proceedings of AIED Workshops

  • Lelei, D. & McCalla, G. (2019). How Many Times Should a Pedagogical Agent Simulation Model Be Run? In Proceedings of AIED, pages 182–193.

  • Lelei, D. E. K. & McCalla, G. (2018b). How to use simulation in the design and evaluation of learning environments with self-directed longer-term learners. In C. Penstein Rosé, R. Martínez-Maldonado, H. U. Hoppe, R. Luckin, M. Mavrikis, K. Porayska-Pomsta, B. McLaren, & B. du Boulay (Eds.), Artificial Intelligence in Education, pp. 253–266, Cham. Springer International Publishing

  • Lelei, D. E. K. & McCalla, G. (2018c). How to Use Simulation in the Design and Evaluation of Learning Environments with Self-directed Longer-Term Learners. In Proceedings of AIED

  • Lenat, D. B. & Durlach, P. J. (2014). Reinforcing math knowledge by immersing students in a simulated learning-by-teaching experience. International Journal of Artificial Intelligence in Education, 43):216–250

  • Levy, R. (2019). Dynamic bayesian network modeling of game-based diagnostic assessments. Multivariate Behavioral Research, 54(6), 771–794.

    Article  Google Scholar 

  • Li, N., Cohen, W. W., & Koedinger, K. R. (2012). Efficient cross-domain learning of complex skills. In S. A. Cerri, W. J. Clancey, G. Papadourakis, & K. Panourgia (Eds.), Intelligent Tutoring Systems (pp. 493–498). Heidelberg Springer: Berlin.

    Chapter  Google Scholar 

  • Li, N., Cohen, W. W., & Koedinger, K. R. (2012). Learning to perceive two-dimensional displays using probabilistic grammars. In P. A. Flach, T. De Bie, & N. Cristianini (Eds.), Machine Learning and Knowledge Discovery in Databases (pp. 773–788). Heidelberg Springer: Berlin.

    Chapter  Google Scholar 

  • Li, N., Cohen, W. W., & Koedinger, K. R. (2013). Problem order implications for learning. International Journal of Artificial Intelligence in Education, 23(1), 71–93.

    Article  Google Scholar 

  • Li, N., Cohen, W. W., Koedinger, K. R., & Matsuda, N. (2011). A machine learning approach for automatic student model discovery. In M. Pechenizkiy, T. Calders, C. Conati, S. Ventura, C. Romero, & J. C. Stamper (Eds.), Proceedings of the 4th International Conference on Educational Data Mining, Eindhoven, The Netherlands, July 6-8, 2011, pp. 31–40. www.educationaldatamining.org.

  • Li, N., Matsuda, N., Cohen, W. W., & Koedinger, K. R. (2015). Integrating representation learning and skill learning in a human-like intelligent agent. Artificial Intelligence, 219, 67–91.

    Article  Google Scholar 

  • Li, N., Oyler, D. W., Zhang, M., Yildiz, Y., Kolmanovsky, I., & Girard, A. R. (2018). Game theoretic modeling of driver and vehicle interactions for verification and validation of autonomous vehicle control systems. IEEE Transactions on Control Systems Technology, 26(5), 1782–1797.

    Article  Google Scholar 

  • Li, N., Tian, Y., Cohen, W. W., & Koedinger, K. R. (2013). Integrating perceptual learning with external world knowledge in a simulated student. In H. C. Lane, K. Yacef, J. Mostow, & P. Pavlik (Eds.), Artificial Intelligence in Education (pp. 400–410). Heidelberg, Springer: Berlin.

    Chapter  Google Scholar 

  • Li, Z. (2014). Power and sample size calculations for logistic regression tests for differential item functioning. Journal of Educational Measurement, 51(4), 441–462.

    Article  Google Scholar 

  • Li, Z., Yee, L., Sauerberg, N., Sakson, I., Williams, J. J., & Rafferty, A. N. (2020). Getting too personal (ized): The importance of feature choice in online adaptive algorithms. In Proceedings of EDM. International Educational Data Mining Society (IEDMS).

  • Lim, E., & Lee, W.-C. (2020). Subscore equating and profile reporting. Applied Measurement in Education, 33(2), 95–112.

    Article  Google Scholar 

  • Liu, Y., Mandel, T., Brunskill, E., & Popovic, Z. (2014). Trading off scientific knowledge and user learning with multi-armed bandits. In J. C. Stamper, Z. A. Pardos, M. Mavrikis, & B. M. McLaren (Eds.), Proceedings of the 7th International Conference on Educational Data Mining, EDM 2014, London, UK, July 4-7, 2014, pp. 161–168. International Educational Data Mining Society (IEDMS).

  • MacLellan, C. J., Harpstead, E., Patel, R., & Koedinger, K. R. (2016a). The Apprentice Learner architecture: Closing the loop between learning theory and educational data. In Proceedings of EDM

  • MacLellan, C. J., Harpstead, E., Patel, R., & Koedinger, K. R. (2016b). The Apprentice Learner architecture: Closing the loop between learning theory and educational data. In Proceedings of EDM

  • MacLellan, C. J., Koedinger, K. R., & Matsuda, N. (2014). Authoring tutors with simstudent: An evaluation of efficiency and model quality. In S. Trausan-Matu, K. E. Boyer, M. Crosby, & K. Panourgia (Eds.), Intelligent Tutoring Systems (pp. 551–60). Cham. Springer International Publishing.

    Chapter  Google Scholar 

  • MacLellan, C. J., Matsuda, N., & Koedinger, K. R. (2013). Toward a reflective SimStudent: Using experience to avoid generalization errors. In Proceedings of AIED workshops, pp. 51

  • Mahon, J., Bryant, B., Brown, B., & Kim, M. (2010). Using second life to enhance classroom management practice in teacher education. Educational Media International, 47(2), 121–134.

    Article  Google Scholar 

  • Marcoulides, K. M. (2018). Careful with those priors: A note on bayesian estimation in two-parameter logistic item response theory models. Measurement: Interdisciplinary Research and Perspectives, 16(2):92–99

  • Martinková, P., Drabinová, A., Liaw, Y.-L., Sanders, E. A., McFarland, J. L., & Price, R. M. (2017). Checking equity: Why differential item functioning analysis should be a routine part of developing conceptual assessments. CBE-Life Sciences Education, 16(2):rm2

  • Matsuda, N., Cohen, W., Sewall, J., Lacerda, G., & Koedinger, K. (2007). Predicting students’ performance with simstudent: Learning cognitive skills from observation. Frontiers in Artificial Intelligence and Applications, 158, 467–476.

    Google Scholar 

  • Matsuda, N., Cohen, W. W., & Koedinger, K. R. (2015). Teaching the teacher: tutoring SimStudent leads to more effective cognitive tutor authoring. International Journal of Artificial Intelligence in Education, 25(1), 1–34.

    Article  Google Scholar 

  • Matsuda, N., Cohen, W. W., Sewall, J., Lacerda, G., & Koedinger, K. R. (2007b). Evaluating a simulated student using real students data for training and testing. In Proceedings of User Modeling, pp. 107–116

  • Matsuda, N., Weng, W., & Wall, N. (2020). The effect of metacognitive scaffolding for learning by teaching a teachable agent. International Journal of Artificial Intelligence in Education, 30(1), 1–37.


  • Matsuda, N., Yarzebinski, E., Keiser, V., Raizada, R., Cohen, W. W., Stylianides, G. J., & Koedinger, K. R. (2013). Cognitive anatomy of tutor learning: Lessons learned with SimStudent. Journal of Educational Psychology, 105(4), 1152–1163.

  • Matsuda, N., Yarzebinski, E., Keiser, V., Raizada, R., Stylianides, G. J., Cohen, W. W., & Koedinger, K. R. (2011b). Learning by teaching SimStudent–An initial classroom baseline study comparing with Cognitive Tutor. In Proceedings of AIED, pp. 213–221

  • Matsuda, N., Yarzebinski, E., Keiser, V., Raizada, R., Stylianides, G. J., & Koedinger, K. R. (2013). Studying the effect of a competitive game show in a learning by teaching environment. International Journal of Artificial Intelligence in Education, 23(1), 1–21.


  • Matusevych, Y., Alishahi, A., & Backus, A. (2016). Modelling verb selection within argument structure constructions. Language, Cognition and Neuroscience, 31(10), 1215–1244.


  • McCalla, G. I., & Champaign, J. (2013). Simulated Learners. IEEE Intelligent Systems, 28, 67–71.


  • McGuigan, M. (2006). Graphics Turing test.

  • McHugh, M. L. (2012). Interrater reliability: the kappa statistic. Biochemia Medica, 22(3), 276–282.

  • McPherson, R., Tyler-Wood, T., Ellison, A. M., & Peak, P. (2011). Using a computerized classroom simulation to prepare pre-service teachers. Journal of Technology and Teacher Education, 19(1), 93–110.


  • Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. Psychological methods, 17(3), 437.


  • Menghini, C., Dehler Zufferey, J., & West, R. (2018). Compiling questions into balanced quizzes about documents. Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM ’18, pp. 1519–1522, New York, NY, USA. Association for Computing Machinery

  • Miciak, J., Taylor, W. P., Stuebing, K. K., Fletcher, J. M., & Vaughn, S. (2016). Designing intervention studies: Selected populations, range restrictions, and statistical power. Journal of research on educational effectiveness, 9(4), 556–569.


  • Monroe, S., & Cai, L. (2015). Examining the reliability of student growth percentiles using multidimensional IRT. Educational Measurement: Issues and Practice, 34(4), 21–30.

  • Morris, S. B., Bass, M., Howard, E., & Neapolitan, R. E. (2020). Stopping rules for computer adaptive testing when item banks have nonuniform information. International Journal of Testing, 20(2), 146–168.


  • Mu, T., Jetten, A., & Brunskill, E. (2020). Towards suggesting actionable interventions for wheel-spinning students. In Proceedings of EDM. International Educational Data Mining Society (IEDMS)

  • Mussack, D., Flemming, R., Schrater, P., & Cardoso-Leite, P. (2019). Towards discovering problem similarity through deep learning: combining problem features and user behavior. In M. C. Desmarais, C. F. Lynch, A. Merceron, & R. Nkambou (Eds.), Proceedings of the 12th International Conference on Educational Data Mining, EDM 2019, Montréal, Canada, July 2-5, 2019. International Educational Data Mining Society (IEDMS).

  • Naveh, Y., Rimon, M., Jaeger, I., Katz, Y., Vinov, M., Marcus, E., & Shurek, G. (2006). Constraint-based random stimuli generation for hardware verification. In Proceedings of the 18th Conference on Innovative Applications of Artificial Intelligence - Volume 2, IAAI’06, pp. 1720–1727. AAAI Press

  • Nazaretsky, T., Hershkovitz, S., & Alexandron, G. (2019b). Kappa learning: A new item-similarity method for clustering educational items from response data. In M. C. Desmarais, C. F. Lynch, A. Merceron, & R. Nkambou (Eds.), Proceedings of the 12th International Conference on Educational Data Mining, EDM 2019, Montréal, Canada, July 2-5, 2019. International Educational Data Mining Society (IEDMS).

  • Nissen, J., Donatello, R., & Van Dusen, B. (2019). Missing data and bias in physics education research: A case for using multiple imputation. Phys. Rev. Phys. Educ. Res., 15, 020106.


  • Ogan, A., Yarzebinski, E., De Roock, R., Dumdumaya, C., Banawan, M., & Rodrigo, M. M. (2017). Proficiency and preference using local language with a teachable agent. In E. André, R. Baker, X. Hu, M. M. T. Rodrigo, & B. du Boulay (Eds.), Artificial Intelligence in Education (pp. 548–552). Cham. Springer International Publishing.


  • Olivera-Aguilar, M., & Millsap, R. E. (2013). Statistical power for a simultaneous test of factorial and predictive invariance. Multivariate Behavioral Research, 48(1), 96–116.


  • Ozturk, A. O. (2012). A computer-assisted instruction in teaching abstract statistics to public affairs undergraduates. Journal of Political Science Education, 8(3), 251–257.


  • Page, R. L. (2000). Brief history of flight simulation. SimTecT 2000 Proceedings, pp. 11–17

  • Palmqvist, L., Kirkegaard, C., Silvervarg, A., Haake, M., & Gulz, A. (2015). The relationship between working memory capacity and students’ behaviour in a teachable agent-based software. In Proceedings of AIED, pp. 670–673

  • Pan, T., & Yin, Y. (2017). Using the Bayes factors to evaluate person fit in the item response theory. Applied Measurement in Education, 30(3), 213–227.

  • Pardos, Z. A. & Heffernan, N. T. (2010). Navigating the parameter space of Bayesian Knowledge Tracing models: Visualizations of the convergence of the Expectation Maximization algorithm. In Proceedings of EDM, pp. 161–170

  • Pardos, Z. A., Wang, Q. Y., & Trivedi, S. (2012). The real world significance of performance prediction. In K. Yacef, O. R. Zaïane, A. Hershkovitz, M. Yudelson, & J. C. Stamper (Eds.), Proceedings of the 5th International Conference on Educational Data Mining, Chania, Greece, June 19-21, 2012, pp. 192–195. www.educationaldatamining.org

  • Pardos, Z. A. & Yudelson, M. V. (2013). Towards moment of learning accuracy. In Proceedings of AIED workshops

  • Pareto, L. (2014). A teachable agent game engaging primary school children to learn arithmetic concepts and reasoning. International Journal of Artificial Intelligence in Education, 24(3), 251–283.


  • Park, S., & Ryu, J. (2019). Exploring preservice teachers’ emotional experiences in an immersive virtual teaching simulation through facial expression recognition. International Journal of Human-Computer Interaction, 35(6), 521–533.


  • Parsons, E., Koedel, C., & Tan, L. (2019). Accounting for student disadvantage in value-added models. Journal of Educational and Behavioral Statistics, 44(2), 144–179.


  • Patarapichayatham, C., Kamata, A., & Kanjanawasee, S. (2012). Evaluation of model selection strategies for cross-level two-way differential item functioning analysis. Educational and Psychological Measurement, 72(1), 44–51.


  • Patikorn, T., Selent, D., Heffernan, N. T., Beck, J., & Zou, J. (2017). Using a single model trained across multiple experiments to improve the detection of treatment effects. In X. Hu, T. Barnes, A. Hershkovitz, & L. Paquette (Eds.), Proceedings of the 10th International Conference on Educational Data Mining, EDM 2017, Wuhan, Hubei, China, June 25-28, 2017. International Educational Data Mining Society (IEDMS).

  • Pavlik Jr, P. I. (2013). Mining the dynamics of student utility and strategy use during vocabulary learning. Journal of Educational Data Mining, 5(1), 39–71.

  • Pearl, L. S. (2011). When unbiased probabilistic learning is not enough: Acquiring a parametric system of metrical phonology. Language Acquisition, 18(2), 87–120.


  • Pelánek, R. (2014). Application of time decay functions and the Elo system in student modeling. In J. C. Stamper, Z. A. Pardos, M. Mavrikis, & B. M. McLaren (Eds.), Proceedings of the 7th International Conference on Educational Data Mining, EDM 2014, London, UK, July 4-7, 2014, pp. 21–27. International Educational Data Mining Society (IEDMS).

  • Pelánek, R. (2019). Measuring similarity of educational items: An overview. IEEE Transactions on Learning Technologies

  • Pelánek, R., Jarušek, P., & Klusáček, M. (2013). Modeling students’ learning and variability of performance in problem solving. In S. K. D’Mello, R. A. Calvo, & A. Olney (Eds.), Proceedings of the 6th International Conference on Educational Data Mining, Memphis, Tennessee, USA, July 6-9, 2013, pp. 256–259. International Educational Data Mining Society.

  • Pelánek, R., & Řihák, J. (2018). Analysis and design of mastery learning criteria. New Review of Hypermedia and Multimedia, 24(3), 133–159.


  • Pelánek, R. & Řihák, J. (2017). Experimental Analysis of Mastery Learning Criteria. In Proceedings of UMAP, pp. 156–163

  • Pelánek, R., & Jarušek, P. (2015). Student modeling based on problem solving times. International Journal of Artificial Intelligence in Education, 25(4), 493–519.


  • Periathiruvadi, S., Tyler-Wood, T., Knezek, G., & Christensen, R. (2012). Simulating students with learning disabilities in virtual classrooms: A validation study. In P. Resta (Ed.), Proceedings of Society for Information Technology & Teacher Education International Conference 2012, pp. 2588–2595, Austin, Texas, USA

  • Pichette, F., Béland, S., Jolani, S., & Leśniewska, J. (2015). The handling of missing binary data in language research. Studies in Second Language Learning and Teaching, 5, 153–169.


  • Piech, C., Bumbacher, E., & Davis, R. (2020). Measuring ability-to-learn using parametric learning-gain functions. In Proceedings of EDM

  • Poitras, E., Doleck, T., Huang, L., Li, S., & Lajoie, S. (2017). Advancing teacher technology education using open-ended learning environments as research and training platforms. Australasian Journal of Educational Technology, 33(3)

  • Poitras, E. & Fazeli, N. (2016). Using an intelligent web browser for teacher professional development: Preliminary findings from simulated learners. In G. Chamblee, & L. Langub (Eds.), Proceedings of Society for Information Technology & Teacher Education International Conference 2016, pp. 3037–3041

  • Raborn, A. W., Leite, W. L., & Marcoulides, K. M. (2019). A comparison of automated scale short form selection strategies. In M. C. Desmarais, C. F. Lynch, A. Merceron, & R. Nkambou (Eds.), Proceedings of the 12th International Conference on Educational Data Mining, EDM 2019, Montréal, Canada, July 2-5, 2019. International Educational Data Mining Society (IEDMS).

  • Rafferty, A., Ying, H., & Williams, J. (2019). Statistical consequences of using multi-armed bandits to conduct adaptive educational experiments. Journal of Educational Data Mining, 11(1), 47–79.

  • Rhemtulla, M., Jia, F., Wu, W., & Little, T. D. (2014). Planned missing designs to optimize the efficiency of latent growth parameter estimates. International Journal of Behavioral Development, 38(5), 423–434.

  • Řihák, J. & Pelánek, R. (2017). Measuring similarity of educational items using data on learners’ performance. In X. Hu, T. Barnes, A. Hershkovitz, & L. Paquette (Eds.), Proceedings of the 10th International Conference on Educational Data Mining, EDM 2017, Wuhan, Hubei, China, June 25-28, 2017. International Educational Data Mining Society (IEDMS).

  • Ritter, S., Harris, T. K., Nixon, T., Dickison, D., Murray, R. C., & Towle, B. (2009). Reducing the knowledge tracing space. In T. Barnes, M. C. Desmarais, C. Romero, & S. Ventura (Eds.), Proceedings of EDM, pp. 151–160

  • Robinson, K., Jahanian, K., & Reich, J. (2018). Using online practice spaces to investigate challenges in enacting principles of equitable computer science teaching. In Proceedings of the 49th ACM Technical Symposium on Computer Science Education, SIGCSE ’18, pp. 882–887, New York, NY, USA. Association for Computing Machinery.

  • Rupp, A. A. & van Rijn, P. W. (2018). GDINA and CDM packages in R. Measurement: Interdisciplinary Research and Perspectives, 16(1), 71–77.

  • Rutkowski, L. (2011). The impact of missing background data on subpopulation estimation. Journal of Educational Measurement, 48(3), 293–312.


  • Rutkowski, L. (2014). Sensitivity of achievement estimation to conditioning model misclassification. Applied Measurement in Education, 27(2), 115–132.


  • Sabourin, J. L., Rowe, J. P., Mott, B. W., & Lester, J. C. (2013). Considering alternate futures to classify off-task behavior as emotion self-regulation: A supervised learning approach. Journal of Educational Data Mining, 5(1), 9–38.


  • Sauro, H. M., Harel, D., Kwiatkowska, M., Shaffer, C. A., Uhrmacher, A. M., Hucka, M., Mendes, P., Stromback, L., & Tyson, J. J. (2006). Challenges for modeling and simulation methods in systems biology. In Proceedings of the 2006 Winter Simulation Conference, pp. 1720–1730

  • Saygin, A. P., Cicekli, I., & Akman, V. (2000). Turing test: 50 years later. Minds and Machines: Journal for Artificial Intelligence, Philosophy and Cognitive Science, 10(4), 463–518.


  • Schatschneider, C., Wagner, R. K., Hart, S. A., & Tighe, E. L. (2016). Using simulations to investigate the longitudinal stability of alternative schemes for classifying and identifying children with reading disabilities. Scientific Studies of Reading, 20(1), 34–48.


  • Schweizer, K., Reiß, S., & Troche, S. (2019). Does the effect of a time limit for testing impair structural investigations by means of confirmatory factor models? Educational and psychological measurement, 79(1), 40–64.


  • Schwendimann, B. A., Rodriguez-Triana, M. J., Vozniuk, A., Prieto, L. P., Boroujeni, M. S., Holzer, A., Gillet, D., & Dillenbourg, P. (2016). Perceiving learning at a glance: A systematic literature review of learning dashboard research. IEEE Transactions on Learning Technologies, 10(1), 30–41.


  • Segal, A., Ben David, Y., Williams, J. J., Gal, K., & Shalom, Y. (2018). Combining difficulty ranking with multi-armed bandits to sequence educational content. In C. Penstein Rosé, R. Martínez-Maldonado, H. U. Hoppe, R. Luckin, M. Mavrikis, K. Porayska-Pomsta, B. McLaren, & B. du Boulay (Eds.), Artificial Intelligence in Education, pp. 317–321, Cham. Springer International Publishing.

  • Shimada, A., Mouri, K., Taniguchi, Y., Ogata, H., Taniguchi, R.-i., & Konomi, S. (2019). Optimizing assignment of students to courses based on learning activity analytics. International Educational Data Mining Society

  • Shimmei, M. & Matsuda, N. (2020). Learning a policy primes quality control: Towards evidence-based automation of learning engineering. In Proceedings of EDM

  • Shulruf, B., Poole, P., Jones, P., & Wilkinson, T. (2015). The objective borderline method: a probabilistic method for standard setting. Assessment & Evaluation in Higher Education, 40(3), 420–438.


  • Si, Y., & Reiter, J. P. (2013). Nonparametric Bayesian multiple imputation for incomplete categorical variables in large-scale assessment surveys. Journal of Educational and Behavioral Statistics, 38(5), 499–521.

  • Sjödén, B., Tärning, B., Pareto, L., & Gulz, A. (2011b). Transferring teaching to testing–an unexplored aspect of teachable agents. In Proceedings of AIED, pp. 337–344

  • Sobolev, B., Harel, D., Vasilakis, C., & Levy, A. (2008). Using the Statecharts paradigm for simulation of patient flow in surgical care. Health Care Management Science, 11(1), 79–86.


  • Socha, A., & DeMars, C. E. (2013). A note on specifying the guessing parameter in ATFIND and DIMTEST. Applied Psychological Measurement, 37(1), 87–92.

  • Spoon, K., Beemer, J., Whitmer, J. C., Fan, J., Frazee, J. P., Stronach, J., Bohonak, A. J., & Levine, R. A. (2016). Random forests for evaluating pedagogy and informing personalized learning. Journal of Educational Data Mining, 8(2), 20–50.


  • Stamper, J. & Moore, S. (2019b). Exploring Teachable Humans and Teachable Agents: Human Strategies Versus Agent Policies and the Basis of Expertise. In S. Isotani, E. Millán, A. Ogan, P. Hastings, B. McLaren, & R. Luckin (Eds.), Proceedings of AIED workshops, pp. 269–274

  • Sterrett, S. G. (2003). Turing’s Two Tests for Intelligence, pp. 79–97. Springer Netherlands, Dordrecht

  • Su, P.-H., Wu, C.-H., & Lee, L.-S. (2015). A recursive dialogue game for personalized computer-aided pronunciation training. IEEE/ACM Trans. Audio, Speech and Lang. Proc., 23(1), 127–141.

  • Sünbül, S. Ö. (2018). The impact of different missing data handling methods on DINA model. International Journal of Evaluation and Research in Education, 7(1), 77–86.

  • Sutherland, S., Davidmann, S., Flake, P., & Moorby, P. (2006). SystemVerilog for design: A guide to using SystemVerilog for hardware design and modeling, vol. 2.

  • Sweet, S. J. & Rupp, A. A. (2012). Using the ECD framework to support evidentiary reasoning in the context of a simulation study for detecting learner differences in epistemic games. Journal of Educational Data Mining, 4(1), 183–223.

  • Tendeiro, J. N., & Meijer, R. R. (2012). A CUSUM to detect person misfit: A discussion and some alternatives for existing procedures. Applied Psychological Measurement, 36(5), 420–442.

  • Thiessen, E. D., & Pavlik, P. I. (2016). Modeling the role of distributional information in children’s use of phonemic contrasts. Journal of Memory and Language, 88, 117–132.


  • Thompson, W. J., Clark, A. K., & Nash, B. (2019). Measuring the reliability of diagnostic mastery classifications at multiple levels of reporting. Applied Measurement in Education, 32(4), 298–309.


  • Toland, M. D. (2014). Practical guide to conducting an item response theory analysis. The Journal of Early Adolescence, 34(1), 120–151.


  • Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(October), 433–60.


  • van Rijn, P. W., Sinharay, S., Haberman, S. J., & Johnson, M. S. (2016). Assessment of fit of item response theory models used in large-scale educational survey assessments. Large-scale Assessments in Education, 4(1), 10.


  • VanLehn, K., Ohlsson, S., & Nason, R. (1994). Applications of Simulated Students: An Exploration. International Journal of Artificial Intelligence in Education, 5(2), 135–175.

  • von Ahn, L., Blum, M., Hopper, N. J., & Langford, J. (2003). CAPTCHA: Using hard AI problems for security. In E. Biham (Ed.), Advances in Cryptology – EUROCRYPT 2003, pp. 294–311, Berlin, Heidelberg. Springer Berlin Heidelberg.

  • Wang, F.-H. (2012). On extracting recommendation knowledge for personalized web-based learning based on ant colony optimization with segmented-goal and meta-control strategies. Expert Systems with Applications, 39(7), 6446–6453.


  • Wei, H., & Lin, J. (2015). Using out-of-level items in computerized adaptive testing. International Journal of Testing, 15(1), 50–70.


  • Weitekamp, D., Harpstead, E., & Koedinger, K. R. (2020a). An interaction design for machine teaching to develop AI tutors. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, CHI ’20, pp. 1–11, New York, NY, USA. Association for Computing Machinery.

  • Weitekamp, D., Ye, Z., Rachatasumrit, N., Harpstead, E., & Koedinger, K. (2020b). Investigating differential error types between human and simulated learners. In International Conference on Artificial Intelligence in Education, pp. 586–597

  • Whalen, A., & Griffiths, T. L. (2017). Adding population structure to models of language evolution by iterated learning. Journal of Mathematical Psychology, 76, 1–6.


  • Wieman, C. E., Adams, W. K., & Perkins, K. K. (2008). PhET: Simulations that enhance learning. Science, 322(5902), 682–683.

  • Wray, R. E. (2019). Enhancing simulated students with models of self-regulated learning. In Proceedings of Augmented Cognition, pp. 644–654

  • Wyse, A. E., & Albano, A. D. (2015). Considering the use of general and modified assessment items in computerized adaptive testing. Applied Measurement in Education, 28(2), 156–167.


  • Xue, K., Corinne, A., & Leite, W. (2020). Semi-supervised Learning Method for Adjusting Biased Item Difficulty Estimates Caused by Nonignorable Missingness under 2PL-IRT Model. Proceedings of EDM, pp. 715–719

  • Yang, J. S., & Zheng, X. (2018). Item response data analysis using Stata item response theory package. Journal of Educational and Behavioral Statistics, 43(1), 116–129.

  • Yao, L. (2013). Comparing the performance of five multidimensional CAT selection procedures with different stopping rules. Applied Psychological Measurement, 37(1), 3–23.

  • Yao, L. (2014). Multidimensional CAT item selection methods for domain scores and composite scores with item exposure control and content constraints. Journal of Educational Measurement, 51(1), 18–38.

  • Yarzebinski, E., Dumdumaya, C., Rodrigo, M. M. T., Matsuda, N., & Ogan, A. (2017). Regional cultural differences in how students customize their avatars in technology-enhanced learning. In E. André, R. Baker, X. Hu, M. M. T. Rodrigo, & B. du Boulay (Eds.), Artificial Intelligence in Education (pp. 598–601). Cham. Springer International Publishing.


  • Yosef, G., Walko, R., Avisar, R., Tatarinov, F., Rotenberg, E., & Yakir, D. (2018). Large-scale semi-arid afforestation can enhance precipitation and carbon sequestration potential. Scientific Reports, 8(1), 996.


  • Zhang, Z. (2018). Designing cognitively diagnostic assessment for algebraic content knowledge and thinking skills. International Education Studies, 11(2), 106–117.



Acknowledgements

The authors would like to thank Ido Roll for providing valuable feedback on an early version of this paper, and Lucas Ramirez for assisting with data visualizations. GA’s research was generously supported by the Estate of Emile Mimran and by the Maurice and Vivienne Wohl Biology Endowment. TK’s research was substantially funded by the Swiss State Secretariat for Education, Research and Innovation SERI.

Author information

Corresponding author

Correspondence to Giora Alexandron.

Ethics declarations

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Tanja Käser and Giora Alexandron contributed equally to this work.

Appendix

Table 2 Coding of all included research papers (lit=literature, prev=previous, data=data-driven, theory=theory-driven)
Table 3 Categorization of venues into fields

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Käser, T., Alexandron, G. Simulated Learners in Educational Technology: A Systematic Literature Review and a Turing-like Test. Int J Artif Intell Educ 34, 545–585 (2024). https://doi.org/10.1007/s40593-023-00337-2

  • DOI: https://doi.org/10.1007/s40593-023-00337-2

Keywords