Abstract
Simulation is a powerful approach that plays a significant role in science and technology. Computational models that simulate learner interactions and data hold great promise for educational technology as well. Among other uses, simulated learners can support teacher training, the generation and evaluation of hypotheses on human learning, the development of adaptive learning algorithms, the construction of virtual worlds in which students practice collaboration skills with simulated peers, and the testing of learning environments. This paper provides the first systematic literature review on simulated learners in the broad area of artificial intelligence in education and related fields, focusing on the decade 2010-2019. We analyze the trends in the use of simulated learners in educational technology within this decade, the purposes for which simulated learners are used, and how the validity of the simulated learners is assessed. We find that simulated learner models tend to represent only narrow aspects of student learning. Surprisingly, we also find that almost half of the studies using simulated learners provide no evidence that their modeling addresses the most fundamental question in simulation design: is the model valid? This poses a threat to the reliability of results that are based on these models. Based on our findings, we propose that future research should focus on developing more complete simulated learner models. To validate these models, we suggest a standard and universal criterion based on the enduring idea of the Turing test. We discuss the properties of this test and its potential to move the field of simulated learners forward.
Discover the latest articles, news and stories from top researchers in related subjects.References
Abdi, S., Khosravi, H., Sadiq, S. W., & Gasevic, D. (2019). A multivariate elo-based learner model for adaptive educational systems. In M.C. Desmarais, C. F.Lynch, A. Merceron, & R. Nkambou,(Eds.), Proceedings of the 12th International Conference on Educational Data Mining, EDM 2019, Montréal, Canada, July 2-5, 2019. International Educational Data Mining Society (IEDMS)
Alexandron, G., Yoo, L. Y., Ruipérez-Valiente, J. A., Lee, S., & Pritchard, D. E. (2019). Are mooc learning analytics results trustworthy? with fake learners, they might not be! International Journal of Artificial Intelligence in Education, 29, 484–506.
Arikan, Ç. A. (2018). The effect of mini and midi anchor tests on test equating. International Journal of Progressive Education, 14(2), 148–160.
Aşiret, S. & Sünbül, S.Ö. (2016). Investigating test equating methods in small samples through various factors. Educational Sciences: Theory & Practice, 16(2)
Badiee, F., & Kaufman, D. (2015). Design evaluation of a simulation for teacher education. Sage Open, 5(2), 2158244015592454.
Bartocci, E., & Lió, P. (2016). Computational modeling, formal analysis, and tools for systems biology. PLOS Computational Biology, 12(1), 1–22.
Bazaldua, D. A. L., Lee, Y.-S., Keller, B., & Fellers, L. (2017). Assessing the performance of classical test theory item discrimination estimators in monte carlo simulations. Asia Pacific Education Review, 18(4), 585–598.
Beck, J.E. (2002). Directing development effort with simulated students. In Proceedings of Intelligent Tutoring Systems, pp 851–860.
Bellomo, N., & Dogbe, C. (2011). On the modeling of traffic and crowds: A survey of models, speculations, and perspectives. SIAM Review, 53(3), 409–463.
Bengs, D. & Brefeld, U. (2014). Computer-based adaptive speed tests. In J. C. Stamper, Z. A. Pardos, M. Mavrikis, & B. M. McLaren (Eds.), Proceedings of the 7th International Conference on Educational Data Mining, EDM 2014, London, UK, July 4-7, 2014, pp. 221–224 International Educational Data Mining Society (IEDMS).
Bergner, Y., Dröschler, S., Kortemeyer, G., Rayyan, S., Seaton, D. T., & Pritchard, D. E. (2012). Model-based collaborative filtering analysis of student response data: Machine-learning item response theory. In K. Yacef, O. R. Zaïane, A. Hershkovitz, M. Yudelson, & J. C. Stamper (Eds.) , Proceedings of the 5th International Conference on Educational Data Mining, Chania, Greece, June 19-21, 2012, pp. 95–102. www.educationaldatamining.org
Boel, R., & Mihaylova, L. (2006). A compositional stochastic model for real time freeway traffic simulation. Transportation Research Part B: Methodological, 40(4), 319–334.
Borjigin, A., Miao, C., Lim, S. F., Li, S., & Shen, Z. (2015). Teachable agents with intrinsic motivation. In C. Conati, N. Heffernan, A. Mitrovic, & M. F. Verdejo (Eds.), Artificial Intelligence in Education (pp. 34–43). Cham. Springer International Publishing.
Botelho, A. F., Adjei, S., & Heffernan, N. T. (2016). Modeling interactions across skills: A method to construct and compare models predicting the existence of skill relationships. In T. Barnes, M. Chi, & M. Feng (Eds.), Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, Raleigh, North Carolina, USA, June 29 - July 2, 2016, pp. 292–297. International Educational Data Mining Society (IEDMS)
Briggs, D. C., & Circi, R. (2017). Challenges to the use of artificial neural networks for diagnostic classifications with student test data. International Journal of Testing, 17(4), 302–321.
Bringula, R. P., Basa, R. S., Cruz, C. D., & Rodrigo, M. M. T. (2016). Effects of prior knowledge in mathematics on learner-interface interactions in a learning-by-teaching intelligent tutoring system. Journal of Educational Computing Research, 54(4), 462–482.
Brodland, G. W. (2015). How computational models can help unlock biological systems. Seminars in Cell & Developmental Biology, Coding and non-coding RNAs & Mammalian development, 47–48, 62–73.
Brown, J. & Eskenazi, M. (2006). Using simulated students for the assessment of authentic document retrieval. In M. Ikeda, K. D. Ashley, & T.-W. Chan (Eds.), Intelligent Tutoring Systems, pp. 685–688
Burer, S., & Piccialli, V. (2019). Three methods for robust grading. European Journal of Operational Research, 272(1), 364–371.
Calderón, A., Boubeta-Puig, J., & Ruiz, M. (2018). Medit4cep-gam: A model-driven approach for user-friendly gamification design, monitoring and code generation in cep-based systems. Information and Software Technology, 95, 238–264.
Carlson, R., Keiser, V., Matsuda, N., Koedinger, K. R., & Penstein Rosé, C. (2012). Building a conversational simstudent. In S. A. Cerri, W. J. Clancey, G. Papadourakis, & K. Panourgia (Eds.), Intelligent Tutoring Systems (pp. 563–569). Heidelberg, Springer: Berlin.
Cascante, M., Boros, L. G., Comin-Anduix, B., de Atauri, P., Centelles, J. J., & Lee, P.W.-N. (2002). Metabolic control analysis in drug discovery and disease. Nature Biotechnology, 20(3), 243–249.
Castellano, K. E., & Ho, A. D. (2013). Contrasting ols and quantile regression approaches to student “growth’’ percentiles. Journal of Educational and Behavioral Statistics, 38(2), 190–215.
Castellano, K. E., & Ho, A. D. (2015). Practical differences among aggregate-level conditional status metrics: From median student growth percentiles to value-added models. Journal of Educational and Behavioral Statistics, 40(1), 35–68.
Chambers, S. (2016). Regression discontinuity design: a guide for strengthening causal inference in hrd. European Journal of Training and Development
Champaign, J. & Cohen, R. (2010). A multiagent, ecological approach to content sequencing. In Proceedings of AAMAS, pp. 10–4
Chaplot, D. S., MacLellan, C., Salakhutdinov, R., & Koedinger, K. (2018). Learning cognitive models using neural networks. In C. Penstein Rosé, R. Martínez-Maldonado, H. U. Hoppe, R. Luckin, M. Mavrikis, K. Porayska-Pomsta, B. McLaren, & B. du Boulay (Eds.), Artificial Intelligence in Education, pp. 43–56, Cham Springer International Publishing.
Chen, Y., González-Brenes, J. P., & Tian, J. (2016). Joint discovery of skill prerequisite graphs and student models. In T. Barnes, M. Chi, & M. Feng (Eds.), Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, Raleigh, North Carolina, USA, June 29 - July 2, 2016, pp. 46–53 International Educational Data Mining Society (IEDMS)
Chen, Y., Wuillemin, P., & Labat, J. (2015). Discovering prerequisite structure of skills through probabilistic association rules mining. In O. C. Santos, J. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, P. Mitros, J. M. Luna, M. C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. C. Desmarais (Eds.), Proceedings of the 8th International Conference on Educational Data Mining, EDM 2015, Madrid, Spain, June 26-29, 2015, pp. 117–124. International Educational Data Mining Society (IEDMS)
Clement, B., Oudeyer, P., & Lopes, M. (2016). A comparison of automatic teaching strategies for heterogeneous student populations. In T. Barnes, M. Chi, & M. Feng (Eds.), Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, Raleigh, North Carolina, USA, June 29 - July 2, 2016, pp. 330–335. International Educational Data Mining Society (IEDMS)
Clement, B., Roy, D., Oudeyer, P., & Lopes, M. (2015). Multi-armed bandits for intelligent tutoring systems. In O. C. Santos, J. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, P. Mitros, J. M. Luna, M. C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. C. Desmarais (Eds.), Proceedings of the 8th International Conference on Educational Data Mining, EDM 2015, Madrid, Spain, June 26-29, 2015, pp. 21. International Educational Data Mining Society (IEDMS)
Conati, C., Fratamico, L., Kardan, S., and Roll, I. (2015). Comparing representations for learner models in interactive simulations. In International Conference on Artificial Intelligence in Education, pp. 74–83. Springer
Cramman, H., Gott, S., Little, J., Merrell, C., Tymms, P., & Copping, L. T. (2020). Number identification: a unique developmental pathway in mathematics? Research Papers in Education, 35(2), 117–143.
Crowston, K., Østerlund, C., Lee, T. K., Jackson, C., Harandi, M., Allen, S., Bahaadini, S., Coughlin, S., Katsaggelos, A. K., Larson, S. L., et al. (2019). Knowledge tracing to model learning in online citizen science projects. IEEE Transactions on Learning Technologies, 13(1), 123–134.
Cui, Y., Chu, M.-W., & Chen, F. (2019). Analyzing student process data in game-based assessments with bayesian knowledge tracing and dynamic bayesian networks. Journal of Educational Data Mining, 11(1), 80–100.
Cui, Y., Gierl, M. J., & Chang, H.-H. (2012). Estimating classification consistency and accuracy for cognitive diagnostic assessment. Journal of Educational Measurement, 49(1), 19–38.
De La Torre, J. (2011). The generalized dina model framework. Psychometrika, 76(2), 179–199.
Deale, D., & Pastore, R. (2014). Evaluation of simschool: An instructional simulation for pre-service teachers. Computers in the Schools, 31(3), 197–219.
Debeer, D., Janssen, R., & De Boeck, P. (2017). Modeling skipped and not-reached items using irtrees. Journal of Educational Measurement, 54(3), 333–363.
DeMars, C. E. (2020). Multilevel rasch modeling: Does misfit to the rasch model impact the regression model? The Journal of Experimental Education, 88(4), 605–619.
Desmarais, M. C. (2011). Conditions for effectively deriving a q-matrix from data with non-negative matrix factorization. best paper award. In M. Pechenizkiy, T. Calders, C. Conati, S. Ventura, C. Romero, & J. C. Stamper (Eds.), Proceedings of the 4th International Conference on Educational Data Mining, Eindhoven, The Netherlands, July 6-8, 2011, pp. 41–50 www.educationaldatamining.org
Desmarais, M. C. & Pelczer, I. (2010). On the faithfulness of simulated student performance data. In R. S. J. de Baker, A. Merceron, & P. I. P. Jr.(Eds.) Educational Data Mining 2010, The 3rd International Conference on Educational Data Mining, Pittsburgh, PA, USA, June 11-13, 2010. Proceedings, pp. 21–30. www.educationaldatamining.org.
Dickison, D., Ritter, S., Nixon, T., Harris, T. K., Towle, B., Murray, R. C., & Hausmann, R. G. M. (2010b). Predicting the effects of skill model changes on student progress. In Proceedings of Intelligent Tutoring Systems, pp. 300–302
Dimitrov, D. M. (2020). Modeling of item response functions under the d-scoring method. Educational and Psychological Measurement, 80(1), 126–144.
Ding, X. & Larson, E. C. (2019). Why deep knowledge tracing has less depth than anticipated. In M. C. Desmarais, C. F. Lynch, A. Merceron, & R. Nkambou (Eds.), Proceedings of the 12th International Conference on Educational Data Mining, EDM 2019, Montréal, Canada, July 2-5, 2019. International Educational Data Mining Society (IEDMS)
Dorça, F. A. (2015). Implementation and use of Simulated Students for Test and Validation of new Adaptive Educational Systems: a Practical Insight. International Journal of Artificial Intelligence in Education (IJAIED), 25319–345
Durán, E. B., & Amandi, A. (2011). Personalised collaborative skills for student models. Interactive Learning Environments, 19(2), 143–162.
Ebert, R. (2011). Remaking my voice. Ted Talk. [Accessed: 2020 06 01]
Erickson, G., Frost, S., Bateman, S., & McCalla, G. (2013). Using the ecological approach to create simulations of learning environments. In Artificial Intelligence in Education, pp. 411–420
Fancsali, S., Nixon, T., & Ritter, S. (2013a). Optimal and Worst-Case Performance of Mastery Learning Assessment with Bayesian Knowledge Tracing. In Proceedings of EDM
Fancsali, S. E., Nixon, T., Vuong, A., & Ritter, S. (2013b). Simulated Students, Mastery Learning, and Improved Learning Curves for Real-World Cognitive Tutors. In Proceedings of AIED Workshops
Faucon, L., Kidzinski, L., & Dillenbourg, P. (2016). Semi-Markov model for simulating MOOC students. In Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, pp. 358–363
Feigenbaum, E. A. (2003). Some challenges and grand challenges for computational intelligence. J. ACM, 50(1), 32–40.
Feuerstahler, L., & Wilson, M. (2019). Scale alignment in between-item multidimensional rasch models. Journal of Educational Measurement, 56(2), 280–301.
Fitzpatrick, J., & Skorupski, W. P. (2016). Equating with miditests using irt. Journal of Educational Measurement, 53(2), 172–189.
Fletcher, J. (2009). Education and training technology in the military. Science, 323(5910), 72–75.
Folsom-Kovarik, J. T., Sukthankar, G., & Schatz, S. (2013). Tractable pomdp representations for intelligent tutoring systems. ACM Trans. Intell. Syst. Technol., 4(2)
Frost, S. & McCalla, G. (2013). Exploring through Simulation the Effects of Peer Impact on Learning. In Proceedings of AIED Workshops
Frost, S. & McCalla, G. (2015). Exploring Through Simulation an Instructional Planner for Dynamic Open-Ended Learning Environments. In C. Conati, N. Heffernan, A. Mitrovic, & M. F. Verdejo (Eds.), Proceedings of AIED, pp. 578–581
Gibson, D. (2013). Assessing teaching skills with a mobile simulation. Journal of Digital Learning in Teacher Education, 30(1), 4–10.
González-Brenes, J. P. & Huang, Y. (2015b). Using Data from Real and Simulated Learners to Evaluate Adaptive Tutoring Systems. In Proceedings of AIED Workshops
González-Brenes, J. P. & Mostow, J. (2012). Dynamic cognitive tracing: Towards unified discovery of student and cognitive models. In K. Yacef, O. R. Zaïane, A. Hershkovitz, M. Yudelson, & J. C. Stamper (Eds.), Proceedings of the 5th International Conference on Educational Data Mining, Chania, Greece, June 19-21, 2012, pp. 49–56. www.educationaldatamining.org
Govindarajan, K., Kumar, V. S., Boulanger, D., & Kinshuk (2015). Learning analytics solution for reducing learners’ course failure rate. In 2015 IEEE Seventh International Conference on Technology for Education (T4E), pp. 83–90
Gu, J., Cai, H., & Beck, J. E. (2014). Investigate Performance of Expected Maximization on the Knowledge Tracing Model. In Proceedings of ITS, pp. 156–161
Guarino, C. M., Reckase, M. D., & Wooldridge, J. M. (2015). Can value-added measures of teacher performance be trusted? Education Finance and Policy, 10(1), 117–156.
Guarino, C. M., Stacy, B. W., & Wooldridge, J. M. (2019). Comparing and assessing the consequences of two different approaches to measuring school effectiveness. Educational Assessment, Evaluation and Accountability, 31(4), 437–463.
Harel, D. (2005). A turing-like test for biological modeling. Nature biotechnology, 23, 495–6.
Heliövaara, S., Korhonen, T., Hostikka, S., & Ehtamo, H. (2012). Counterflow model for agent-based simulation of crowd dynamics. Building and Environment, 48, 89–100.
Hernando, M., Guzmán, E., & Conejo, R. (2013). Validating item response theory models in simulated environments. In Proceedings of the AIED Workshop on Simulated Learners, pp. 41–50
Hingston, P. (2009). A turing test for computer game bots. IEEE Transactions on Computational Intelligence and AI in Games, 1(3), 169–186.
Hintze, J. M., Wells, C. S., Marcotte, A. M., & Solomon, B. G. (2018). Decision-making accuracy of cbm progress-monitoring data. Journal of Psychoeducational Assessment, 36(1), 74–81.
III, D. W., Harpstead, E., MacLellan, C. J., Rachatasumrit, N., & Koedinger, K. R. (2019). Toward near zero-parameter prediction using a computational model of student learning. In M. C. Desmarais, C. F. Lynch, A. Merceron, & R. Nkambou (Eds.), Proceedings of the 12th International Conference on Educational Data Mining, EDM 2019, Montréal, Canada, July 2-5, 2019. International Educational Data Mining Society (IEDMS)
Iman, S. & Joshi, S. (2007). The e hardware verification language.Springer Science & Business Media
Jr., P. I. P. & Wu, S. (2011). dynamical system model of microgenetic changes in performance, efficacy, strategy use and value during vocabulary learning. In M. Pechenizkiy, T. Calders, C. Conati, S. Ventura, C. Romero, & J. C. Stamper (Eds.), Proceedings of the 4th International Conference on Educational Data Mining, Eindhoven, The Netherlands, July 6-8, 2011, pp. 277–282. www.educationaldatamining.org
Kaiser, J., Retelsdorf, J., & Südkamp, A., & Möller, J. (2013). Achievement and engagement: How student characteristics influence teacher judgments. Learning and Instruction, 28, 73–84.
Kalkan, Ö. K., Kelecioglu, H., & Basokçu, T. O. (2018). Comparison of cognitive diagnosis models under changing conditions: Dina, rdina, hodina and hordina. International Education Studies, 11(6), 119–131.
Kallonis, P. & Sampson, D. G. (2011). A 3d virtual classroom simulation for supporting school teachers training based on synectics - "making the strange familiar". In 2011 IEEE 11th International Conference on Advanced Learning Technologies, pp. 4–6
Khajah, M., Lindsey, R. V., & Mozer, M. (2016). How deep is knowledge tracing? In T. Barnes, M. Chi, & M. Feng (Eds), Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, Raleigh, North Carolina, USA, June 29 - July 2, 2016. International Educational Data Mining Society (IEDMS)
Khodeir, N., Wanas, N., Darwish, N., & Hegazy, N. (2014). Bayesian based adaptive question generation technique. Journal of Electrical Systems and Information Technology, 1(1), 10–16.
Kim, S. Y., & Lee, W.-C. (2019). Classification consistency and accuracy for mixed-format tests. Applied Measurement in Education, 32(2), 97–115.
Kitano, H. (2002). Computational systems biology. Nature, 420(6912), 206–210.
Kitchen, N. & Kuehlmann, A. (2007). Stimulus generation for constrained random simulation. In 2007 IEEE/ACM International Conference on Computer-Aided Design, pp. 258–265
Kitchenham, B. & Charters, S. (2007). Guidelines for performing systematic literature reviews in software engineering
Klingler, S., Käser, T., Solenthaler, B., & Gross, M. (2015). On the Performance Characteristics of Latent-Factor and Knowledge Tracing Models. In Proceedings of EDM, pp. 37–44
Klingler, S., Käser, T., Solenthaler, B., & Gross, M. (2016). Temporally coherent clustering of student data. International Educational Data Mining Society
Knezek, G., Hopper, S. B., Christensen, R., Tyler-Wood, T., & Gibson, D. C. (2015). Assessing pedagogical balance in a simulated classroom environment. Journal of Digital Learning in Teacher Education, 31(4), 148–159.
Koçak, D. (2020). The effect of chance success on equalization error in test equation based on classical test theory. International Journal of Progressive Education, 16(2)
Koedinger, K. R., Matsuda, N., MacLellan, C. J., & McLaughlin, E. A. (2015). Methods for Evaluating Simulated Learners: Examples from SimStudent. In Proceedings of AIED Workshops
Kopp, J. P., & Jones, A. T. (2020). Impact of item parameter drift on rasch scale stability in small samples over multiple administrations. Applied Measurement in Education, 33(1), 24–33.
KÖSE, İ. A. (2014). Assessing model data fit of unidimensional item response theory models in simulated data. Educational Research and Reviews, 9(17), 642–649.
Kurtz, M. D. (2018). Value-added and student growth percentile models: What drives differences in estimated classroom effects? Statistics and Public Policy, 5(1), 1–8.
Labutov, I. & Studer, C. (2016). Calibrated self-assessment. In T. Barnes, M. Chi, & M. Feng (Eds.), Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, Raleigh, North Carolina, USA, June 29 - July 2, 2016. International Educational Data Mining Society (IEDMS)
LaHuis, D. M., Bryant-Lees, K. B., Hakoyama, S., Barnes, T., & Wiemann, A. (2018). A comparison of procedures for estimating person reliability parameters in the graded response model. Journal of Educational Measurement, 55(3), 421–432.
Lateef, F. (2010). Simulation-based learning: Just like the real thing. Journal of emergencies, trauma, and shock, 3, 348–52.
Lee, C.-S., Wang, M.-H., & Huang, C.-H. (2015). Performance verification mechanism for adaptive assessment e-platform and e-navigation application. International Journal of e-Navigation and Maritime Economy, 2, 47–62.
Lee, G., & Lee, W.-C. (2016). Bi-factor mirt observed-score equating for mixed-format tests. Applied Measurement in Education, 29(3), 224–241.
Leelawong, K., & Biswas, G. (2008). Designing learning by teaching agents: The betty’s brain system. Int. J. Artif. Intell. Ed. (IJAIED), 18(3), 181–208.
Lelei, D. & McCalla, G. (2018a). The role of simulation in the development of mentoring technology to support longer-term learning. In Proceedings of AIED Workshops
Lelei, D. & McCalla, G. (2019). How Many Times Should a Pedagogical Agent Simulation Model Be Run? In Proceedings of AIED, pages 182–193.
Lelei, D. E. K. & McCalla, G. (2018b). How to use simulation in the design and evaluation of learning environments with self-directed longer-term learners. In C. Penstein Rosé, R. Martínez-Maldonado, H. U. Hoppe, R. Luckin, M. Mavrikis, K. Porayska-Pomsta, B. McLaren, & B. du Boulay (Eds.), Artificial Intelligence in Education, pp. 253–266, Cham. Springer International Publishing
Lelei, D. E. K. & McCalla, G. (2018c). How to Use Simulation in the Design and Evaluation of Learning Environments with Self-directed Longer-Term Learners. In Proceedings of AIED
Lenat, D. B. & Durlach, P. J. (2014). Reinforcing math knowledge by immersing students in a simulated learning-by-teaching experience. International Journal of Artificial Intelligence in Education, 43):216–250
Levy, R. (2019). Dynamic bayesian network modeling of game-based diagnostic assessments. Multivariate Behavioral Research, 54(6), 771–794.
Li, N., Cohen, W. W., & Koedinger, K. R. (2012). Efficient cross-domain learning of complex skills. In S. A. Cerri, W. J. Clancey, G. Papadourakis, & K. Panourgia (Eds.), Intelligent Tutoring Systems (pp. 493–498). Heidelberg Springer: Berlin.
Li, N., Cohen, W. W., & Koedinger, K. R. (2012). Learning to perceive two-dimensional displays using probabilistic grammars. In P. A. Flach, T. De Bie, & N. Cristianini (Eds.), Machine Learning and Knowledge Discovery in Databases (pp. 773–788). Heidelberg Springer: Berlin.
Li, N., Cohen, W. W., & Koedinger, K. R. (2013). Problem order implications for learning. International Journal of Artificial Intelligence in Education, 23(1), 71–93.
Li, N., Cohen, W. W., Koedinger, K. R., & Matsuda, N. (2011). A machine learning approach for automatic student model discovery. In M. Pechenizkiy, T. Calders, C. Conati, S. Ventura, C. Romero, & J. C. Stamper (Eds.), Proceedings of the 4th International Conference on Educational Data Mining, Eindhoven, The Netherlands, July 6-8, 2011, pp. 31–40. www.educationaldatamining.org.
Li, N., Matsuda, N., Cohen, W. W., & Koedinger, K. R. (2015). Integrating representation learning and skill learning in a human-like intelligent agent. Artificial Intelligence, 219, 67–91.
Li, N., Oyler, D. W., Zhang, M., Yildiz, Y., Kolmanovsky, I., & Girard, A. R. (2018). Game theoretic modeling of driver and vehicle interactions for verification and validation of autonomous vehicle control systems. IEEE Transactions on Control Systems Technology, 26(5), 1782–1797.
Li, N., Tian, Y., Cohen, W. W., & Koedinger, K. R. (2013). Integrating perceptual learning with external world knowledge in a simulated student. In H. C. Lane, K. Yacef, J. Mostow, & P. Pavlik (Eds.), Artificial Intelligence in Education (pp. 400–410). Heidelberg, Springer: Berlin.
Li, Z. (2014). Power and sample size calculations for logistic regression tests for differential item functioning. Journal of Educational Measurement, 51(4), 441–462.
Li, Z., Yee, L., Sauerberg, N., Sakson, I., Williams, J. J., & Rafferty, A. N. (2020). Getting too personal (ized): The importance of feature choice in online adaptive algorithms. In Proceedings of EDM. International Educational Data Mining Society (IEDMS).
Lim, E., & Lee, W.-C. (2020). Subscore equating and profile reporting. Applied Measurement in Education, 33(2), 95–112.
Liu, Y., Mandel, T., Brunskill, E., & Popovic, Z. (2014). Trading off scientific knowledge and user learning with multi-armed bandits. In J. C. Stamper, Z. A. Pardos, M. Mavrikis, & B. M. McLaren (Eds.), Proceedings of the 7th International Conference on Educational Data Mining, EDM 2014, London, UK, July 4-7, 2014, pp. 161–168. International Educational Data Mining Society (IEDMS).
MacLellan, C. J., Harpstead, E., Patel, R., & Koedinger, K. R. (2016a). The Apprentice Learner architecture: Closing the loop between learning theory and educational data. In Proceedings of EDM
MacLellan, C. J., Harpstead, E., Patel, R., & Koedinger, K. R. (2016b). The Apprentice Learner architecture: Closing the loop between learning theory and educational data. In Proceedings of EDM
MacLellan, C. J., Koedinger, K. R., & Matsuda, N. (2014). Authoring tutors with simstudent: An evaluation of efficiency and model quality. In S. Trausan-Matu, K. E. Boyer, M. Crosby, & K. Panourgia (Eds.), Intelligent Tutoring Systems (pp. 551–60). Cham. Springer International Publishing.
MacLellan, C. J., Matsuda, N., & Koedinger, K. R. (2013). Toward a reflective SimStudent: Using experience to avoid generalization errors. In Proceedings of AIED workshops, pp. 51
Mahon, J., Bryant, B., Brown, B., & Kim, M. (2010). Using second life to enhance classroom management practice in teacher education. Educational Media International, 47(2), 121–134.
Marcoulides, K. M. (2018). Careful with those priors: A note on bayesian estimation in two-parameter logistic item response theory models. Measurement: Interdisciplinary Research and Perspectives, 16(2):92–99
Martinková, P., Drabinová, A., Liaw, Y.-L., Sanders, E. A., McFarland, J. L., & Price, R. M. (2017). Checking equity: Why differential item functioning analysis should be a routine part of developing conceptual assessments. CBE-Life Sciences Education, 16(2):rm2
Matsuda, N., Cohen, W., Sewall, J., Lacerda, G., & Koedinger, K. (2007). Predicting students’ performance with simstudent: Learning cognitive skills from observation. Frontiers in Artificial Intelligence and Applications, 158, 467–476.
Matsuda, N., Cohen, W. W., & Koedinger, K. R. (2015). Teaching the teacher: tutoring SimStudent leads to more effective cognitive tutor authoring. International Journal of Artificial Intelligence in Education, 25(1), 1–34.
Matsuda, N., Cohen, W. W., Sewall, J., Lacerda, G., & Koedinger, K. R. (2007b). Evaluating a simulated student using real students data for training and testing. In Proceedings of User Modeling, pp. 107–116
Matsuda, N., Weng, W., & Wall, N. (2020). The effect of metacognitive scaffolding for learning by teaching a teachable agent. International Journal of Artificial Intelligence in Education, 30(1), 1–37.
Matsuda, N., Yarzebinski, E., Keiser, V., Raizada, R., Cohen, W. W., Stylianides, G. J., & Koedinger, K. R. (2013). Cognitive anatomy of tutor learning: Lessons learned with simstudent. Journal of Educational Psychology, 105(4), 1152–1163.
Matsuda, N., Yarzebinski, E., Keiser, V., Raizada, R., Stylianides, G. J., Cohen, W. W., & Koedinger, K. R. (2011b). Learning by teaching SimStudent–An initial classroom baseline study comparing with Cognitive Tutor. In Proceedings of AIED, pp. 213–221
Matsuda, N., Yarzebinski, E., Keiser, V., Raizada, R., Stylianides, G. J., & Koedinger, K. R. (2013). Studying the effect of a competitive game show in a learning by teaching environment. International Journal of Artificial Intelligence in Education, 23(1), 1–21.
Matusevych, Y., Alishahi, A., & Backus, A. (2016). Modelling verb selection within argument structure constructions. Language, Cognition and Neuroscience, 31(10), 1215–1244.
McCalla, G. I., & Champaign, J. (2013). Simulated Learners. IEEE Intelligent Systems, 28, 67–71.
McGuigan, M. (2006). Graphics turing test
McHugh, M. L. (2012). Interrater reliability: the kappa statistic. Biochemia medica: Biochemia medica, 22(3), 276–282.
McPherson, R., Tyler-Wood, T., Ellison, A. M., & Peak, P. (2011). Using a computerized classroom simulation to prepare pre-service teachers. Journal of Technology and Teacher Education, 19(1), 93–110.
Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. Psychological methods, 17(3), 437.
Menghini, C., Dehler Zufferey, J., & West, R. (2018). Compiling questions into balanced quizzes about documents. Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM ’18, pp. 1519–1522, New York, NY, USA. Association for Computing Machinery
Miciak, J., Taylor, W. P., Stuebing, K. K., Fletcher, J. M., & Vaughn, S. (2016). Designing intervention studies: Selected populations, range restrictions, and statistical power. Journal of research on educational effectiveness, 9(4), 556–569.
Monroe, S., & Cai, L. (2015). Examining the reliability of student growth percentiles using multidimensional irt. Educational Measurement: Issues and Practice, 34(4), 21–30.
Morris, S. B., Bass, M., Howard, E., & Neapolitan, R. E. (2020). Stopping rules for computer adaptive testing when item banks have nonuniform information. International Journal of Testing, 20(2), 146–168.
Mu, T., Jetten, A., & Brunskill, E. (2020). Towards suggesting actionable interventions for wheel-spinning students. In Proceedings of EDM. International Educational Data Mining Society (IEDMS)
Mussack, D., Flemming, R., Schrater, P., and Cardoso-Leite, P. (2019). Towards discovering problem similarity through deep learning: combining problem features and user behavior. In M. C. Desmarais, C. F. Lynch, A. Merceron, & R. Nkambou (Eds.) Proceedings of the 12th International Conference on Educational Data Mining, EDM 2019, Montréal, Canada, July 2-5, 2019. International Educational Data Mining Society (IEDMS)
Naveh, Y., Rimon, M., Jaeger, I., Katz, Y., Vinov, M., Marcus, E., & Shurek, G. (2006). Constraint-based random stimuli generation for hardware verification. In Proceedings of the 18th Conference on Innovative Applications of Artificial Intelligence - Volume 2, IAAI’06, pp. 1720–1727. AAAI Press
Nazaretsky, T., Hershkovitz, S., & Alexandron, G. (2019b). Kappa learning: A new item-similarity method for clustering educational items from response data. In M. C. Desmarais, C. F. Lynch, A. Merceron, & R. Nkambou (Eds.), Proceedings of the 12th International Conference on Educational Data Mining, EDM 2019, Montréal, Canada, July 2-5, 2019. International Educational Data Mining Society (IEDMS).
Nissen, J., Donatello, R., & Van Dusen, B. (2019). Missing data and bias in physics education research: A case for using multiple imputation. Phys. Rev. Phys. Educ. Res., 15, 020106.
Ogan, A., Yarzebinski, E., De Roock, R., Dumdumaya, C., Banawan, M., & Rodrigo, M. M. (2017). Proficiency and preference using local language with a teachable agent. In E. André, R. Baker, X. Hu, M. M. T. Rodrigo, & B. du Boulay (Eds.), Artificial Intelligence in Education (pp. 548–552). Cham. Springer International Publishing.
Olivera-Aguilar, M., & Millsap, R. E. (2013). Statistical power for a simultaneous test of factorial and predictive invariance. Multivariate Behavioral Research, 48(1), 96–116.
Ozturk, A. O. (2012). A computer-assisted instruction in teaching abstract statistics to public affairs undergraduates. Journal of Political Science Education, 8(3), 251–257.
Page, R. L. (2000). Brief history of flight simulation. SimTecT 2000 Proceedings, pp. 11–17
Palmqvist, L., Kirkegaard, C., Silvervarg, A., Haake, M., & Gulz, A. (2015). The relationship between working memory capacity and students’ behaviour in a teachable agent-based software. In Proceedings of AIED, pp. 670–673
Pan, T., & Yin, Y. (2017). Using the bayes factors to evaluate person fit in the item response theory. Applied Measurement in Education, 30(3), 213–227.
Pardos, Z. A. & Heffernan, N. T. (2010). Navigating the parameter space of Bayesian Knowledge Tracing models: Visualizations of the convergence of the Expectation Maximization algorithm. In Proceedings of EDM, pp. 161–170
Pardos, Z. A., Wang, Q. Y., & Trivedi, S. (2012). The real world significance of performance prediction. In K. Yacef, O. R. Zaíane, A. Hershkovitz, M. Yudelson, & J. C. Stamper (Eds.), Proceedings of the 5th International Conference on Educational Data Mining, Chania, Greece, June 19-21, 2012, pp. 192–195. www.educationaldatamining.org
Pardos, Z. A. & Yudelson, M. V. (2013). Towards moment of learning accuracy. In Proceedings of AIED workshops
Pareto, L. (2014). A teachable agent game engaging primary school children to learn arithmetic concepts and reasoning. International Journal of Artificial Intelligence in Education, 24(3), 251–283.
Park, S., & Ryu, J. (2019). Exploring preservice teachers’ emotional experiences in an immersive virtual teaching simulation through facial expression recognition. International Journal of Human-Computer Interaction, 35(6), 521–533.
Parsons, E., Koedel, C., & Tan, L. (2019). Accounting for student disadvantage in value-added models. Journal of Educational and Behavioral Statistics, 44(2), 144–179.
Patarapichayatham, C., Kamata, A., & Kanjanawasee, S. (2012). Evaluation of model selection strategies for cross-level two-way differential item functioning analysis. Educational and Psychological Measurement, 72(1), 44–51.
Patikorn, T., Selent, D., Heffernan, N. T., Beck, J., & Zou, J. (2017). Using a single model trained across multiple experiments to improve the detection of treatment effects. In X. Hu, T. Barnes, A. Hershkovitz, & L. Paquette (Eds.), Proceedings of the 10th International Conference on Educational Data Mining, EDM 2017, Wuhan, Hubei, China, June 25-28, 2017. International Educational Data Mining Society (IEDMS).
Pavlik Jr, P. I. (2013). Mining the dynamics of student utility and strategy use during vocabulary learning. Journal of Educational Data Mining, 5(1), 39–71.
Pearl, L. S. (2011). When unbiased probabilistic learning is not enough: Acquiring a parametric system of metrical phonology. Language Acquisition, 18(2), 87–120.
Pelánek, R. (2014). Application of time decay functions and the elo system in student modeling. In J. C. Stamper, Z. A. Pardos, M. Mavrikis, & B. M. McLaren (Eds.), Proceedings of the 7th International Conference on Educational Data Mining, EDM 2014, London, UK, July 4-7, 2014, pp. 21–27. International Educational Data Mining Society (IEDMS)
Pelánek, R. (2019). Measuring similarity of educational items: An overview. IEEE Transactions on Learning Technologies
Pelánek, R., Jarusek, P., & Klusácek, M. (2013). Modeling students’ learning and variability of performance in problem solving. In S. K. D’Mello, R. A. Calvo, & A. Olney (Eds.), Proceedings of the 6th International Conference on Educational Data Mining, Memphis, Tennessee, USA, July 6-9, 2013, pp. 256–259. International Educational Data Mining Society
Pelánek, R., & Řihák, J. (2018). Analysis and design of mastery learning criteria. New Review of Hypermedia and Multimedia, 24(3), 133–159.
Pelánek, R. & Řihák, J. (2017). Experimental Analysis of Mastery Learning Criteria. In Proceedings of UMAP, pp. 156–163
Pelánek, R., & Jarušek, P. (2015). Student modeling based on problem solving times. International Journal of Artificial Intelligence in Education, 25(4), 493–519.
Periathiruvadi, S., Tyler-Wood, T., Knezek, G., & Christensen, R. (2012). Simulating students with learning disabilities in virtual classrooms: A validation study. In P. Resta (Ed.), Proceedings of Society for Information Technology & Teacher Education International Conference 2012, pp. 2588–2595, Austin, Texas, USA
Pichette, F., Béland, S., Jolani, S., & Leśniewska, J. (2015). The handling of missing binary data in language research. Studies in Second Language Learning and Teaching, 5, 153–169.
Piech, C., Bumbacher, E., & Davis, R. (2020). Measuring ability-to-learn using parametric learning-gain functions. In Proceedings of EDM
Poitras, E., Doleck, T., Huang, L., Li, S., & Lajoie, S. (2017). Advancing teacher technology education using open-ended learning environments as research and training platforms. Australasian Journal of Educational Technology, 33(3)
Poitras, E. & Fazeli, N. (2016). Using an intelligent web browser for teacher professional development: Preliminary findings from simulated learners. In G. Chamblee, & L. Langub (Eds.), Proceedings of Society for Information Technology & Teacher Education International Conference 2016, pp. 3037–3041
Raborn, A. W., Leite, W. L., & Marcoulides, K. M. (2019). A comparison of automated scale short form selection strategies. In M. C. Desmarais, C. F. Lynch, A. Merceron, & R. Nkambou (Eds.), Proceedings of the 12th International Conference on Educational Data Mining, EDM 2019, Montréal, Canada, July 2-5, 2019. International Educational Data Mining Society (IEDMS).
Rafferty, A., Ying, H., & Williams, J. (2019). Statistical consequences of using multi-armed bandits to conduct adaptive educational experiments. Journal of Educational Data Mining, 11(1), 47–79.
Rhemtulla, M., Jia, F., Wu, W., & Little, T. D. (2014). Planned missing designs to optimize the efficiency of latent growth parameter estimates. International Journal of Behavioral Development, 38(5), 423–434.
Rihák, J. & Pelánek, R. (2017). Measuring similarity of educational items using data on learners’ performance. In X. Hu, T. Barnes, A. Hershkovitz, & L. Paquette (Eds.), Proceedings of the 10th International Conference on Educational Data Mining, EDM 2017, Wuhan, Hubei, China, June 25-28, 2017. International Educational Data Mining Society (IEDMS)
Ritter, S., Harris, T. K., Nixon, T., Dickison, D., Murray, R. C., & Towle, B. (2009). Reducing the knowledge tracing space. In T. Barnes, M. C. Desmarais, C. Romero, & S. Ventura (Eds.), Proceedings of EDM, pp. 151–160
Robinson, K., Jahanian, K., & Reich, J. (2018). Using online practice spaces to investigate challenges in enacting principles of equitable computer science teaching. In Proceedings of the 49th ACM Technical Symposium on Computer Science Education, SIGCSE ’18, pp. 882–887, New York, NY, USA. Association for Computing Machinery.
Rupp, A. A., & van Rijn, P. W. (2018). Gdina and cdm packages in r. Measurement: Interdisciplinary Research and Perspectives, 16(1), 71–77.
Rutkowski, L. (2011). The impact of missing background data on subpopulation estimation. Journal of Educational Measurement, 48(3), 293–312.
Rutkowski, L. (2014). Sensitivity of achievement estimation to conditioning model misclassification. Applied Measurement in Education, 27(2), 115–132.
Sabourin, J. L., Rowe, J. P., Mott, B. W., & Lester, J. C. (2013). Considering alternate futures to classify off-task behavior as emotion self-regulation: A supervised learning approach. Journal of Educational Data Mining, 5(1), 9–38.
Sauro, H. M., Harel, D., Kwiatkowska, M., Shaffer, C. A., Uhrmacher, A. M., Hucka, M., Mendes, P., Stromback, L., & Tyson, J. J. (2006). Challenges for modeling and simulation methods in systems biology. In Proceedings of the 2006 Winter Simulation Conference, pp. 1720–1730
Saygin, A. P., Cicekli, I., & Akman, V. (2000). Turing test: 50 years later. Minds and Machines: Journal for Artificial Intelligence, Philosophy and Cognitive Science, 10(4), 463–518.
Schatschneider, C., Wagner, R. K., Hart, S. A., & Tighe, E. L. (2016). Using simulations to investigate the longitudinal stability of alternative schemes for classifying and identifying children with reading disabilities. Scientific Studies of Reading, 20(1), 34–48.
Schweizer, K., Reiß, S., & Troche, S. (2019). Does the effect of a time limit for testing impair structural investigations by means of confirmatory factor models? Educational and psychological measurement, 79(1), 40–64.
Schwendimann, B. A., Rodriguez-Triana, M. J., Vozniuk, A., Prieto, L. P., Boroujeni, M. S., Holzer, A., Gillet, D., & Dillenbourg, P. (2016). Perceiving learning at a glance: A systematic literature review of learning dashboard research. IEEE Transactions on Learning Technologies, 10(1), 30–41.
Segal, A., Ben David, Y., Williams, J. J., Gal, K., & Shalom, Y. (2018). Combining difficulty ranking with multi-armed bandits to sequence educational content. In C. Penstein Rosé, R. Martínez-Maldonado, H. U. Hoppe, R. Luckin, M. Mavrikis, K. Porayska-Pomsta, B. McLaren, & B. du Boulay (Eds.), Artificial Intelligence in Education, pp. 317–321, Cham. Springer International Publishing
Shimada, A., Mouri, K., Taniguchi, Y., Ogata, H., Taniguchi, R.-i., & Konomi, S. (2019). Optimizing assignment of students to courses based on learning activity analytics. International Educational Data Mining Society
Shimmei, M. & Matsuda, N. (2020). Learning a policy primes quality control: Towards evidence-based automation of learning engineering. In Proceedings of EDM
Shulruf, B., Poole, P., Jones, P., & Wilkinson, T. (2015). The objective borderline method: a probabilistic method for standard setting. Assessment & Evaluation in Higher Education, 40(3), 420–438.
Si, Y., & Reiter, J. P. (2013). Nonparametric bayesian multiple imputation for incomplete categorical variables in large-scale assessment surveys. Journal of Educational and Behavioral Statistics, 38(5), 499–521.
Sjödén, B., Tärning, B., Pareto, L., & Gulz, A. (2011b). Transferring teaching to testing–an unexplored aspect of teachable agents. In Proceedings of AIED, pp. 337–344
Sobolev, B., Harel, D., Vasilakis, C., & Levy, A. (2008). Using the Statecharts paradigm for simulation of patient flow in surgical care. Health Care Management Science, 11(1), 79–86.
Socha, A., & DeMars, C. E. (2013). A note on specifying the guessing parameter in atfind and dimtest. Applied Psychological Measurement, 37(1), 87–92.
Spoon, K., Beemer, J., Whitmer, J. C., Fan, J., Frazee, J. P., Stronach, J., Bohonak, A. J., & Levine, R. A. (2016). Random forests for evaluating pedagogy and informing personalized learning. Journal of Educational Data Mining, 8(2), 20–50.
Stamper, J., & Moore, S. (2019b). Exploring Teachable Humans and Teachable Agents: Human Strategies Versus Agent Policies and the Basis of Expertise. In S. Isotani, E. Millán, A. Ogan, P. Hastings, B. McLaren, & R. Luckin (Eds.), Proceedings of AIED workshops, pp. 269–274
Sterrett, S. G. (2003). Turing’s Two Tests for Intelligence, pp. 79–97. Springer Netherlands, Dordrecht
Su, P.-H., Wu, C.-H., & Lee, L.-S. (2015). A recursive dialogue game for personalized computer-aided pronunciation training. IEEE/ACM Trans. Audio, Speech and Lang. Proc., 23(1), 127–141.
Sünbül, S. Ö. (2018). The impact of different missing data handling methods on dina model. International Journal of Evaluation and Research in Education, 7(1), 77–86.
Sutherland, S., Davidmann, S., Flake, P., & Moorby, P. (2006). Systemverilog for design: A guide to using systemverilog for hardware design and modeling, vol. 2.
Sweet, S. J., & Rupp, A. A. (2012). Using the ecd framework to support evidentiary reasoning in the context of a simulation study for detecting learner differences in epistemic games. Journal of Educational Data Mining, 4(1), 183–223.
Tendeiro, J. N., & Meijer, R. R. (2012). A cusum to detect person misfit: A discussion and some alternatives for existing procedures. Applied Psychological Measurement, 36(5), 420–442.
Thiessen, E. D., & Pavlik, P. I. (2016). Modeling the role of distributional information in children’s use of phonemic contrasts. Journal of Memory and Language, 88, 117–132.
Thompson, W. J., Clark, A. K., & Nash, B. (2019). Measuring the reliability of diagnostic mastery classifications at multiple levels of reporting. Applied Measurement in Education, 32(4), 298–309.
Toland, M. D. (2014). Practical guide to conducting an item response theory analysis. The Journal of Early Adolescence, 34(1), 120–151.
Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(October), 433–60.
van Rijn, P. W., Sinharay, S., Haberman, S. J., & Johnson, M. S. (2016). Assessment of fit of item response theory models used in large-scale educational survey assessments. Large-scale Assessments in Education, 4(1), 10.
VanLehn, K., Ohlsson, S., & Nason, R. (1994). Applications of Simulated Students: An Exploration. International Journal of Artificial Intelligence in Education, 5(2), 135–175.
von Ahn, L., Blum, M., Hopper, N. J., & Langford, J. (2003). Captcha: Using hard ai problems for security. In E. Biham (Ed.), Advances in Cryptology— EUROCRYPT 2003, pp. 294–311, Berlin, Heidelberg. Springer Berlin Heidelberg
Wang, F.-H. (2012). On extracting recommendation knowledge for personalized web-based learning based on ant colony optimization with segmented-goal and meta-control strategies. Expert Systems with Applications, 39(7), 6446–6453.
Wei, H., & Lin, J. (2015). Using out-of-level items in computerized adaptive testing. International Journal of Testing, 15(1), 50–70.
Weitekamp, D., Harpstead, E., & Koedinger, K. R. (2020a). An interaction design for machine teaching to develop ai tutors. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, CHI ’20, pp. 1–11, New York, NY, USA. Association for Computing Machinery
Weitekamp, D., Ye, Z., Rachatasumrit, N., Harpstead, E., & Koedinger, K. (2020b). Investigating differential error types between human and simulated learners. In International Conference on Artificial Intelligence in Education, pp. 586–597
Whalen, A., & Griffiths, T. L. (2017). Adding population structure to models of language evolution by iterated learning. Journal of Mathematical Psychology, 76, 1–6.
Wieman, C. E., Adams, W. K., & Perkins, K. K. (2008). Phet: Simulations that enhance learning. Science, 322(5902), 682–683.
Wray, R. E. (2019). Enhancing simulated students with models of self-regulated learning. In Proceedings of Augmented Cognition, pp. 644–654
Wyse, A. E., & Albano, A. D. (2015). Considering the use of general and modified assessment items in computerized adaptive testing. Applied Measurement in Education, 28(2), 156–167.
Xue, K., Corinne, A., & Leite, W. (2020). Semi-supervised Learning Method for Adjusting Biased Item Difficulty Estimates Caused by Nonignorable Missingness under 2PL-IRT Model. Proceedings of EDM, pp. 715–719
Yang, J. S., & Zheng, X. (2018). Item response data analysis using stata item response theory package. Journal of Educational and Behavioral Statistics, 43(1), 116–129.
Yao, L. (2013). Comparing the performance of five multidimensional cat selection procedures with different stopping rules. Applied Psychological Measurement, 37(1), 3–23.
Yao, L. (2014). Multidimensional cat item selection methods for domain scores and composite scores with item exposure control and content constraints. Journal of Educational Measurement, 51(1), 18–38.
Yarzebinski, E., Dumdumaya, C., Rodrigo, M. M. T., Matsuda, N., & Ogan, A. (2017). Regional cultural differences in how students customize their avatars in technology-enhanced learning. In E. André, R. Baker, X. Hu, M. M. T. Rodrigo, & B. du Boulay (Eds.), Artificial Intelligence in Education (pp. 598–601). Cham. Springer International Publishing.
Yosef, G., Walko, R., Avisar, R., Tatarinov, F., Rotenberg, E., & Yakir, D. (2018). Large-scale semi-arid afforestation can enhance precipitation and carbon sequestration potential. Scientific Reports, 8(1), 996.
Zhang, Z. (2018). Designing cognitively diagnostic assessment for algebraic content knowledge and thinking skills. International Education Studies, 11(2), 106–117.
Acknowledgements
The authors would like to thank Ido Roll for providing valuable feedback on an early version of this paper, and Lucas Ramirez for assisting with data visualizations. GA’s research was generously supported by the Estate of Emile Mimran and by the Maurice and Vivienne Wohl Biology Endowment. TK’s research was substantially funded by the Swiss State Secretariat for Education, Research and Innovation SERI.
Ethics declarations
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Tanja Käser and Giora Alexandron contributed equally to this work.
Appendix
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Käser, T., Alexandron, G. Simulated Learners in Educational Technology: A Systematic Literature Review and a Turing-like Test. Int J Artif Intell Educ 34, 545–585 (2024). https://doi.org/10.1007/s40593-023-00337-2
DOI: https://doi.org/10.1007/s40593-023-00337-2