Abstract
In recent years, rapid advances in computer science, including the growing capabilities of machine learning models such as Large Language Models (LLMs) and the accessibility of large datasets, have driven widespread adoption of AI technology, underscoring the need to design and evaluate these technologies ethically, with attention to their impact on students and teachers. In particular, the rise of Automated Essay Scoring (AES) platforms has made it possible to provide real-time feedback and grades for student essays. Despite the increasing development and use of AES platforms, little research has examined AI explainability and algorithmic transparency and their influence on the usability of these platforms. To address this gap, we conducted a qualitative study of an AI-based essay writing and grading platform, Packback Deep Dives, focusing on the experiences of students and graders. The study aimed to explore the system’s usability as it relates to explainability and transparency and to uncover the resulting implications for users. Participants took part in surveys, semi-structured interviews, and a focus group. The findings reveal several important considerations for evaluating AES systems: the clarity of feedback and explanations, the effectiveness and actionability of that feedback, perceptions and misconceptions of the system, evolving trust in AI judgments, user concerns and fairness perceptions, system efficiency and feedback quality, user interface accessibility and design, and priorities for system enhancement. These key considerations can guide the development of effective essay feedback and grading tools that prioritize explainability and transparency to improve usability.
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hall, E., Seyam, M., Dunlap, D. (2024). Exploring Explainability and Transparency in Automated Essay Scoring Systems: A User-Centered Evaluation. In: Zaphiris, P., Ioannou, A. (eds) Learning and Collaboration Technologies. HCII 2024. Lecture Notes in Computer Science, vol 14724. Springer, Cham. https://doi.org/10.1007/978-3-031-61691-4_18
DOI: https://doi.org/10.1007/978-3-031-61691-4_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-61690-7
Online ISBN: 978-3-031-61691-4