DOI: 10.1145/3613904.3642198

Teaching artificial intelligence in extracurricular contexts through narrative-based learnersourcing

Published: 11 May 2024

Abstract

Collaborative technology provides powerful opportunities to engage young people in active learning experiences that are inclusive, immersive, and personally meaningful. In particular, interactive narratives have proven to be effective scaffolds for learning, and learnersourcing has emerged as a promising student-driven approach to enable personalized education and quality control at-scale. We introduce the first synthesis of these ideas in the context of teaching artificial intelligence (AI), which is now seen as a critical component of 21st-century education. Specifically, we explore the design of a narrative-based learnersourcing platform where engagement is centered around a learner-made choose-your-own-adventure story. In grounding our approach, we draw from pedagogical literature, digital storytelling, and recent work on learnersourcing. We report on our iterative, learner-centered design process as well as our study findings that demonstrate the platform’s positive effects on knowledge gains, interest in AI concepts, and the overall user experience of narrative-based learnersourcing technology.

1 Introduction

As we enter the “age of AI”, it is imperative to foster young people’s AI literacy by enhancing both their knowledge and skills with respect to emerging technology. By building awareness and understanding of AI, people are empowered to capably and responsibly navigate an AI-infused world, including as part of critically appraising such technology [76]. Such competencies also promote one’s ability to use and collaborate with AI in both professional and personal contexts [74]. For example, exposure to computational concepts in high school helps increase later interest in the field, including among students from traditionally underrepresented groups [43, 46, 77]. It also builds students’ readiness to leverage AI in future educational and professional careers [49] in numerous industries that are expected to have high demands for AI competencies that will exceed the workforce supply [19, 102].
Discussion is therefore growing around the need for AI educational resources [74, 122], and various programs are being launched (e.g., the AI Center for Excellence1 and MIT’s RAISE initiative for Responsible AI for Social Empowerment and Education2). However, the overall reach of AI-learning efforts is still rather limited [140]. In practice, it is challenging to integrate AI subject matter into existing K-12 settings for a variety of reasons, including a lack of instructor understanding and comfort level in teaching AI as well as a generally standardized and rigid school curriculum [116].
Extracurricular programs therefore provide a compelling opportunity to engage broad learner populations, especially in contexts where traditional classroom options may be limited or lacking. For example, our research partner, the TUMO Center for Creative Technologies3, is a free extracurricular program for 12–18 year olds that provides interdisciplinary educational opportunities around the globe, such as programs on emerging design, computing, and technology topics. In this paper, we collaborate with TUMO to investigate: How might we effectively teach AI concepts to students through extracurricular opportunities that support a variety of learning setups, diverse learner backgrounds, and variable levels of baseline familiarity with AI? In particular, we are keen to explore strategies that empower distributed extracurricular facilitators to deliver quality learning experiences that otherwise may not be feasible for them to offer on their own.
We specifically pursue an approach based on learnersourcing. “Learnersourcing” was originally coined by Kim [58] to describe a practice of student learning through the collaborative production of shared learning resources. The term learnersourcing is meant to evoke the related process of crowdsourcing, in which a task is split into smaller tasks that are distributed and completed by a pool of crowd workers [56]. Learnersourcing does differ from crowdsourcing in a few key ways, namely its motivation and incentive structure. Whereas crowdsourcing conceptualizes participants as "workers" (typically rewarded through paid compensation), learnersourcing is pedagogically motivated and emphasizes that the learning benefits of generating and engaging with useful content can be intrinsically meaningful [113, 132]. Learnersourcing can lead to formation of a learner community, though this is not a requirement nor always an outcome of a learnersourcing system [113].
To scaffold the learnersourcing process, we focus on narrative-based activities. Narratives have shown promise as a learning vehicle by promoting student interest and engagement [45], including in the context of STEM [107] and more specifically computing education, even in very young learners [29, 30, 87]. Using the mechanics of storytelling, abstract concepts can be explained through familiar and intuitive metaphors [69], which makes the ideas more accessible, promotes self-regulated learning [94], and overall creates an immersive and engaging experience [78, 79].
Our proposed strategy is both significant and novel, specifically in that it addresses a critical need for supporting learner interest and engagement in learnersourcing tasks. The importance of student interest and self-efficacy has been well documented in traditional classroom contexts [3]. Singh et al.’s 2022 review and classification of learnersourcing systems affirms that these psychological ingredients also impact learning outcomes in learnersourcing contexts [113]. In particular, the authors demonstrate that low self-confidence and low interest in tasks are top contributors to a lack of engagement, thereby highlighting that enhancing student motivation is essential for the viability of a learnersourcing system.
As we will discuss, narrative-based learning activities have been shown to address precisely such challenges related to interest and engagement; however, research has not yet explored their application in learnersourcing given the relatively nascent stage of such systems. In this way, our work responds to calls for additional research on learnersourcing, such as from Khosravi et al. who point out that "fundamental work" is still needed for learnersourcing systems to reach their full potential and be ready for large-scale adoption, including to leverage new opportunities for human-AI partnerships in content creation and evaluation [57].
Using a narrative-based learnersourcing approach, we develop a technology platform that enables a virtuous cycle of content creation, consumption, and peer feedback, with learners iteratively moving up the ranks and taking on increasingly more sophisticated roles that help sustain the educational ecosystem. Our formative design phases involved three cohorts of teen learners who helped us refine the narrative-based learnersourcing approach and explore its feasibility and efficacy across common extracurricular scenarios. Building on those insights, we report on our resulting system that instantiates narrative-based learnersourcing as well as our deployment study to evaluate the approach in terms of outcomes related to knowledge gains, attitudinal shifts, and overall user experience.
Specifically, we aim to examine the following research questions:
RQ1: Can a narrative-based learnersourcing platform increase knowledge of AI concepts?
RQ2: Can a narrative-based learnersourcing platform promote interest and engagement with learning AI concepts in an extracurricular environment?
RQ3: What is the user experience of a narrative-based learnersourcing platform?

2 Background and Related Work

In the following subsections, we review foundational pedagogical frameworks for teaching AI literacy, as well as related literature on the power of digital storytelling and learnersourcing as specific vehicles for learning.

2.1 Pedagogical groundings of AI learning

With the increasing significance of AI in both professional environments and daily routines, being knowledgeable about AI has come to be framed in terms of "literacy" [74], as students will need to be literate in AI skills to thrive in a future where such technology is prevalent [129]. Research interest in AI literacy education was limited before 2016, but the number of publications on this topic has recently surged [85]. This rise in attention, as well as the formation of major collaborations such as UNESCO’s Artificial Intelligence and the Futures of Learning Project, mark a recognition that AI technology will increasingly permeate the fabric of everyday life and that the public, including young members of society, require knowledge around fundamental AI concepts.
Various theoretical and pedagogical frameworks have emerged to guide educators in developing and delivering AI literacy curricula. In our research, we build on Ng et al.'s comprehensive review [86]. In particular, Bloom’s taxonomy is used for designing learning objectives, activities, and assessments [14], and Technological, Pedagogical, and Content Knowledge (TPACK) describes how those forms of knowledge enable the successful integration of technology into teaching [47].
Beyond these two AI-specific frameworks, a myriad of other pedagogical approaches further guided us, both prominent theories (e.g., constructivism [6, 41], experiential learning [62, 136], project-based learning [15, 61], problem-based learning [110], applied learning [64], community-based learning [9, 39], and active learning theory [16, 88]) as well as conceptualizations of AI as a 21st-century skill [86, 123, 124] and models that are less commonly known yet still pertinent (e.g., rhizomatic learning theory [17]).

2.1.1 Bloom’s taxonomy adapted to AI literacy.

Bloom’s taxonomy, introduced by Benjamin Bloom in 1956, was intended to help educators design a balanced curriculum, learning objectives, and student assessments according to multiple dimensions in the cognitive, psychomotor, and affective domains [1, 14]. Though often misinterpreted as overly hierarchical, the taxonomy’s "levels" of learning are non-linear and highly integrated [13, 23]. Overall, it is generally viewed as a useful framework for educators seeking to craft a dynamic, deep, and adaptive learning experience by interweaving different types of learning and higher level cognitive skills through a variety of tasks [1].
Recently, in AI literacy in K-16 classrooms [86], Ng et al. proposed narrowing Bloom’s original six levels of cognitive learning (knowledge, comprehension, application, analysis, synthesis, and evaluation) to four AI-specific levels: know and understand AI, use AI, evaluate and apply AI, and AI ethics. Know and understand AI relates to building basic AI knowledge and comprehension. Use AI works to transfer AI concepts from theory to practice. Evaluate and apply AI builds critical thinking skills for analyzing and solving problems using AI, which typically requires synthesizing prior knowledge. Finally, AI ethics evaluates the implications of AI systems in the real world, considering a diversity of perspectives. As with the original taxonomy, the motivation is to ensure AI curricula devote time for student activities that focus on higher-level thinking, such as AI ethics. Further, Ng et al.’s consolidated taxonomy provides useful specificity and practical value (e.g., concrete guidance for supporting problem-based learning) for AI educators.

2.1.2 TPACK adapted to AI literacy.

The technological, pedagogical and content knowledge (TPACK) framework is another highly influential tool for educators planning curricula [60]. Whereas Bloom’s taxonomy focuses on categorizing and structuring educational objectives, activities, and assessments around cognitive processes [1], TPACK focuses on the integration of technological, pedagogical, and content knowledge for effective teaching, emphasizing the complex relationships that exist among these components [12, 60]. Importantly, TPACK incorporates students’ learning context, in particular the context-dependent role of technology [106].
Ng et al. additionally provide an adaptation of TPACK for AI literacy educators [86] that organizes influential literature (e.g., Touretzky et al.’s "Five Big Ideas in AI" [122]) along TPACK’s three dimensions. The adaptation is designed for practitioners, offering example learning activities and guidance when creating interfaces for learning technology [86]. As described in section 3, we therefore structure learner activities and design our user interface based on considerations from these AI-adapted frameworks. This adaptation of TPACK also provides much of our AI literacy curriculum, including educational resources, assessments, and instructional delivery methods. We specifically utilize the AI for All (AI4ALL) Open Learning Curriculum [2], which applies the adapted framework to meet standards from NGSS Engineering4, ISTE5, Common Core ELA/Literacy6, and CSTA7.

2.2 Digital storytelling for education

As the ability to understand and apply AI topics is increasingly seen as an essential skill [97], inclusive forms of technical education are needed to broadly support AI learning and promote educational equity [31, 40]. For instance, creative and playful learning activities can encourage diverse learners’ early interest in AI concepts, as highlighted by recent work on designing inclusive educational technologies for AI learning [34]. Other research has demonstrated opportunities to support AI learning beyond traditional classroom settings, including in informal learning spaces [73], via online tools8, and in family settings [33, 35].
Narrative-based approaches are applicable to all such contexts (e.g., storytelling and other story-based instruction), as stories are a versatile and effective pedagogical tool [65, 119] including when integrated with technology (e.g., digital storytelling) [100, 101, 108], for students from backgrounds that face educational inequities [20, 71], and in the context of computing education specifically [63, 120]. For example, recent research has shown that learners are able to grasp computing concepts and create engaging stories through voice-based interfaces [29] as well as comic-based [117] and multimodal (voice and visual) programming environments [30]. Particularly pertinent work indicates that digital story writing can promote AI literacy, including students’ ability to move beyond levels of knowing and understanding to actually using and applying AI knowledge to solve real-life problems [87].
Storytelling is recognized as one of the earliest methods of teaching, in large part because it serves as an instrument for distilling complex information into more approachable, familiar, and engaging formats [93]. Today, technology further amplifies the immersive nature of stories through interactive audio, video, and images [104]. Digital storytelling has therefore been utilized across educational domains to enhance student motivation, engagement, and ultimate knowledge retention [36, 109, 139].
Narrative creation also enables knowledge sharing [138]. By using story-based approaches, learners can leverage their past personal experiences, thereby creating more active participation in the learning process and enhancing learning outcomes [115]. In this way, digital storytelling is often considered an excellent strategy to promote a culturally inclusive environment, by empowering participation from traditionally marginalized voices [82]. In addition, digital storytelling is often a more intuitive way for students to approach abstract concepts, and it also positively shapes learner attitudes around the educational topic at hand [18].
By emphasizing creative skills while simultaneously supporting systematic learning processes, digital storytelling further facilitates imagination, idea generation, information gathering and organization, active learning, self-expression, and problem-solving [36, 137]. In these ways, digital storytelling is a natural vehicle for acquiring and cultivating other 21st-century skills in addition to AI learning, namely creativity, critical thinking, open-ended problem-solving, communication, and leadership [28, 98, 118].

2.3 Learnersourcing

Learnersourcing is related to crowdsourcing, a technique that engages groups of (typically non-expert) human workers to complete tasks [58]. Crowdsourcing is often done by unpacking tasks into distributable, smaller "micro" tasks that have self-contained context and can be completed by workers in an asynchronous and remote fashion [38]. Crowdsourcing has seen adoption across disciplines [130], including complex and creative tasks like article writing, decision making, and science journalism [59], as part of the Wikipedia project [127], and in open source software efforts [90].
In 2015, Kim and colleagues used the term "learnersourcing" to describe a practice of student learning through collaborative generation of shared learning resources [58]. A key distinction from crowdsourcing is that learnersourcing is focused on promoting learner-centered needs [132], rather than leveraging users as a form of labor to meet task requesters’ needs. That is, learnersourcing users participate by engaging in activities that not only collectively generate teaching content for other users but also impart learning on a personal level [113]. In this model, the crowd of learners both contributes to and benefits from created content, resulting in a virtuous learning cycle [58].
Specifically, learnersourcing has been shown to increase learning gains, not only for individuals who consume crowd-created content but also for those who create it [56, 113]. Recent work has focused on evaluating the quality of such student-generated content [26] as well as leveraging the method for annotation tasks [8] and to overcome experts’ blind spots when developing content for novices [48].
To generate content, learners are prompted to gain an understanding of the subject matter before creating educational artifacts for others, making content creation in a learnersourcing system similar to project-based learning [4]. Artifact creation involves a process of self-explanation, which has been shown to lead to effective learning [11]. Moreover, content creators actively engage with the learning material as they produce these artifacts, which results in better recall [25] and involves cognitive activities associated with higher-level learning according to Bloom’s taxonomy. Additionally, learnersourcing can foster a community where members connect around shared learning goals [113]. Further, such a community can provide mentorship and role models while motivating learners to participate and contribute for the good of the collective [58].
Recent work involving learnersourcing emphasizes the growing possibilities for student, educator, and machine partnerships [57], with ChatGPT emerging as a focal point of student-AI collaboration [7, 72, 81, 92, 114, 121]. Among other considerations, these works explore how ChatGPT can be used to achieve higher quality learner-made content. We leverage these insights by taking a human-AI conversational approach to content creation. In our formative case studies (see section 3.3), we also heard from learners that ChatGPT helped to make content creation more enjoyable.

3 System Description

Tying together these pedagogical frameworks and learnersourcing principles, we begin by offering a conceptual model that applies Ng et al.’s formulation of Bloom’s cognitive taxonomy to the context of narrative-based learnersourcing. Building on this model, we then detail how our platform scaffolds a user’s progression through gamified levels of AI learning. Finally, we instantiate these ideas in a functional narrative-based learnersourcing system co-designed and evaluated with learners from the demographic groups that TUMO typically serves.

3.1 Pedagogically-grounded design strategy

Building on Ng et al.’s formulation of Bloom for AI literacy [86], our approach further adapts that model to satisfy the design requirements of a narrative-based learnersourcing system. Specifically, Ng et al.’s groundwork is nicely compatible with a learnersourcing system given that (a) it divides engagement into discrete and measurable learning activities that can be delivered to users as learnersourcing "tasks" and (b) it provides a widely accepted assessment scheme that we can use to measure learning. To reiterate, the levels in Bloom’s and Ng et al.’s frameworks are integrated rather than overly hierarchical [5], and we take the same approach in our system’s design.
Figure 1: A visualization of important activities in our narrative-based learnersourcing system and how these are associated with different cognitive learning levels. The four activities (know and understand AI, use AI, evaluate and apply AI, AI ethics) are adapted from Ng et al.’s formulation of Bloom’s cognitive taxonomy for AI literacy [86]. The figure’s visual design is based on the revised edition of Bloom’s taxonomy [5]; in particular, we avoid the stacked pyramid depiction, which can inappropriately suggest an overly hierarchical and linear progression through Bloom’s levels [23].
Figure 1 illustrates our pedagogical model, designed around learners completing learnersourcing tasks that require conceptual and technical AI problem-solving. Each task involves multiple choice assessments and can also support peer evaluation. Completing a level requires successfully completing a set of tasks that meet specific educational criteria. As learners progress, tasks become more complex and progressively emphasize higher level cognition.
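To make this task-and-level structure concrete, the sketch below shows one way it could be represented in TypeScript, the language used for scene code on our platform. The type and field names here are illustrative assumptions for exposition rather than the system's actual schema.

```typescript
// Illustrative sketch of the task/level model described above.
// Type and field names are assumptions for exposition, not the platform's real schema.

type CognitiveLevel = "knowAndUnderstandAI" | "useAI" | "evaluateAndApplyAI" | "aiEthics";

interface LearnersourcingTask {
  id: string;
  level: CognitiveLevel;      // which adapted Bloom level the task targets
  description: string;        // e.g., "Extend the taxi-ride scene with a new branch"
  requiresPeerEvaluation: boolean;
  assessmentPassed: boolean;  // set once the task's multiple choice assessment is passed
}

interface LevelCriteria {
  level: CognitiveLevel;
  minTasksCompleted: number;  // the educational criteria a level requires
}

// A level is complete once enough of its tasks have passed their assessments.
function levelCompleted(tasks: LearnersourcingTask[], criteria: LevelCriteria): boolean {
  const passed = tasks.filter((t) => t.level === criteria.level && t.assessmentPassed).length;
  return passed >= criteria.minTasksCompleted;
}
```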
Figure 2: A learner-centric perspective of the narrative-based learnersourcing journey. The diagram is read bottom to top to follow the learner’s journey through the three roles (explorer, architect, facilitator), reading the learnersourcing tasks associated with each role from left to right. System components are shown in relation to the specific learning activities and roles they support. Sub-tasks that emphasize real-world engagement are highlighted with dashed borders.
We design for a creative, self-directed learning experience that leverages repeated exposure to AI concepts to promote deeper knowledge comprehension through ongoing story engagement. As learners cycle through levels, they are initially encouraged to simply experiment with activities related to the higher levels, but progressively they come to spend considerable time on more advanced content. Our top level ("facilitate learnersourcing") adapts Bloom’s original top level ("create") to serve as a learner-enforced check on the real-world relevance of learner-created content across the platform. In this way, our narrative-learnersourcing approach aims to prepare learners for engaging with AI in real life.
Learner levels guide users to reach key curriculum elements by representing that content and associated learnersourcing tasks as components of a game. Upon reaching a higher level, the game objectives change and the learner is rewarded by unlocking access to interfaces that afford increasing rights and roles (e.g., making implementation-level changes to the system), similar to the collaborative model of responsibility found on platforms like Wikipedia.
Figure 3: Story explorer UIs: Login (top left), progress (top right), graph view (bottom left), adventure view (bottom right).

3.2 Learner roles and user journey

As learners (interchangeably called users) level up, they become increasingly responsible for building out the system’s content and capabilities. Thus, learnersourcing tasks and learner levels are mechanisms for structuring engagement to both deepen a user’s learning and support the scalability of the system as a whole. Figure 2 overviews the learner experience, mapping Khosravi et al.’s four core functions in a learnersourcing system (utilize, create, oversight, evaluate) [56] to learner roles. Specifically, each role is defined by a focus on one component of our system: a story adventure component (role = story explorer, focused on "utilize"), a story graph (role = story architect, focused on "create" and "evaluate"), and story infrastructure (role = system facilitator, focused on "oversight"). Appendix section A.2 describes these components further.
Even though the system provides guided sequences of sub-tasks, an individual’s actual learning journey is uniquely personal, steered by narrative preferences, prior technical capabilities, and personal learning objectives. Further, while we expect users to engage in activities they enjoy (e.g., extending a particular plot line in the narrative or implementing mini-game features in code), the system does nudge users to work toward leveling up. For example, after passing level 2, prompts begin to encourage adding a mini-game. The system allows for the involvement of more advanced facilitators at each stage and is designed to promote their engagement and interest so that they continue to provide oversight and contribute to the longevity of the platform.
A learner’s progression from explorer, to architect, to facilitator indicates an increasing demonstration of conceptual, technical, and real-world-applicable AI literacy. The following subsections further describe the various roles and associated learning tasks.

3.2.1 Story explorer role.

As a story explorer, a learner works to know and understand AI by engaging with existing story content.
“Start”: New learner onboarding. A user’s first experience on the platform is opening a web URL and landing on the main splash screen (Figure 3, top left image). After creating an account, the user enters into the onboarding story scene. This scene explains the interface’s controls, and its completion establishes that the user understands and can use the UI. She then transitions to scenes made by other learners, and the story’s narratives begin to unfold.
Interacting with a story scene. The learner receives story content on the web app in the form of text, interactive media, and links. After delivering such content, the app waits for the user to enter free response text or select among hard coded choices to continue the action of the scene. We designed the interface to feel familiar to text messaging UIs like Facebook Messenger or Telegram. Story history is preserved as an infinite scroll. Each scene uses an everyday metaphor to explain a topic in AI (e.g., relating training an AI model to instructing a driver during a taxi ride). These metaphors, as well as the setting and other narrative details that give consistency to the story experience, are generally drawn from a specific real-world place and culture: Yerevan, Armenia. Once the learner has completed a scene, she can revisit it at any time using an interactive graph-like representation of the scenes and their connections.
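As a rough illustration of this interaction model, the following TypeScript sketch shows how a scene's chat-style content and branching choices might be represented. The names are ours and do not reflect the platform's actual code.

```typescript
// Minimal sketch of a chat-style story scene: content is delivered as a sequence
// of messages, after which the scene waits for a free-text response or a selection
// among hard-coded choices. Names and shapes are illustrative only.

type SceneMessage =
  | { kind: "text"; body: string }
  | { kind: "image"; url: string; alt: string }
  | { kind: "link"; url: string; label: string };

interface Choice {
  label: string;        // e.g., "Tell the driver to take the next left"
  nextSceneId: string;  // an edge in the story graph
}

interface StoryScene {
  id: string;
  title: string;                 // e.g., "A taxi ride through Yerevan"
  messages: SceneMessage[];      // shown one after another, like a messaging app
  choices?: Choice[];            // hard-coded branches, if any
  acceptsFreeResponse: boolean;  // otherwise the learner must pick a choice
}
```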
Engaging in guided real-world experiences. As the action in the story scene unfolds, the narrative links these events to a task that requires the learner to engage in activities off-platform, either on the internet or in the physical world. For example, one learning task about training an image classifier requires the user to take photographs in real life.
Figure 4: Story architect UIs: Story creation and editing (top left), code editor (top right), story scene content graph (bottom left), story wiki (bottom right).
Interacting with embedded mini-games. Within the context of a scene, the learner can enter into an embedded mini-game. Mini-games support critical thinking and more active engagement with an AI topic. One example on our platform is a variation of "20 questions" that illustrates how the k-nearest neighbors algorithm classifies a data point’s grouping. The learners who collaborated to contribute this mini-game (in the story architect role, described in section 3.2.2) implemented the underlying algorithm using facilitator-provided pseudocode as a reference, and they integrated a narrative backdrop for the mini-game into the flow of the broader story happening in that scene.
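The classifier behind this mini-game is k-nearest neighbors; a compact reference implementation in TypeScript might look like the sketch below. This is our own illustration of the algorithm, not the facilitator-provided pseudocode or the learners' code.

```typescript
// k-nearest neighbors classification, the algorithm illustrated by the
// "20 questions" style mini-game. Each answered question contributes one
// numeric feature; the query point is labeled by majority vote of its
// k nearest labeled neighbors.

interface LabeledPoint {
  features: number[];
  label: string;
}

function euclidean(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0));
}

function knnClassify(query: number[], data: LabeledPoint[], k: number): string {
  // Sort labeled points by distance to the query and keep the k closest.
  const neighbors = [...data]
    .sort((p, q) => euclidean(query, p.features) - euclidean(query, q.features))
    .slice(0, k);

  // Majority vote over the neighbors' labels.
  const votes = new Map<string, number>();
  for (const n of neighbors) {
    votes.set(n.label, (votes.get(n.label) ?? 0) + 1);
  }
  let best = neighbors[0].label;
  let bestCount = 0;
  for (const [label, count] of votes) {
    if (count > bestCount) {
      best = label;
      bestCount = count;
    }
  }
  return best;
}
```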
Post-scene assessment. At the end of every scene, the learner’s comprehension of the AI topic is tested via a multiple choice assessment embedded in the story content. The assessment questions (see Appendix section A.3) employ the same extended metaphor as the scene. A learner must correctly answer all questions to progress. Learner responses are logged and used to compute scene-level statistics that are accessible to facilitators as part of system monitoring.
Post-scene feedback. Finally, the learner is presented with a feedback form (see Appendix section A.6 for examples) for self-reporting enjoyment and perceived learning during the scene.

3.2.2 Story architect role.

As a story architect, a learner focuses on using AI to create new content as well as evaluating and applying AI to synthesize prior knowledge, solve problems, and build critical thinking skills. Figure 4 illustrates UIs used by story architects.
Choosing a "need ticket" to work on. Borrowing from agile development methods, our system uses ticketing to define, organize, and prioritize learnersourced contributions. Each ticket describes a "need" in the story for a scene contribution. For instance, a particular AI topic may require additional explanation if existing content is lacking. After a user makes a selection from a set of available tickets, the system directs the learner through the process of inserting a new story scene or modifying an existing one, as appropriate.
Completing a story preparation template. To prepare for story writing, the system provides a Creator Guidebook (in the form of a Google Doc, which we created and that facilitators maintain) of other story scenes and external lecture videos, code examples, and quizzes to help the learner sufficiently master the AI topic her contribution will teach. The guidebook walks the learner through a process of “story-sourcing” to identify real-world narratives that can serve as compelling and relatable teaching metaphors. The guidebook also offers storytelling exercises for architecting an interesting and engaging narrative. To create an outline of her story contribution and embedded quiz questions, the learner follows a story scene template provided by the guidebook. If the learner has questions at any point, she can use a dedicated Discord server to get help from other learners on the platform.
Figure 5: Facilitator UIs: Scene feedback (top left), system code (top right), guidebook (bottom left), pull requests (bottom right).
Writing a scene with AI collaboration. The learner next follows a procedure described in the guidebook for converting the completed template into ChatGPT prompts, then reviews, edits, and incorporates the ChatGPT responses into an outline, resulting in a more polished scene script. The learner then integrates the script into the “story wiki," a shared staging environment used to test story transitions and wordsmith. Once the text and media are ready, the learner converts the scene into a TypeScript9 file using a provided wrapper library and guidebook documentation. She then integrates this file (representing the scene) into the server following a linking process outlined in the guidebook. Finally, the learner runs the backend code locally and tests her contribution.
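We do not reproduce the wrapper library here, but a learner-authored scene file produced at this step could plausibly resemble the sketch below, in which the inline types stand in for the library's actual API and the scene content is invented for illustration.

```typescript
// Hypothetical sketch of a learner-authored scene module. The types below are
// stand-ins for the platform's wrapper library; the content is invented.

interface QuizQuestion {
  question: string;
  options: string[];
  correctIndex: number;
}

interface AuthoredScene {
  id: string;
  title: string;
  aiTopic: string;
  messages: string[];
  choices: { label: string; nextSceneId: string }[];
  quiz: QuizQuestion[];
}

const trainingTheTaxiDriver: AuthoredScene = {
  id: "training-the-taxi-driver",
  title: "Training the taxi driver",
  aiTopic: "Weight adjustment",
  messages: [
    "Your driver keeps missing the turn toward the Cascade steps...",
    "Each correction you give is like nudging a weight inside a model.",
  ],
  choices: [{ label: "Give the driver more example routes", nextSceneId: "more-training-data" }],
  quiz: [
    {
      question: "In the taxi metaphor, what does each correction correspond to?",
      options: ["A weight update", "A brand new dataset", "A different passenger", "I don't know"],
      correctIndex: 0,
    },
  ],
};

export default trainingTheTaxiDriver;
```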
Embedding a real-world activity or mini-game. To complete level 3, the learner must either embed a mini-game or design and incorporate an activity that involves real-world engagement. The guidebook walks through this process. Changes are marked with a placeholder in the wiki and then, after transitioning the wiki to code, added directly as TypeScript code. To complete level 4, the learner must educate herself on an ethical issue related to the AI topic she is working on and create a level 3 type contribution that demonstrates and engages with the issue. The guidebook offers direction by pointing to a variety of information sources, including news outlets (e.g., Forbes and New York Times) and works by prominent scholars in the field (e.g., System Error [131]).
Passing a code review and merging changes into production. After testing code locally, the learner creates a pull request. Using a built-in messaging service on the platform, she requests that a higher-level learner review the changes. Once reviewed and accepted, the pull request is merged into the production environment.

3.2.3 System facilitator role.

As a facilitator, a learner focuses on system oversight and gains increasing responsibility in maintaining and scaling the platform. Figure 5 illustrates UIs used by facilitators.
Identifying user experiences and needs on the platform. When acting as a facilitator, the learner monitors the community’s engagement with existing content using the analytics dashboard, which includes scene-level statistics such as completion rates, comprehension assessment scores, engagement scores, and anonymized feedback. The learner also monitors and responds to help requests, bug alerts, architect requests, and general complaints on the Discord server. She additionally engages with the story content in a targeted way, both to hunt for issues (e.g., story gaps in need of content) and to uncover moments of delight [83]. Synthesizing the signals from these different channels, she accumulates her own set of notes describing user needs on the platform.
Preparing learning resources for story architects. Once a facilitator is better aware of issues with story content, she assesses the status of tools and support available to story architects. Specifically, she considers flagged issues in the guidebook, lectures, and platform features, which she addresses by preparing learning resources for architects. The guidebook describes all these procedures.
Synthesizing needfinding results into "need tickets". The learner discusses her needfinding results with other facilitators and helps synthesize findings to compile a set of tickets, each of which defines a need that should be addressed by an architect. The learner incorporates these into the existing set of open tickets.
Adapting learning assessments from established curriculum. If a need exists for resources for creating post-scene assessments, the learner pulls questions from available trusted online teaching materials (e.g., AI4ALL), modifying them as appropriate.
Creating features, fixing bugs, and performing code reviews. Our platform’s codebase is available for all learners to view and modify via pull request. When a facilitator identifies a need for a new feature or a bug fix in the underlying system infrastructure (e.g., the web app, server, database, analytics and tracking, or dashboard), she can create and submit a pull request for review by the codebase owners (in our case, this is the research team, though other administrators are possible, such as extracurricular coordinators or even learners that eventually advance to principal admin status). Facilitators also are responsible for reviewing, commenting on, and accepting or rejecting pull requests.

3.3 Formative design phases

It is important to note that we did not create our learnersourcing system through a top-down development process. Rather, we engaged in multiple rounds of case study workshops to understand learner needs, build out and iteratively test functionality, and generally understand how the system might be utilized by diverse learners in a variety of extracurricular scenarios.
All participants had little to no prior coding experience and were recruited by TUMO, which helped with coordination and provided supporting resources. Here we briefly overview each of these preliminary studies and key design takeaways. Elaborated descriptions are provided in Appendix section A.1.

3.3.1 Case study 1 (C1).

Our first case study focused on exploring learnersourced creation of narratives, specifically to understand how we might effectively provide scaffolds to guide participants with little coding experience through creating narrative-based content that conveys AI concepts. C1 took place as an online workshop with 18 learners located in Yerevan, Armenia who ranged in age from 12–17 years old (9 female, 9 male).
The workshop began with an overview of narrative structure, followed by participants developing stories incorporating AI concepts. Our initial system prototype allowed participants to implement interactive functionality for their narratives. The study revealed that participants benefited from creating narrative-based content in stages and were enthusiastic about using stories for learning. Notably, narratives that blended fiction and personal experiences, especially those featuring local contexts, were perceived as the most interesting. Participants did struggle with generating narratives from scratch, indicating the value of storytelling prompts encouraging learners to draw on everyday experiences.

3.3.2 Case study 2 (C2).

Our second case study focused on supporting collaboration and how emphasizing the integration of personally and culturally significant narrative elements might address learners’ difficulties in generating story ideas. C2 took place as an in-person workshop with 31 learners located in Yerevan, Armenia who ranged in age from 15–19 years old (21 female, 10 male).
Today, Armenia remains in a precarious national position in the wake of the Second Nagorno-Karabakh War. We observed that this point of national identity was a very important backdrop to a storytelling workshop experience for many students. To create space for these unique and meaningful Armenian stories to enter the project, we encouraged participants to seek “stories of joy” as part of the prompts we gave for sourcing stories. Resulting stories ranged from memories of peaceful moments during the war, to a treasured birthday celebration with a grandparent, to recollections of a favorite song. As learners adapted these plot points into a larger narrative, we encouraged them to add details that would make the story feel as culturally rich as they desired. Some added their favorite music, some added photos of the locations they were describing, and some included specific cultural experiences.
Learners then merged their individual stories into a cohesive overarching narrative, which we described initially as a "rhizome" [17] of narrative possibilities. This organic composition of stories resulted in formation of a new setting, "Apricot Stone City", a fictionalized version of Yerevan, the capital city of Armenia. Learners devised this name and appreciated its multifaceted cultural significance. Specifically, the historic buildings of Yerevan are made of a pink volcanic rock called tuff10, which gives the city a general coloring similar to ripe apricots. Apricots are indeed a symbolic fruit to Armenians, deeply connected to folklore and a favorite treat. As an example of this cultural significance, Armenia was represented by the song “Apricot Stone” at Eurovision 2010. Further, inside an apricot, there is a pit (“stone”) that contains a seed that is edible and a favorite of Armenians. Our learners took this as a metaphor for the learning experience, where a sweet and delightful narrative contains hard lessons that eventually give way to a nutritious and fulfilling learning experience.
C2 also examined learner collaboration. To support coordination, we specifically developed project management tooling as needed, such as ticket tracking and the story graph component for visualizing scene connections. Moderating content became a key consideration too, with learners designated or self-selecting to be facilitators who performed such moderation. These learners flagged issues in content (e.g., age-inappropriate, inconsistent, and inaccurate language). Participants appreciated the gradual roll out of structured, systematic processes, and they demonstrated that these collaboration features could be utilized without inhibiting creativity.

3.3.3 Case study 3 (C3).

Our third case study focused on scalability and the potential for learnersourced narratives to be effective and engaging across different cultural contexts. C3 took place as an in-person workshop with 16 learners located in Berlin, Germany who ranged in age from 14–17 years old (8 female, 8 male).
Given the culturally-specific elements of the narratives learnersourced from C1 and C2, C3 explored how this content would be received by a learner population with different demographic characteristics, based in a different geographical location, who potentially may therefore share minimal cultural common ground with the original content creators. Encouragingly, high engagement levels and knowledge gains for C3 participants suggested that users from diverse cultural backgrounds could indeed form a dynamic learning community on our narrative-based learnersourcing platform.

3.4 Final technical implementation

After concluding our preliminary phase of case studies, we made a final batch of technical improvements to the platform to enhance various aspects of the user interface and to instrument the system with logging for our evaluation study. For instance, we improved graph view animations11, improved our custom web-app peer-to-peer messaging service, and integrated IdleTimer12 for detailed engagement tracking [32] through mouse, keystroke, and React DOM monitoring [67]. We performed rounds of internal bug testing and implemented fixes accordingly as part of finalizing the system and moving from the “prototype” to “production” phase of our design process. More fine-grained technical implementation details about the system can be found in Appendix section A.2.
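To give a sense of what this instrumentation entails, the sketch below shows a minimal browser-side idle tracker built directly on DOM events. It is a simplified stand-in for the IdleTimer library (whose API we do not reproduce), and the 60-second threshold is an assumption for illustration.

```typescript
// Minimal idle detection: mouse and key events reset a timer, and engagement
// time is only accumulated while the user is considered active. This is a
// simplified stand-in for the library-based tracking used on the platform.

const IDLE_TIMEOUT_MS = 60_000;  // assume a learner is idle after 60 s without input
let idleHandle: number | undefined;
let activeSince = Date.now();    // start of the current active stretch
let lastEventAt = Date.now();    // time of the most recent user input
let activeMs = 0;                // accumulated active engagement time

function goIdle(): void {
  // Credit engagement only up to the last observed input event.
  activeMs += lastEventAt - activeSince;
  idleHandle = undefined;
  console.log(`Idle; active time so far: ${Math.round(activeMs / 1000)} s`);
}

function onUserActivity(): void {
  lastEventAt = Date.now();
  if (idleHandle === undefined) {
    activeSince = lastEventAt;   // waking from idle starts a new active stretch
  } else {
    window.clearTimeout(idleHandle);
  }
  idleHandle = window.setTimeout(goIdle, IDLE_TIMEOUT_MS);
}

for (const eventName of ["mousemove", "mousedown", "keydown", "scroll"]) {
  window.addEventListener(eventName, onUserActivity, { passive: true });
}
onUserActivity();
```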

4 Evaluation

Having built and refined the platform (named Apricot Stone City for reasons described in section 3.3.2) based on insights from our iterative design process, we next evaluated its ability to promote learning, engagement, and generally positive user experiences.

4.1 Participants and procedures

Participants were recruited through TUMO’s existing channels including email lists as well as through referrals from a former TUMO instructor. We also utilized snowball sampling [84], encouraging people to share our invite with interested acquaintances.
In designing for a target demographic, we focused primarily on high school aged learners, tailoring our AI curriculum and teaching strategies to this age group [86]. However, from the beginning we were sensitive to concerns from TUMO that, while a target demographic is important, their web-based extracurricular programs include popular offerings that are open access. They found that these are typically utilized by high school aged learners as well as (to a lesser extent) middle school aged learners and learners of college age and above. Therefore, to align with our partner’s vision of broadly accessible extracurricular educational offerings and to support inclusive participation among anyone aiming to gain AI literacy, we did not employ age-related exclusion criteria.
Our final sample consisted of N=27 participants representative of the learner community that TUMO serves. 14 participants were high school aged (14–18 years old), 11 were college aged (19–22 years old), and the remaining 2 participants were 23+ years old. All participants had at some point been affiliated with TUMO.
In addition to giving participants access to the Apricot Stone City platform as described in section 3 and associated resources (e.g., videos, user manual), we also created a Discord chat group where learners could interact with each other for peer-based Q&A, collaborative coordination, and general social engagement. The study ran for 1 week, similar to case study 3, as we were interested in exploring the minimum viable timeframe by which positive effects could be observed with respect to knowledge gains, learner attitudes, and sense of community, among other outcome variables described next. The 1-week timeframe is also representative of many of the most popular extracurricular programs offered by TUMO. Procedures were reviewed by the IRB at Dartmouth College.

4.2 Data collection

4.2.1 Exams, quizzes, and other learning assessments.

We measured learning gains through a comprehensive multiple choice AI knowledge exam administered pre-study and post-study. These exams were isomorphic. Within the platform, post-scene AI knowledge assessments also served as a test of understanding for the AI topic covered in that specific scene. The exam and post-scene assessment questions were created based on the learning objectives and “Unpacked” sections of the AI4K12 framework [2]. The Appendix (see A.3) provides questions from exams and post-scene assessments.
We created the exam, while post-scene questions were created by learners in the architect and facilitator roles, using reference materials we provided. During learner assessments and exams, question ordering and multiple choice response ordering were randomized to minimize order effect bias [66, 133]. To discourage random guessing, we included an "I don’t know" option [112] and emphasized that scores would be anonymized and that answering honestly would help us evaluate and improve our system. Pre-study and post-study exams included thirty questions (three questions for each of the ten AI topics covered13). Post-scene assessments had five questions.
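Order randomization of this kind is typically implemented with a Fisher-Yates shuffle; the TypeScript sketch below illustrates the general technique and is not the platform's actual assessment code.

```typescript
// Fisher-Yates shuffle: an unbiased way to randomize question and answer ordering.

function shuffled<T>(items: readonly T[]): T[] {
  const result = [...items];
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Example: present a question's options (including "I don't know") in random order.
const options = ["A weight update", "A new dataset", "A different model", "I don't know"];
const presented = shuffled(options);
```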
Once learners progress to more advanced levels on the platform, they transition from consuming content to creating it, including quizzes. We therefore also assessed those learners’ knowledge based on our expert review of their created content, using a rubric with metrics for the correctness and the helpfulness of content. Specifically, our rubric is based on Denny et al.’s metrics for comparing the quality of teaching resources generated by an LLM with resources created by students as part of a learnersourcing activity [27].

4.2.2 Automatically tracked measures.

The platform logs a variety of data, including timestamps of all interactions, visits to platform pages and clicks on specific content, scene edits, and completion of quests and levels. Specifically, we calculate affective, behavioral, and cognitive markers of learner engagement [52] as follows.
| Group | Mean | Median | Std Dev | Min | Max | S-W statistic | S-W p-value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| All learners | 38% | 37% | 23% | 0% | 97% | 0.96 | .35 |
| All learners excluding outliers | 36% | 43% | 22% | 7% | 73% | 0.98 | .94 |
| All female learners excluding outliers | 38% | 43% | 12% | 7% | 50% | 0.76 | .01 |
| All male learners excluding outliers | 39% | 30% | 18% | 17% | 73% | 0.89 | .13 |
Table 1: Baseline AI knowledge, as measured by the pre-study AI knowledge exam.
Sessions: Session logs track active user engagement in milliseconds and sign in/sign out events. To ensure we are only recording engagement when the user is truly active, we use IdleTimer, which tracks mouse and key events. Session logs also record what component in the UI is being viewed (e.g., "adventure view," "story graph view," etc.)
Interaction with content: We record all story content (text, images) served to the user and all responses by the user.
Content changes: We log all content pushed by users and require users to specify the type of change being made as either a "bug fix" or a "feature." We also survey users to self-report the time spent working on content that they push.
Social interaction: We log all communication on the platform between users, using the on-platform messaging capabilities. We also log requests to the facilitators to fix bugs or review content as well as facilitator replies.
Attentiveness to content: We operationalize attentiveness as the amount of time spent on a given story scene, normalized (divided) by the number of words in that scene (see the sketch after this list).
Scenes and topics completed: A timestamped record of all the scenes and AI topics that a user has completed.
Progress through levels: A timestamped event record of each time a user advances to a new level.
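As referenced in the attentiveness item above, the normalization is simple; a minimal sketch with illustrative names follows.

```typescript
// Attentiveness: active time on a scene normalized by the scene's length in words.

function wordCount(sceneText: string): number {
  return sceneText.trim().split(/\s+/).filter(Boolean).length;
}

function attentiveness(activeMsOnScene: number, sceneText: string): number {
  return activeMsOnScene / Math.max(wordCount(sceneText), 1);  // milliseconds per word
}
```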

4.2.3 Surveys and self-reported data.

At the beginning of the study, before administering the knowledge assessment, we gave participants a questionnaire to gather personal attitudes about AI-related self-efficacy and sense of belongingness in computing [68, 95, 128], readiness to learn [50, 53, 135], and subjective enjoyment of reading and writing stories. We also administered a task-value belief scale, given that the perceived utility of a topic contributes to a learner’s interest in it [24, 70, 95].
A post-survey at the end of the study asked these same questions and also assessed user experience and perceived usability of the platform via the User Engagement Scale (UES-SF) [91] and the extended Unified Theory of Acceptance and Use of Technology (UTAUT2) [125]. We also asked participants to rate the perceived usefulness of specific levels and content in supporting their learning, asked for open-ended reactions to the story content, and asked about sense of belongingness in the Apricot Stone City learning community. Self-assessments delivered at the end of scenes and quests gathered additional self-reported data about self-perceived learning and emotional enjoyment of scenes as well as feedback about what users liked or would change about the scene. The Appendix (A.5, A.6) provides specific questions from these instruments.

4.2.4 Focus group interviews.

At the conclusion of the study, we conducted two focus groups (see our semi-structured interview guide in Appendix section A.7). Given we were interested in understanding the sense of community that participants perceived on the platform, we opted for focus group style interviews rather than individual interviews. Seven participants (3 female, 4 male) volunteered to take part in the interviews.
We explored each of our research questions by inviting participants to describe their perspectives on the favorite things they learned and what aspects of the platform experience facilitated or hampered their learning, how and why their interest and engagement in AI improved or diminished during the study, their likelihood of taking AI or computer science classes in the future, their sense of community on the platform, their reactions around the story-based scaffolding, and factors surrounding their intentions to continue or discontinue using the platform after the conclusion of the study.
Interviews were transcribed and qualitatively analyzed together with open-ended self-report data. Specifically, two authors used inductive coding to surface themes and insights, which we use to contextualize the quantitative results reported in the next section.

5 Results

In this section, we present results from our evaluation of the Apricot Stone City system. Main findings are framed around our research questions. We also describe other notable insights, including gender differences as well as results that speak to our platform’s scalability.

5.1 RQ1: Can a narrative-based learnersourcing platform increase knowledge of AI concepts?

Foremost, we are interested in examining whether a narrative-based learnersourcing approach can promote learning and increase knowledge of AI concepts.

5.1.1 Baseline knowledge levels.

First, we look at learners’ baseline knowledge of the AI topics covered on the platform (see Table 1). A Shapiro-Wilk (S-W) test [103] suggests the data is drawn from a normal distribution. No significant differences are seen in pre-assessment scores related to gender (Cohen’s d = 0.131, p = .775).

5.1.2 Changes in AI knowledge.

Between the pre-study and post-study AI knowledge assessments, we observe a significant mean increase of 24.2% (see Table 2). Regarding outliers, three learners scored exceptionally low on the pre-exam (<3.33%, which is 2 sigma below the mean), and two learners scored exceptionally well (>80.33%, which is 2 sigma above the mean). Highlighting and excluding outliers is useful to understand if aggregate results are skewed by the outcomes of learners who scored near zero or near perfect on the pre-assessment. For consistency, any "excluding outliers" notes (in Tables 1 and 2, as well as later results) refer to this same set of outlier participants.
| Group | Mean pre to post change | Median pre to post change | Cohen's d |
| --- | --- | --- | --- |
| All learners | +24.20% *** | +23.33% *** | 1.08 |
| All learners excluding outliers | +23.18% *** | +22.67% *** | 1.28 |
| All female learners excluding outliers | +29.67% *** | +28.33% *** | 2.21 |
| All male learners excluding outliers | +17.78% | +15.00% | 0.86 |
Table 2: Percentage change in AI knowledge, measured by performance on pre- to post-study isomorphic AI knowledge exams. (*p < .05, **p < .01, ***p < .001).
| Group | Mean | Median | Std Dev | Min | Max | S-W statistic | S-W p-value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| All learners pre exam | 3.30 | 3 | 2.8 | 0 | 10 | 0.92 | .03 |
| All learners post exam | 6.33 | 7 | 2.87 | 1 | 10 | 0.92 | .03 |
| All learners pre exam excluding outliers | 3.18 | 3 | 2.11 | 0 | 8 | 0.96 | .45 |
| All learners post exam excluding outliers | 6.27 | 6.5 | 2.76 | 2 | 10 | 0.91 | .047 |
Table 3: Concept mastery, as measured by AI knowledge exams.
Running analyses without these outliers shows similar results (a significant mean increase of 23.18%). High scoring outliers did improve on average too (mean 91.67% on the post-assessment, compared to a mean of 90% on the pre-assessment). These results suggest that learners with a wide range of prior experience levels could benefit from participating in our learnersourcing platform. Further, we do not observe statistically significant differences when splitting participants by race, nationality, or age, suggesting our platform may support AI learning across diverse users.
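For readers interested in the mechanics of this analysis, the sketch below illustrates the mean change, effect size, and two-standard-deviation outlier rule described above. It uses one common formulation of Cohen's d (mean difference over the pooled standard deviation) and is an illustration of the standard formulas, not our exact analysis script.

```typescript
// Illustrative computation of the pre/post analysis reported above.

function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function stdDev(xs: number[]): number {
  const m = mean(xs);
  return Math.sqrt(xs.reduce((a, x) => a + (x - m) ** 2, 0) / (xs.length - 1));
}

// One common formulation of Cohen's d for pre/post scores of the same group.
function cohensD(pre: number[], post: number[]): number {
  const pooled = Math.sqrt((stdDev(pre) ** 2 + stdDev(post) ** 2) / 2);
  return (mean(post) - mean(pre)) / pooled;
}

// Indices of learners whose pre-exam score lies more than 2 standard deviations
// from the mean, in either direction.
function outlierIndices(pre: number[]): number[] {
  const m = mean(pre);
  const s = stdDev(pre);
  return pre
    .map((score, i) => ({ score, i }))
    .filter(({ score }) => Math.abs(score - m) > 2 * s)
    .map(({ i }) => i);
}
```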
To examine our assumption that content creation is reflective of knowledge gains, we next separate learners into two groups, those who created content and those who only consumed content. T-test comparison does show a significant difference (p = .03, Cohen’s d = 1.07), with the content creation group having a larger positive increase in knowledge (mean: 31%, median: 30%, std dev: 19%). Regarding attitudinal factors that may impact knowledge gains, we observe a significant correlation between pre-study self-efficacy and pre to post knowledge assessment scores (r = 0.56, p = .04).
In addition, we are curious to understand if learners could reliably self-assess their own learning. Comparing objectively tested and self-reported assessments of AI knowledge post-study, we see a significant negative correlation (r = -0.43, p = .045). This result suggests a potential Dunning-Kruger effect, whereby poor performers overestimate their knowledge and high performers underestimate their knowledge [75]. This phenomenon has similarly been observed in work on narrative-centered, game-based learning environments [89]. Such findings indicate the importance of objective, expert-made assessments to reliably gauge learning outcomes versus relying solely on learner self-reports.

5.1.3 Learning specific AI concepts.

To understand how well learners mastered specific concepts, we next group knowledge assessment questions by topic (e.g., Finding patterns in data, Structure of a neural network, etc. for the 10 AI topics the platform covers [2]). Considering a learner to have sufficient mastery of a topic if she could correctly answer at least two of its three assessment questions, we observe a significant increase in concept mastery from the pre to the post assessments (Cohen’s d = 1.09, p < .001) — specifically, a mean increase of 3.04 (median: 3.0, std dev: 2.72), with pre- and post-study concept mastery results shown in Table 3.
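A minimal sketch of this mastery criterion, with illustrative names and shapes, follows.

```typescript
// Concept mastery: a topic counts as mastered when at least two of its three
// assessment questions are answered correctly.

interface AnsweredQuestion {
  topic: string;    // e.g., "Finding patterns in data"
  correct: boolean;
}

function masteredTopics(answers: AnsweredQuestion[]): string[] {
  const correctByTopic = new Map<string, number>();
  for (const a of answers) {
    if (a.correct) {
      correctByTopic.set(a.topic, (correctByTopic.get(a.topic) ?? 0) + 1);
    }
  }
  return [...correctByTopic.entries()]
    .filter(([, correctCount]) => correctCount >= 2)
    .map(([topic]) => topic);
}
```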
We also consider the connection between knowledge gains and progress through Bloom’s levels by conducting an ordered logistic regression between the difference in pre to post exams and the max level completed by a learner. We find a statistically significant relationship with a coefficient of 3.05 and p = .04. This relationship is to be expected, so confirming it adds to our confidence in the correctness of these two independent measures of learning.

5.2 RQ2: Can a narrative-based learnersourcing platform promote interest and engagement with learning AI concepts in an extracurricular environment?

While promoting knowledge gains is a central goal of our work, it is also important to understand learner interest and engagement during the experience, as both are core factors in sustaining participation in learning activities and the mastery of increasingly complex topics over time [51].

5.2.1 Interest in AI concepts.

Learner interest is a four-phase construct spanning triggered situational interest, maintained situational interest, emerging individual interest, and well-developed individual interest [51].
Early and ongoing interest is signaled by time spent on learning activities. On our platform, learners spent a cumulative 674.2 hours engaging with content. Of all available AI subtopics, learners spent the most time on the most advanced topics — specifically, neural networks (e.g., “Adjusting internal representations”, “Weight adjustment,” and “Structure of a neural network") — whereas introductory AI concepts (e.g., “Finding patterns in data”) saw the least amount of time. In post-study focus group interviews, learners said that they preferred to engage with the topics they were least knowledgeable about out of interest and curiosity.
Further, in moving from content consumers to content creators, we see that learners generally returned to scenes they had rated favorably as consumers, extending or creating alternative narrative paths for the scenes and quests that had originally captured their interest. Cumulatively, learners spent the most time creating scenes for quests related to the advanced AI concept “Adjusting internal representations” (115.48 hours).

5.2.2 Engagement with the learning process.

In the context of education, engagement is commonly characterized in terms of three dimensions: affective, behavioral, and cognitive [42].
Baseline factor | Regression coefficient | p-value
Self-efficacy | 1.66 | <.001
Readiness to learn | -1.14 | .046
Pre-study self-assessment of CS and AI topic comprehension | -0.12 | <.001
Table 4: Least squares regression to test how learner baseline factors (self-efficacy, readiness to learn, and a detailed self-assessment of AI comprehension) impact knowledge gains, as measured by pre to post exam scores.
Figure 6: Learners who could successfully make the transition from explorer (level 1) to architect (level 2) tended to then successfully progress through the remainder of the learning journey, eventually becoming facilitators (level 4).
Affective aspects of engagement relate to a learner’s emotional reactions to the learning activities. The scene that received the highest emotional enjoyment rating (mean: 4.95/6, std dev: 0.92) is called “Carol, Ani, and Anoush have dinner together.” In this scene, a “foreign visitor” to Apricot Stone City (whose name is Carol) joins a local woman (Ani) and her elderly mother (Anoush) for dinner. Learners’ feedback consisted largely of positive comments about the cultural content of the scene — for instance: “Story content was really pleasing as it showcased some typical traditional Armenian cultural aspects” (P7) and “That’s a nice change of scene and a great relief after a lot of learning” (P26).
Our main behavioral indicator of engagement is attentiveness to content. Given our attentiveness findings demonstrate gender differences, we report them in section 5.4 with other such results.
Finally, cognitive engagement is operationalized by completion of learning activities. Specifically, Figure 6 illustrates the distribution of max levels reached by participants during the study period. It is important to note that moving from level 1 (story explorer) to level 2 (story architect) entails transitioning from content consumption to creation. The observed drop-off suggests that learners found this challenging. By comparison, the bump at level 4 (system facilitator) suggests that most learners who were able to successfully make the leap from consumers to creators were then motivated and capable enough to "go all the way". Further, many of these learners had little to no prior coding experience, suggesting our system is able to successfully transition novice learners into roles involving progressively greater concept mastery and platform responsibility.
For most participants, this study was their first time using software development tools like GitHub, VSCode, and React. Based on our analysis of interview data, feedback from the post-study survey, and observations of peer-based troubleshooting via the Discord server, we can see that getting up to speed with these new tools was a challenge for many learners and a source of pride for those who succeeded. Without this peer support, learners were at risk of stalling in their progress or even leaving the platform. Such findings highlight future opportunities for needfinding to understand other reasons for drop-off and strategies to minimize it.

5.2.3 Personal attitudes and beliefs of learners.

Self-efficacy relates to a person’s belief in her ability to master a task [68, 95, 128]; and readiness to learn refers to the behavioral, cognitive, and socio-emotional skills that indicate preparedness to receive instruction [80]. Given both are important factors in learning, we analyze how these personal attitudes may relate to the amount of interest and engagement participants demonstrated on our platform, as well as whether the learning experiences afforded by our platform can help to positively shape these attitudes.
We conducted a least squares regression to see how baseline levels of these variables affected knowledge gains, as summarized in Table 4. The significant findings suggest that learners with low attitudinal scores (who might therefore typically struggle in both traditional and online learning contexts [10, 22]) actually tended to learn more on our platform. We suspect that wrapping the AI concepts in approachable, familiar stories makes the learning process more accessible for such learners. Including an additional factor, personal interest in stories, results in a regression coefficient of 5.70 (p = .26). While not statistically significant, the large coefficient hints that narrative-based learnersourcing may be more powerful for learners who are already interested in storytelling.
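For readers unfamiliar with the underlying computation, the sketch below fits a single-predictor least squares model relating one baseline factor to knowledge gains. The actual analysis included multiple baseline factors (see Table 4); this simplified version, with invented values, is for illustration only.

```typescript
// Single-predictor ordinary least squares: slope = cov(x, y) / var(x).
function olsFit(x: number[], y: number[]): { slope: number; intercept: number } {
  const n = x.length;
  const meanX = x.reduce((a, b) => a + b, 0) / n;
  const meanY = y.reduce((a, b) => a + b, 0) / n;
  let sxy = 0, sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (x[i] - meanX) * (y[i] - meanY);
    sxx += (x[i] - meanX) ** 2;
  }
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// Illustrative use: baseline self-efficacy (x) vs. pre-to-post exam gain (y), made-up values.
const fit = olsFit([0.30, 0.45, 0.55, 0.70], [0.10, 0.20, 0.28, 0.40]);
console.log(fit.slope.toFixed(2), fit.intercept.toFixed(2));
```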
In addition, we see evidence that the learning experience on the platform can improve personal attitudes and beliefs over time. Specifically, analyzing changes from pre to post responses on attitudinal measures, we find that self-efficacy increased from a median of 10/30 to 21/30 (a 110% change). Readiness to learn remained stable at a median of 16/30, although we believe this is due to the nature of the questions in the online learning readiness instrument [135] we employed (e.g., "I am good at setting goals and deadlines for myself", "I am relatively good at using the computer", etc.), which may have been less salient or shiftable during this study experience.

5.3 RQ3: What is the user experience of a narrative-based learnersourcing platform?

To understand the user experience (UX) of our platform, we analyzed UX questionnaires, self-reported sense of community, and how participants reacted to narrative-based learning content.
Figure 7: UTAUT2 assessment of user experience based on Performance Expectancy (PE), Effort Expectancy (EE), Social Influence (SI), Facilitating Conditions (FC), Hedonic Motivation (HM), Habit (HA), and Behavioral Intention (BI).

5.3.1 Semi-quantitative UX measures.

To measure how absorbed participants felt when using the system and how rewarding they found those interactions to be, we used questions adapted from the User Engagement Scale Short Form (UES-SF) [91], with scores on a total scale of 3–15. Our analysis indicated a positive user experience on the whole (mean: 11.2, median: 11.0, std dev: 2.5, min: 6, max: 15).
The UTAUT2 [125] is another well-established instrument that helps evaluate additional UX dimensions. Figure 7 illustrates participants’ generally positive responses. We note that the social influence construct demonstrated the highest ratings, suggesting the platform’s ability to promote peer awareness and connection even within the study’s relatively short timeframe.

5.3.2 Sense of community.

Further examining social aspects of user experience, we additionally surveyed participants after the study to measure their sense of community. On a scale of 3–9, results again point to a reasonably strong sense of community (mean: 6.6, median: 7.0, std dev: 1.8, min: 3, max: 9), though not overwhelmingly so. This can likely be attributed to the relatively short study timeframe, and we expect these community ties would deepen over longer periods of engagement on the platform.

5.3.3 Experience with the narratives.

As mentioned in section 5.2.2, reactions to the story content seemed to play a main role in participants’ emotional engagement with the system. Our inductive coding of qualitative data demonstrated that participants consistently felt most confident with AI concepts that were conveyed via analogies built into the story (as reported by P1, P2, P4, P5, P9, P11, P12, P14, P15, P19, P22, and P26). Learners applauded the use of "metaphor," "analogy," "story examples," and "the process of learning new things by the story." Participants rarely discussed AI concepts in their feedback outside of references to the storyline and story explanations. Our questions around the use of narrative as a tool for AI education overwhelmingly received positive responses.
Although not mentioned as consistently, several participants also expressed excitement about the inclusion of Armenian history and culture in the story. This reflected a trend where scenes with Armenian characters, settings, and subject matter perceived as culturally authentic also received high enjoyment and learning scores. That said, Armenian references that wrapped AI content in cultural associations did not always land well with learners. For example, in one scene, a character visits Tsitsernakaberd, the memorial above Yerevan dedicated to the 1.5 million Armenians who perished in the Armenian Genocide. Here, one participant criticized that the location was only superficially discussed, and similarly noted that comparing AI machines to characters with culturally significant roles could sometimes feel dehumanizing — “I didn’t like that Anoush is treated as a neural network” (P19). Both the author of this content and this reader self-identified as Armenian. Other Armenian learners did appreciate the same content though, highlighting how learner-made content covering sensitive cultural topics can be received very differently.
Beyond culturally centered storytelling, scenes that introduced new characters out of the blue or that adopted a distinct genre style (e.g., fan-fiction, horror, etc.) tended to receive poorer reviews overall. For example, a scene titled “Morgana explains a gradient” ventured into fan-fiction based on Knights of the Round Table lore, and it received the lowest emotional enjoyment rating of all platform content (mean: 2.33/6, std dev: 0.94). In post-study feedback from surveys and interviews, the majority of learners disliked these types of scenes because even when they contained “fun elements” (P1) or “relevant AI content” (P9), their tone felt out of place with the broader narrative to which these scenes were contributing — e.g., “It takes away from the beauty of the game with ties to Armenian culture” (P19). Such comments were expressed in similar measure by learners who identified as Armenian and non-Armenian.
At the same time, some learners very much enjoyed creating these styles of scenes and were not discouraged in doing so even if the scenes received little to no traffic on the platform. These findings again indicate the importance of designing narrative learnersourcing experiences that balance community preferences with opportunities for individual creative freedom.

5.4 Examining gender differences in learning, engagement, and experience

Across a number of our quantitative metrics and qualitative feedback, gender seemed to play a role in learning outcomes, engagement levels, and overall user experience.
Figure 8: Female learners had a greater improvement in mean score from pre to post AI knowledge exam.
Metric | Measured by | Female mean | Male mean | Female median | Male median | Female SD | Male SD | p-value (t-test) | Cohen’s d
Baseline CS knowledge | % of questions correct on pre-study knowledge assessment | 36% | 40% | 43% | 30% | 22% | 24% | .62 | 0.20
Baseline self-efficacy | Pre-study self-efficacy assessment, scaled to [0, 1] range | 0.53 | 0.51 | 0.53 | 0.55 | 0.09 | 0.12 | .53 | 0.25
Knowledge gains | Difference in % correct scores on pre to post knowledge assessments | 32% | 17% | 33% | 15% | 21% | 19% | .05 | 0.81
Attentiveness | Time spent per word (sec) | 2.8 | 0.02 | 1.4 | 0.002 | 4.4 | 0.03 | .04 | 0.87
Engagement | Time spent overall (hr) | 26.4 | 23.7 | 11.5 | 4.93 | 40.6 | 49.6 | .88 | 0.06
Table 5: Comparing female and male learners on assessed, self-reported, and logged metrics of learning and engagement.
First, we observed a pattern where female learners appear to have slightly higher learning gains compared to male counterparts. For this reason, we have split results for female and male participants in Tables 1 and 2. This difference is also illustrated by Figure 8. To better understand these trends, we analyze female and male learners in terms of various objectively assessed, self-reported, and automatically logged measures of knowledge and engagement, including to check for any statistically significant differences between these two sets of learners. Table 5 summarizes results.
We saw no significant gender differences in prior knowledge at baseline (as previously noted in section 5.1.1) or in self-efficacy at baseline, and effect sizes for both comparisons were negligible, suggesting that any observed differences in post-study assessments are attributable to experiences on the platform.
Figure 9: Female learners tended to have higher attentiveness scores compared to male learners.
We did observe noticeably higher female attentiveness, as illustrated in Figure 9. This may be due to male participants reading faster than female participants or male participants skimming some story content, although the literature on gender differences in text processing speed suggests the former interpretation is unlikely. Specifically, relevant work has shown that females outperform males in alphabet-related processing speed tasks as well as other reading and writing tasks [105]. It is also worth noting that our measure of engagement (time spent on the platform) shows a less pronounced gap between female and male learners (see Figure 10).
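To clarify the attentiveness metric used here (time spent per word, as listed in Table 5), the following sketch shows one way it could be computed from logged scene visits. The field names are illustrative and do not reflect the platform's actual event schema.

```typescript
// One logged visit to a story scene by a learner (illustrative shape).
interface SceneVisit {
  wordCount: number;       // number of words in the scene's story content
  secondsOnScene: number;  // dwell time recorded for this visit
}

// Attentiveness for one learner: total reading time divided by total words read (sec/word).
function attentiveness(visits: SceneVisit[]): number {
  const totalSeconds = visits.reduce((sum, v) => sum + v.secondsOnScene, 0);
  const totalWords = visits.reduce((sum, v) => sum + v.wordCount, 0);
  return totalWords > 0 ? totalSeconds / totalWords : 0;
}
```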
Based on our qualitative analysis, many male learners did consider the narrative interesting, though non-essential to the learning process. Some even saw it as a distraction. This is not to say these learners did not find the platform meaningful; on the contrary, UES-SF metrics for both genders were roughly comparable (see Figure 11). Rather, we take these findings as indicators that female and male learners used and benefited from the stories in different ways.
Specifically, female learners tended to self-identify with the story’s main and supporting characters (nearly all of whom were also female) and explored AI concepts by focusing on aspects of characterization. On the other hand, male learners did not self-identify much with the characters, but they were nonetheless interested in the narrative and focused more on the provided metaphors to make sense of AI concepts. To help contextualize these general observations, we look at two participants, P3 and P17, as vignettes.
Figure 10: Engagement time on the platform was relatively similar for female and male learners.
Figure 11: The user experience ratings by female and male learners were relatively similar.
P3 identified as female, was 17 years old, and lived in the US. She reported minimal CS experience and no connection to Armenia. She learned about the study through a future college classmate. After joining the study and engaging with the platform, P3 met many other learners who, like herself, were considering a STEM degree. The system’s community aspects were a major draw for her and kept her highly engaged. In her words, “Where I felt most strongly about community was seeing everyone else’s code and thinking, ‘Wow, this was created by multiple people, even if I didn’t meet those people.’"
P3 also noted that learning through stories and then creating stories herself made the learning process much more approachable and enjoyable. She also found pleasure in learning about a culture different from her own: "I like [the cultural and storytelling] aspects a lot. At least from my point of view, you don’t really hear a lot about Armenian culture and inside someone’s view of it. And I love learning a lot about different kinds of cultures.”
These stories also made the AI topics more legible to P3: “Being able to metaphorically make connections to the topics made learning the content much easier. I would be like, ‘I don’t get this,’ but then in the next sentence, I would be relating to something that Carol [a story character] would be talking about – and I’d be like, ‘Okay, I understand it!’ The statues scene was my favorite, when Carol was admiring the statues, pointing out different aspects, and stating what they meant to her, beyond just being ’stone’ or ’marble.’” Further, P3’s perspectives illustrated how narrative learnersourcing had a powerful impact on her attentiveness and self-efficacy, as stories felt gripping and relatable. We found P3’s experience to be representative of many other female participants’ reactions to the platform.
P17 self-identified as male, was 18 years old, lived in Greece, and reported significant prior CS experience from extracurricular courses. At first, he described the stories as more suitable for younger students, calling them “childish” and “a pointless thing to read.” At the same time, P17 very much enjoyed creating new story content for the platform. He found level 4 especially interesting: “Level 4 was interesting because there were the ethical concerns, and it was more for real life situations and problems that I might face. It was a great reminder that we need to think about those issues. They are very important.” P17 said he did not consider the community aspects of the platform to be personally very important to him, though he believed others might find that more meaningful. His perspectives, while skewed slightly more negative, showed similarities to many other male learners who we engaged with throughout the study, illustrating the need for systems like ours to support story creation, exposure to broader AI considerations such as ethics, and other features that resonate more deeply with users like P17.

5.5 Speaking to scalability of the platform

Much of our motivation in developing a learnersourcing platform is this approach’s promise in promoting a virtuous cycle of engagement, contribution, and maintenance that can enable the community to sustain itself over time as it grows to reach and benefit increasingly broad groups of learners. Here we examine important scalability challenges, including the cold start problem (having sufficient content when the platform is relatively new), issues of content quality, and how new users can become aware of the platform through organic, learner-driven referrals.

5.5.1 Tackling the cold start problem through human-AI collaboration.

When considering scalability, it is important to understand the early stages of a platform’s lifecycle, including how much content needs to be pre-seeded when the platform is first activated in order to provide a sufficient initial foundation on which early members can then build. For this study, we seeded the platform with learner-made content sourced through our three preliminary case studies, which are described in section 3.3.
A main difference between our case studies and the final evaluation study was that we did a considerable amount of hand-holding of learners in the case studies (precisely to get around problems like writing stories without prior content to build on), whereas we took a fully hands-off approach in the final study, letting learners act as facilitators for each other. Our experience throughout these phases indicates that creators of new instantiations of a narrative-based learnersourcing platform like Apricot Stone City will need to work with early adopters to draft initial content and/or produce it themselves. Once that ball was rolling, we saw that learners were enthusiastic about keeping the contributions coming.
To explicitly gauge such contributions made to our platform, we calculated the number of changes made by story architects (total number of scene contributions: 104, mean: 8, median: 6, min: 1, max: 20). The average word count per story contribution was 590, and the cumulative word count of all content on the platform was 88,636. These results represent a meaningful amount of work, particularly when considering that all story content was in English yet many of our users were English as an Additional Language (EAL) speakers.
This level of output was supported by learners leveraging ChatGPT as a writing collaborator, which permitted acceptable quality and more consistent content creation by learners who were mostly novices in both writing and AI. In our interviews, we found that learners appreciated this story consistency, while feeling it was important that stories were co-written by a human rather than simply automatically generated by large language models (LLMs) or other forms of generative AI. The use of such tools as part of collaborative human-AI writing activities thus may be a promising avenue to promote scalability and minimize cold start issues, while preserving the essential aspects of user-driven content creation that are inherent to learnersourcing as a pedagogical strategy.

5.5.2 Gauging quality of learnersourced content.

A key issue in crowd-powered systems is ensuring the quality of user-generated content. To assess the quality of learnersourced content on our platform, we performed an expert review of all learner-created story scenes. We used criteria for gauging correctness and helpfulness (both on a scale of 1-5) based on Denny et al.’s evaluation of learnersourced content [27], averaging our independent scores per metric. For content correctness, we found a mean of 3.91, median of 4, and standard deviation of 1.03. For content helpfulness, we found a mean of 3.38, median of 4, and standard deviation of 1.41. These results indicate that a majority of learners were able to produce content on our narrative-based platform that is at least "satisfactory" in terms of teaching quality. Future steps can build on prior research about content quality in crowdwork, learnersourcing, massive open online courses, and other similar systems, in order to design strategies to both monitor and enhance quality level on narrative-based learnersourcing platforms specifically.
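As a minimal sketch of this review procedure (with data shapes of our own rather than the actual review spreadsheet), the code below averages the raters' independent correctness and helpfulness scores per scene and then summarizes across all scenes.

```typescript
// One expert's 1-5 ratings for a single learner-made scene.
interface ExpertRating {
  correctness: number;
  helpfulness: number;
}

// ratingsPerScene[i] holds the independent ratings (one per expert) for scene i.
function summarizeQuality(ratingsPerScene: ExpertRating[][]): { correctness: number; helpfulness: number } {
  const perScene = ratingsPerScene.map((raters) => ({
    correctness: raters.reduce((s, r) => s + r.correctness, 0) / raters.length,
    helpfulness: raters.reduce((s, r) => s + r.helpfulness, 0) / raters.length,
  }));
  const mean = (xs: number[]) => xs.reduce((s, x) => s + x, 0) / xs.length;
  return {
    correctness: mean(perScene.map((r) => r.correctness)),
    helpfulness: mean(perScene.map((r) => r.helpfulness)),
  };
}
```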
Figure 12: Referrals made and accepted by users on our learnersourcing platform.

5.5.3 How learner referrals help scale and sustain the platform.

While we only analyzed data from our study cohort of N=27 recruited participants, we were interested in how much organic growth our platform could generate in just one week through unprompted, learner-driven word-of-mouth. During the course of the study, 105 users joined the platform and stayed active. (We consider users "active" if they successfully create an account and interact with at least one scene a day for three days. A referral is considered "accepted" only if the recipient becomes an active user).
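The sketch below makes these definitions concrete: a user counts as active after scene interactions on at least three distinct days, and a referral counts as accepted only once the referred user becomes active. The data shapes, and the simplification to distinct (rather than consecutive) days, are ours and not the platform's actual schema.

```typescript
// Simplified activity record for one user (illustrative shape only).
interface UserActivity {
  userId: string;
  referredBy?: string;             // userId of the referring user, if any
  sceneInteractionDays: string[];  // ISO dates on which the user interacted with at least one scene
}

// Active: the account exists and scenes were interacted with on at least three distinct days.
const isActive = (u: UserActivity): boolean => new Set(u.sceneInteractionDays).size >= 3;

// Count accepted referrals per referrer (a referral is accepted only if the recipient is active).
function acceptedReferrals(users: UserActivity[]): Map<string, number> {
  const accepted = new Map<string, number>();
  for (const u of users) {
    if (u.referredBy && isActive(u)) {
      accepted.set(u.referredBy, (accepted.get(u.referredBy) ?? 0) + 1);
    }
  }
  return accepted;
}
```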
Most of this userbase discovered our platform through referrals from friends or acquaintances. Figure 12 illustrates the flow of referrals accepted by new users. Each distinct arc on the circle’s circumference denotes one user, with the total circumference representing all 105 users active on the platform. The length of a user’s arc along the circumference represents the number of referrals that user made. Connections between the arcs inside the circle depict which users accepted which referrals.
Analyzing the referrals, we observed that new users tend to join in groups linked to a specific referrer, with two users (P21 and P25) driving over half of all accepted referrals. Identifying and studying these highly motivated referrers would be a valuable next step toward understanding strategies for reproducing this effect. Surprisingly, the main distinguishing factor of our largest "super referrer" was not prior knowledge or experience in computing but rather his enthusiasm and belief in the concept of narrative learnersourcing, which bodes well for the approach's appeal and potential for uptake.

6 Discussion

To promote learning experiences for AI literacy that are inclusive, immersive, and personally meaningful, our research explores a strategy that combines learnersourcing with the intuitive, engaging benefits of narrative-based learning. Specifically, our platform, Apricot Stone City, leverages culturally-specific metaphors to wrap AI concepts in an interactive narrative adventure.

6.1 Contributions and reflections

We followed an iterative, user-centered design approach spanning three formative studies with target users along with an extended deployment of the resulting system. These efforts also involved forming and maintaining a strong collaboration with our extracurricular program partner, TUMO, as part of connecting with learners to understand their needs, preferences, and experiences around novel learning environments.
Our specific contributions include:
An innovative narrative-based learnersourcing approach that uses stories as a vehicle for active, collaborative learning of AI concepts.
Instantiation of this design approach in a usable online platform, with features and functionality informed and refined through multiple rounds of learner-centered design.
Rich findings from a deployment study that demonstrate narrative-based learnersourcing can promote knowledge gains, topic interest, and positive learner experiences, including for novice learners and those from backgrounds traditionally underrepresented in computing and STEM fields.

6.1.1 Community partners and culturally-situated narratives.

Our research was done in partnership with TUMO, which provided supporting resources, far reaching access to study participants, and real-world extracurricular learning contexts. It would be worthwhile for future work to consider testing learnersourcing systems in partnerships with other types of organizations, such as smaller-scale extracurricular efforts, formal educational settings, and community-based organizations that engage with additional age ranges.
Given TUMO’s roots in Armenia and our initial case studies in Armenian contexts, culturally-situated storytelling became an important foundation of our approach to narrative-based learnersourcing. Our design process evolved to center cultural expression as a key aspect of the learning experience, with our formative design phases exploring cultural and cross-cultural factors through in-depth, direct engagement with learners and their communities. These findings align with work showing narratives in online communities can be amplifiers of participation, shared values, fulfillment, and emotional connection [37]. In engaging with such issues, our work spans sociology, pedagogy, and narratology. We hope our research inspires additional interdisciplinary scholarship at the intersections of HCI and broad humanistic and technical domains.

6.1.2 Narrative-based learnersourcing helps build inclusive learning communities at scale.

While this paper focused on the deep and extended human-centered design process to develop a narrative-based learnersourcing system and verify its preliminary effectiveness, an important next step is establishing the approach’s scalability. Our findings do indicate that users are able and motivated to learn from and build on prior learnersourced content.
We also saw promise in the organic growth of our platform’s userbase through word-of-mouth referrals. The emergent behavior we observed in our referral patterns aligns with prior HCI research on "super users" as exceptionally influential [134]. Such trends could also be an early indication of preferential attachment — a property where nodes in a graph acquire new links at a rate proportional to their existing degree. Colloquially called the "rich get richer" effect, this mechanism has contributed to the growth of large social networks and crowdsourcing projects, including Wikipedia [21, 54].
Taking a step back, we reflect on why scalability matters. In our study, we observed that scaling the platform results in not only more content quantity, but also higher content quality as more learners become facilitators. In turn, new and improved content kept learners engaged and motivated their outreach to new generations of users. Similarly, we observed that the sense of community flourished and persisted across the four learner cohorts (the 3 case studies plus the evaluation study) based on post-study interviews as well as on-platform interactions (e.g., helping each other debug code, learn GitHub, or troubleshoot the development environment).
Lastly, by conducting our case studies and final study in different learning settings, we observed that the platform could be effectively employed in a range of contexts. It would be desirable for future work to explore the value of narrative-based learnersourcing in a variety of other non-formal and formal learning environments [55].

6.1.3 Narrative as a tool for empowerment.

In addition to the positive outcomes we saw for learners across our study sample, we observed that narrative-based learnersourcing can be an effective means of promoting sustained engagement, AI mastery, and self-efficacy for young women specifically. Such findings are especially significant given women are traditionally underrepresented in computer science and strategies are needed to enhance girls’ interest and retention in the field [43, 46, 77].
Our work additionally responds to "adult-centrism" issues raised by Prilleltensky et al., who claim that most work interprets the realities of young people from the point of view of an adult, thus depriving young people of power [99]. When considering Prilleltensky’s terminology, our work illustrates how narrative-learnersourcing could empower young people by extending access and control of the "key dimensions of power" — namely, access to valued resources, opportunities for participation and self-determination, and opportunities for the development of competence and self-efficacy.

6.1.4 Entering the age of AI.

Finally, we situate our work in relation to AI literacy education as a movement. Prominent advocates of AI literacy curricula are not only interested in responding to but also shaping the trajectory of technological and social development. This position was articulated by Stefania Giannini, the UNESCO Assistant Director-General for Education, in her 2023 report on generative AI and the future of education:
“In our environment of AI acceleration and uncertainty, we need education systems that help our societies construct ideas about what AI is and should be, what we want to do with it, and where we want to construct guardrails and draw red lines. Too often we only ask how a new technology will change education. A more interesting question is: How will education shape our reception and steer the integration of new technology – both technology that is here today and technology that remains on the horizon?” [44].
We designed our system based on an understanding of the connectedness between these two questions. That is, we position learners as both consumers and creators of learning content, thereby supporting their ability to actively bridge gaps between these roles by employing conceptual and technical critical thinking and by leveraging the connective tissue of narratives.

6.2 Limitations and future work

In acknowledging limitations and associated opportunities for future work, we first note that learnersourcing systems have well-documented content quality challenges, such as inaccurate or unhelpful learner-made resources [96]. Our system’s design choices aimed to promote quality, for instance through learner-driven content evaluations (e.g., post scene ratings) and control mechanisms (e.g., code reviews and bug reports to facilitators). However, evaluating the relative effectiveness of each of these mechanisms was not the primary focus of this paper. Although our expert review indicated that learner-made content was generally correct and helpful (see section 5.5.2), more work is needed to unpack the many design questions related to quality control.
Next, it is worth pointing out that our system’s logging capabilities allowed us to reliably assess many on-platform behaviors; however, some sub-tasks involving content creation (e.g., writing code locally) were more difficult to track. During in-person case studies, our direct observation could overcome this issue, but we had to rely on self-reported data during our final evaluation study. Future development work could address such limitations by implementing more comprehensive tracking support to enhance scientific investigations, while staying sensitive to learner privacy, data security, and best practices around disabling tracking to avoid unnecessary surveillance of users.
Finally, we chose a 1-week period for our evaluation study given that duration reflected the standard length of many of TUMO’s extracurricular programs, plus we wanted to understand whether knowledge, attitudes, and user experiences could be meaningfully shaped in this relatively short timeframe. While we did find promising outcomes along these lines, it is critical to undertake more extended studies to investigate novelty effects, sustained engagement, and ultimate scalability of narrative-based learnersourcing.

7 Conclusion

This paper presented a pedagogically grounded, narrative-based learnersourcing approach to teaching AI, with a focus on extracurricular contexts. Following an iterative, user-centered design process involving three preliminary case studies in multi-cultural settings, we instantiated our approach in a web-based platform named Apricot Stone City. Evaluating our system in a deployment study with N=27 participants, we find that users experienced a significant increase in AI knowledge, showed positive shifts in self-efficacy over the course of the study, and were motivated to engage and re-engage with increasingly advanced AI concepts. Measures of user experience indicated that the narrative elements were a main contributor to meaningful engagement on both cognitive and emotional levels, and participants expressed a strong sense of community on the platform. Wrapping AI concepts into story content also made the learning experience more approachable for many users, particularly female learners and those with little prior experience with computer science. Based on our insights, we offer reflections on the value of community partners and culturally sensitive narratives in creating inclusive learning experiences, and we encourage the HCI community to see narratives as an empowering educational tool as we enter the age of AI.

Acknowledgments

We would like to thank the TUMO staff and our other project collaborators, and we express our deepest appreciation to the learners who participated in this research. We also acknowledge the support of the Dartmouth PhD Innovation Program.

A Appendix

A.1 Preliminary phases of system design

To iteratively design, implement, and gather feedback on our system, we undertook three case studies with learners in a variety of extracurricular programming formats offered by TUMO. Our first case study focused on exploring learnersourced creation of narratives, the second focused on supporting collaborative engagement, and the third focused on scalability and the potential for learnersourced narratives to be effective and engaging across different cultural contexts. Identifiers for case study participants use the format Cx-y (e.g., C1-1, C1-2, C1-3, C2-1, etc.) — see the table in A.8 for each participant’s ID, age, gender, and case study number.

A.1.1 Case study 1 (C1): Creating an initial prototype, developing narrative-based learning content, and exploring learnersourcing-based scaffolds.

With our first case study, we aimed to build out our initial prototype, develop narrative-based learning content, and explore what sorts of scaffolds were needed to help sufficiently guide learners with little to no prior programming or AI knowledge through a successful learnersourcing experience.
C1 was conducted as an online, 3-week workshop in partnership with TUMO, who assisted in recruiting N=18 learners who ranged in age from 12–17 (9 female, 9 male). All participants had little to no prior coding experience. The first author served as a live facilitator, who developed lesson plans about fundamental AI concepts including machine learning (supervised, unsupervised, and reinforcement algorithms), classification and prediction, model training, testing, internal representations, neural networks, feature sets, and bias. For each concept, a short lecture, educational handout, and activity were created. In prior years, the first author taught multiple other workshops on AI with TUMO and drew on this familiarity and past content. The second author assisted in facilitation and drew on her professional experiences with these AI topics.
The workshop began with an overview of narrative structure, storytelling techniques, and choose-your-own-adventure style games, where learners explored how a linear narrative could be translated into a graph structure with multiple different possible endings. Using this notion of a “story graph” to characterize a narrative as a set of nodes (story content) and edges (transitions among these story scenes and plot points), the authors introduced students to concepts of graph structure, first in plain language and then in terms of code.
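To make the story-graph idea concrete in code, the sketch below represents each scene as a node with outgoing choice edges; a single chain of nodes is a linear narrative, while multiple outgoing edges create branching endings. The type names and example content are ours, not the workshop's actual materials.

```typescript
// A story scene (node) with the choices (edges) that lead to other scenes.
interface StoryEdge {
  label: string;     // the choice shown to the reader
  targetId: string;  // id of the scene this choice leads to
}

interface StoryNode {
  id: string;
  text: string;          // the scene's story content
  choices: StoryEdge[];  // an empty list marks an ending
}

// A tiny branching story: one opening scene with two possible continuations.
const storyGraph = new Map<string, StoryNode>([
  ["arrival", {
    id: "arrival",
    text: "A visitor arrives in the city and must decide where to go first.",
    choices: [
      { label: "Visit the market", targetId: "market" },
      { label: "Climb to the fortress", targetId: "fortress" },
    ],
  }],
  ["market", { id: "market", text: "...", choices: [] }],
  ["fortress", { id: "fortress", text: "...", choices: [] }],
]);
```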
The remainder of the workshop focused on participants devising their own stories to convey AI concepts. Specifically, the first author would deliver a lecture on an AI concept and utilize the associated educational handout and activities. Participants were instructed to complete worksheets created by the authors to develop character profiles and map out plot arcs that weaved in the AI concept. This approach was based on Movement Oriented Design, which offers principles for developing educational multimedia narratives that are emotionally engaging and high quality [111]. As an example, after the session on k-means clustering, C1-9 wrote a story where a wizard challenges the main character to sort the items in a massive bag. In another story, a tourist photographs a series of statues during a day of sightseeing and writes their names on the back. To demonstrate concepts related to training data and classification, the story then follows the character as she visits a new part of town with different statues of the same historical figures and tries to guess their names based on her photographs.
Next, participants were tasked with implementing their narrative content as interactive functionality using our initial system prototype. We saw that participants, and particularly those with little prior coding experience, benefited from creating narrative-based content in stages (e.g., graph level summary, then wiki, then code), which resonates with literature on digital storytelling in education [126]. Participants were enthusiastic about the interactive stories they had made and displayed genuine delight with employing stories as a vehicle for learning. Notably, they actively shared the platform with their friends and family out of enthusiasm to show their work to others.
Regarding design implications for the narrative specifically, we found participants perceived the stories that blended fiction and personal experiences as the most interesting. Narratives that featured local contexts, such as places in or people from Yerevan were especially exciting to them. We did observe that participants often struggled to generate story content from scratch, so having prompts that invited them to source stories from everyday experiences was helpful. Further, participants appreciated that the story aspects of the learning experience pushed them in directions where they may have originally lacked self-confidence. That is, we observed that a number of learners who rated themselves at the start of the study as strong technically but weak in terms of storytelling appreciated the story aspects by the end of the study, and vice versa for students originally confident in storytelling but feeling less sure of their technical capabilities. Such findings reinforce ideas that narratives are an accessible, approachable vehicle for learning, including about concepts where learners initially feel intimidated.
After C1, our key open questions related to how learners might effectively collaborate and interact with one another on the platform, and whether it is viable for such a platform to enable not only virtual engagement from distributed learners but also in-person engagement from co-located sets of learners. We therefore undertook our next case study to explore these questions in preparation for our final evaluation.

A.1.2 Case study 2 (C2): Refining designs to provide opportunities for learners to collaborate and interact on the platform as well as engage in-person.

For C2, we recruited N=31 learners who ranged in age from 15–19 (21 female, 10 male). All participants had little to no prior coding experience. While C1 helped confirm that narrative-based learning content can be exciting for learners to utilize and create, it also exposed a need for additional scaffolding to help these learners better collaborate on these efforts. C2 therefore focused on these interpersonal considerations, along with whether our approach could remain inclusive to diverse extracurricular learning setups, particularly those with in-person components. Specifically, C2 was a 3-week workshop held in-person in Yerevan, Armenia and conducted in partnership with TUMO, who assisted in recruiting.
Before starting the workshop, we improved support for collaboration on our prototype through better ticket tracking and documentation in the story graph component (mimicking features and processes from popular Agile development tools like Jira). For instance, to manage synchronization between the wiki and codebase, we added a procedure for ownership tagging to the guidebook.
We found that participants were able to utilize all of these features with negligible confusion, which could be easily resolved by the facilitator or a peer. Further, we observed that learners appreciated having structured, systematic processes to follow, and that these processes did not inhibit their creativity. The need for content moderation did come up several times during the workshop, such as when participants added age-inappropriate language or other content. We approached this issue by designating two volunteers who had reached the role of facilitator as content moderators, charging them with reading through all updates to the wiki at the end of each day and bringing to our attention anything that they deemed problematic. If we requested changes, the moderators would create tickets accordingly; such roles could foreseeably be baked into the platform. In general, we responded to emerging issues like these by building out processes on the platform for responsibilities that learners could take up themselves, rather than by directly intervening, such as by changing content ourselves.

A.1.3 Case study 3 (C3): Investigating the scalability of the platform, particularly across cultural contexts, and whether students can build on each other’s contributions even over shorter engagement timeframes.

For the third case study (C3), we recruited N=16 learners who ranged in age from 14–17 (8 female, 8 male). All had little to no prior coding experience. C3 focused on whether users could in fact build on each other’s contributions to continue expanding and sustaining the platform, including when learners were from different cultural contexts. The fact that C2 took place in Yerevan, Armenia and that all C1 and C2 participants identified as Armenian shaped those workshops and the resulting content. Students centered their Armenian identity within the created narratives, including cultural history and points of Armenian pride.
Given this embedding of cultural identity into learners’ created content, C3 explored scalability to a different learner demographic, from a different geographical region, with potentially minimal cultural common ground. We were particularly motivated to examine this question given expressions of enthusiasm from C1 and C2 participants not only in developing content with cultural significance but also in having that content broadly shared. For example, C2-8 shared that in her favorite part of one story, the plot described Armenians as being known as “warm and hospitable people” and how that made her feel good knowing that “people from other countries will read about it.” She emphasized that this “made the project more exciting.” C2-2 felt strongly that she didn’t “want Armenia to only be associated with Kim Kardashian” but instead she wanted “it to be associated with the history, and the fun stuff too.” A few participants expressed that they were motivated by the cross-cultural possibilities of sharing the project with learners from other countries; C2-29 remarked it could be interesting for those learners who “might not get another opportunity to learn about Armenia.”
Further, while C1 and C2 lasted 3 weeks, we wanted to understand if learning benefits could be observed after a more brief experience on the platform. C3 was therefore run as a 1-week workshop in Berlin, Germany. Through an onboarding survey, we verified there was very little cultural awareness or connection to Armenia for C3 participants, despite the mutual affiliation with TUMO.
This workshop began with a review of the existing prototype. Specifically, after C3 participants engaged with the platform, they wrote what they felt worked well and not well on sticky notes, which we then grouped using affinity diagramming. From resulting clusters, we created a list of aspects to keep as-is, along with suggested additions and other changes, which we then collaboratively ranked by importance. We had conducted the same exercise at the end of C2 (i.e., on the same prototype that C3 participants assessed), and our comparison of the recommendations from the C2 and C3 participants revealed few differences. Both groups focused on improvements related to the desire for style consistency across narratives, more options on the wiki for architecting out story structure, and better ways to integrate teaching lessons into the narrative content. These results were encouraging, as they demonstrated a consistency between the two groups of learners.
Regarding the narrative, we were additionally pleased to see that the local cultural references were well-received by the C3 learners. In interviews, these participants reported that they were excited to learn more about and participate in stories about Armenia, with this content seen as “more intentional” and “less random”. These findings indicate that drawing stories from real-world, localized, and cultural experiences tended to result in more compelling narrative content, even for learners from outside those regions and cultures.
In C3, learners’ baseline exposure to computing was much lower compared to C1 and C2. Most participants in this group had never taken a computer science course, and they found the wiki and its associated processes much more intuitive than working directly in the codebase. That said, within two days, participants were able to contribute to the content and began addressing some of the areas of improvement they had identified in the initial review. Given C3’s condensed timeframe, we observed that learners focused on augmenting existing narrative content (e.g., fixing plot holes, making stylistic improvements, and developing small spin-off stories), demonstrating their ability to build on other users’ contributions and support a cycle of improvement and growth on the platform.

A.2 Technical details of the narrative-based learnersourcing platform

As mentioned, our system consists of three main components: the story adventure, the story graph, and the story infrastructure. Here we provide technical details about these components.

A.2.1 Story adventure component.

The story adventure component provides a web application to enable a broad base of learners to engage with story content. Specifically, the component supports state-dependent concurrent sessions, a chatbot-esque choose-your-own-adventure story experience (where choices can be either free response text or multiple choice), progress tracking, and an animated 3-dimensional graph-based story navigation system. These capabilities are enabled by three web views: an "adventure view", "graph view", and "progress view". A discussion of the backend can be found in A.2.2 on the story graph component.
Adventure view supports an infinite scroll of multimedia, links, and text messages with an interactive display that supports input via multiple choice selections and free text responses. Graph view displays each story scene with a representative picture (e.g., a church, if the scene relates to visiting a church) floating in three-dimensional space (projected using a force-graph calculation). The pictures are connected according to parent-child relationships between the scenes. A learner can navigate the interface with standard, mouse-based zoom, rotate, and translate controls (e.g., similar to Google Earth). The learner can select a scene by clicking on its picture to see an interactive pop-up summary. Using this summary, the learner can return to scenes in adventure view (provided they have already completed the scene). The progress view displays information about the learner’s learning progress. This includes progress toward completing level requirements and statistics related to story scene creation and consumption.

A.2.2 Story graph component.

The story graph component has two objectives: supporting the story adventure and enabling new content creation. The first is accomplished via an API exposed to the story adventure frontend views. Powering this API is a Node.js server and Firebase database that we collectively refer to as “the storyteller.” The storyteller is a finite-state automaton that maintains a graph representation of story content and a history of every learner’s “story state” as determined by his or her story interactions. It uses these, when provided with a learner’s latest interaction, to respond to “scene continuation” API requests. The second objective is a balancing act, as it involves keeping the barrier to content creation low for novice learners while supporting advanced learners’ technical creative expression. This is accomplished by dividing the content creation process into three discrete stages, each involving a progressively lower-level tool: the “graph view create GUI,” the “story wiki,” and the “storyteller content graph.”
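As a minimal sketch of the scene-continuation behavior described above, assuming a simple in-memory graph, advancing a learner's story state might look roughly like this; the types and function names are illustrative rather than the storyteller's actual API.

```typescript
// Minimal scene and per-learner story state, for illustration only.
interface Scene {
  id: string;
  text: string;
  choices: { label: string; targetId: string }[];
}

interface StoryState {
  currentSceneId: string;
  completedSceneIds: string[];
}

// Given the learner's latest interaction, return the next scene and the updated state.
function continueScene(
  graph: Map<string, Scene>,
  state: StoryState,
  interaction: string // multiple-choice label or normalized free-text response
): { scene: Scene; state: StoryState } {
  const current = graph.get(state.currentSceneId);
  if (!current) throw new Error(`Unknown scene: ${state.currentSceneId}`);
  const edge = current.choices.find((c) => c.label === interaction);
  if (!edge) return { scene: current, state }; // unrecognized input: re-present the current scene
  const next = graph.get(edge.targetId);
  if (!next) throw new Error(`Edge points to a missing scene: ${edge.targetId}`);
  return {
    scene: next,
    state: { currentSceneId: next.id, completedSceneIds: [...state.completedSceneIds, current.id] },
  };
}
```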
The "graph view creates GUI" is where learners initiate the process of creating or modifying a scene. This entry point is accessible to learners who have just completed level 1 because it shares the same web interface (as an unlockable “architect mode”). The GUI flow asks users to fill out a template describing the change they are making, then directs them to the guidebook (discussed next in A.2.3) for guidance on creating content and using the story wiki and the storyteller content graph.
The story wiki is a collection of Google Docs, where each doc corresponds to a particular scene and is used for drafting and testing story content. Each doc is drafted in two steps. First, a scene is drafted at a conceptual level, using scene and character archetype templates. This is provided to ChatGPT as a contextual pre-prompt, using a process outlined in the guidebook. The learner is then guided through fleshing out the template into a non-linear story script using exchanges with ChatGPT. Steps in prompt formulation are structured (e.g., the GPT-3 and GPT-4 models both included AI4ALL and "5 Big Ideas in AI" in their training data, so targeted references to this content by name generally result in reasonably well-crafted responses), though critical thinking is still required to check facts and keep narratives consistent. As story content is created, it is represented as nodes (using bullet points), with further-indented multiple choice options and rules-based free response interactions representing the edges as hyperlinks. Internal links to a particular node on a page are created with header refs. More complex interactions, such as updates to state variables, mini-games, or real-world activities, are simply described in plain text.
The storyteller content graph is where content is implemented as code. Once a content change has been staged on the wiki, a first-time architect is directed to clone the storyteller Node.js server from the GitHub repository and run it locally to test her changes. This is facilitated by the guidebook (via a video tutorial). While popular closed- and open-source options exist for finite-state automata, we chose to code the server from scratch to meet the particular integration demands and desired capabilities of our system. Specifically, our codebase compartmentalizes functionality into three modules: 1) the core state-based automata functionality, 2) state management, and 3) story content. The complete abstraction of this complexity via library functions designed as wrappers for our wiki content was the key to helping learners with little to no prior coding experience write code. Using these custom library functions, moving content from the wiki to the codebase became simple. We repeatedly observed that setting up VSCode was often the most challenging aspect of architect onboarding for complete novices. In particular, TypeScript types helped learners identify and debug errors in their content formatting pre-compilation. A custom set of pre-compilation tests we wrote (e.g., to ensure that every edge actually links to an existing node) was also useful for learners. For novice learners advancing to levels 3 and 4, our library provided simple approaches to the requirements of mini-game embedding and real-world engagement. For more advanced learners, representing story content as data objects wrapped in code (rather than as data in Firebase) had an advantage in allowing exposure to complexity on demand. For instance, the library enabled more advanced learners to access the full functionality of our finite-state automaton (e.g., to run NLP APIs on free text input using user state variables as a reference) using native TypeScript with full type support.
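To give a flavor of such pre-compilation checks, the sketch below (with simplified data shapes of our own, not the actual library's) verifies that every edge in a content graph links to an existing node.

```typescript
// Simplified content graph: each scene lists the ids of the scenes its choices lead to.
type ContentGraph = Record<string, { choiceTargets: string[] }>;

// Return a human-readable error for every edge that points to a scene that does not exist.
function findDanglingEdges(graph: ContentGraph): string[] {
  const errors: string[] = [];
  for (const [sceneId, scene] of Object.entries(graph)) {
    for (const target of scene.choiceTargets) {
      if (!(target in graph)) {
        errors.push(`Scene "${sceneId}" links to missing scene "${target}"`);
      }
    }
  }
  return errors; // an empty array means every edge links to an existing node
}
```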

A.2.3 Story infrastructure component.

The story infrastructure component is the underlying codebase and learner-driven design and maintenance processes that enable architects to create content that meets the needs of explorers. This component’s primary features are the frontend codebase, storyteller state management and core functionality, analytics view, and guidebook. We focus here on the analytics view and guidebook.
Analytics view is a page in the web app that becomes accessible once a learner completes level 4. It contains a dashboard with summary statistics about the community’s interaction activity, feedback, and quiz and assessment results on each story scene. Though we did not actually enable it for learners for privacy reasons, we do support tracking learners’ low-level actions. Our dashboard can display this data in aggregate assessments and time series views. To log events, we used popular Node packages like IdleTimer.
The guidebook is a Google Doc and accompanying lecture series we created involving six videos. The guide is organized into sections describing architect and facilitator roles in terms of objectives, processes, external resources, and interfaces that should be utilized to meet these goals. Any learner can suggest edits to add content to the guidebook or correct errors.

A.3 Example knowledge assessment questions

The following multiple choice questions are provided as examples of our isomorphic pre- and post-study knowledge assessments. The correct answer to each question is bolded and listed first.
Which algorithm/model from the following list would most likely be appropriate for separating a dataset into 5 clusters?
- K-means
- Decision tree classification
- Support vector machine
- Naive Bayes classification
- I don’t know
Which of the following is most suitable for classifying emails as either spam or not spam? (Assume that you already have a dataset of emails, with some that you know were marked by users as "spam".)
- Supervised learning
- Unsupervised learning
- Reinforcement learning
- None of the above
- I don’t know
What is the purpose of k-fold cross-validation?
- To assess performance of the model on unseen data
- To improve the quality of training data
- To label training data for supervised learning
- To mix training data with test data to improve performance
- I don’t know
Which of the following statements best describes overfitting in the context of machine learning?
- Overfitting occurs when a model performs well on the training data but fails to generalize to new, unseen data
- Overfitting is the result of a model being too simple, leading to poor performance on both training and test data
- Overfitting is a desirable outcome, as it indicates that a model has learned the underlying patterns in the data effectively
- Overfitting is a term used to describe the process of training a model on a diverse set of data to improve its overall performance
- I don’t know
How might data visualizations help in identifying potential biases in training data? (Select the most relevant response)
- By highlighting imbalances based on demographic variables
- By representing data in visually appealing ways
- By providing insights into feature correlations
- By visualizing training time vs. model accuracy
- I don’t know

A.4 Example learner-made quiz questions

The following multiple choice questions are provided as examples of post-scene quizzes made by learners on the platform. The correct answer to each question is bolded and listed first.
In the "practice giving a speech" example, Lia was like a reinforcement learning model and she gave a great speech - how did she learn to get better over time?
- By doing a smart form of trial and error and seeing what works best for her, then doing more of that thing
- By calculating patterns in data
- Through the analysis of labeled training data
- By following her teacher’s instructions on how to give a great speech
- I don’t know
In the cloud gazing scene, cloud formations represent the concept of supervised learning - why is that?
- Carol imagines that each cloud shape has an ideal label, like a bunny or a cat, and that with some examples she could train her friend to guess the shapes she sees
- Using a set of complicated logical rules in code, each cloud can be classified as a type of animal
- When she looks at the clouds they resemble a child learning through trial and error
- The interconnectedness of the clouds resembles a neural network
- I have no idea
You looked at metaphors for each neural network architecture - which one excels at processing images and extracting features?
- 2D convolutional network
- Recurrent network
- Generative adversarial network
- Feed-forward network
- I have no idea
Remember Carol’s interaction with the taxi driver? What part of training an AI model did her conversation and observations represent? (You’ll see small hints in the scene - go back if you need help)
- Model evaluation
- Feature selection
- Data preprocessing
- Model training
- Idk, take me back to the scene!

A.5 Pre- and post-study survey questions

This section provides questions from our pre- and post-study surveys. Options are bold, with information about the metric in italics.

A.5.1 Questions asked pre- and post-study.

Academic engagement and self-efficacy
Select how much you agree or disagree with the following statements. [Strongly disagree, Disagree, Neutral, Agree, Strongly agree] (Likert scale, range: [1, 5])
- I am motivated to learn about AI.
- If I wanted to, I could potentially do very well in computer science.
- I try to make connections between what I learn in different classes and experiences.
- I put a lot of effort into the work I do.
- Even when things are tough, I can perform quite well.
Self-efficacy
Select how much you agree or disagree with the following statements. [Strongly disagree, Disagree, Neutral, Agree, Strongly agree] (Likert scale, range: [1, 5])
- I’m confident I can understand the basic AI concepts taught in this study.
- I’m confident I can understand the most complex AI material presented in this study.
- I believe I can do an excellent job on the AI-related activities and evaluations in this study.
- I’m certain I can master the AI skills being taught in this study.
Task-value
Select how much you agree or disagree with the following statements. [Strongly disagree, Disagree, Neutral, Agree, Strongly agree] (Likert scale, range: [1, 5])
- I think I will be able to use what I learn in this study in other classes.
- It is important for me to learn the material in this study.
- I am very interested in the topics of this study.
- I think the material in this study is useful for me to learn.
- I like the subject matter of this study.
- Understanding the subject matter of this study is very important to me.
Interest value, utility value, attainment value
Select how much you agree or disagree with the following statements. [Strongly disagree, Disagree, Neutral, Agree, Strongly agree] (Likert scale, range: [1, 5])
- I like AI.
- AI is exciting to me.
- I am fascinated by AI.
- AI concepts are valuable to learn.
- Being good at AI will be important when I get a job or go to college.
- Being someone who is good at computer science is important to me.
Belongingness in CS, sense of social academic fit
Select how much you agree or disagree with the following statements. [Strongly disagree, Disagree, Neutral, Agree, Strongly agree] (Likert scale, range: [1, 5])
- I feel a sense of belonging to the AI and computer science community.
- I feel comfortable in computer science.
- I feel like an outsider in computer science.
- I identify as a computer scientist.
Readiness to learn
Select how much you agree or disagree with the following statements. [Strongly disagree, Disagree, Neutral, Agree, Strongly agree] (Likert scale, range: [1, 5])
- I am good at setting goals and deadlines for myself.
- I finish things I start.
- I do not quit just because things get difficult.
- I am relatively good at using the computer.
Storytelling enjoyment
Select the response you most agree with. [Not at all, A little, Quite a bit, Very much] (Likert scale, range: [1, 4])
- How much do you enjoy reading stories in general?
- How much would you enjoy reading stories as part of the study?
- How much do you enjoy writing stories in general?
- How much would you enjoy writing stories as part of the study?
- What is your favorite book? [Free response]
Prior experiences with CS, AI, and storytelling
Select the response you most agree with. [Yes, No]
- I have participated in a computing extracurricular activity before.
- I have taken a computer programming or AI class before.
- I have participated in a storytelling-related extracurricular activity before.
- I have taken a class before that involved reading and/or writing stories.
Self-reported knowledge of AI topics covered on the platform
Reviewing the following statements, rate your familiarity with the concepts discussed. [I have written code that relates to this, I could explain this concept to a friend, I am somewhat familiar with this concept, I could guess what this means, I don’t know what this means] (Likert scale, range: [1, 5])
- Define supervised, unsupervised, and reinforcement learning algorithms, and give examples of human learning that are similar to each algorithm.
- Model how machine learning constructs a reasoner for classification or prediction by adjusting the reasoner’s parameters (its internal representations).
- Use either a supervised or unsupervised learning algorithm to train a model on real-world data, then evaluate the results.
- Illustrate what happens during each of the steps required when using machine learning to construct a classifier or predictor.
- Describe how various types of machine learning algorithms learn by adjusting their internal representations.
- Select the appropriate type of machine learning algorithm (supervised, unsupervised, or reinforcement learning) to solve a reasoning problem.
- Train a multi-layer neural network using the backpropagation learning algorithm and describe how the weights of the neurons and the outputs of the hidden units change as a result of learning.
- Compare two real-world datasets in terms of the features they comprise and how those features are encoded.
- Evaluate a dataset used to train a real AI system by considering the size of the dataset, the way that the data were acquired and labeled, the storage required, and the estimated time to produce the dataset.
- Investigate imbalances in training data in terms of gender, age, ethnicity, or other demographic variables that could result in a biased model, by using a data visualization tool.
- How to use variables in a programming language
- What JavaScript is
- The difference between TypeScript and JavaScript
- Using conditionals, like “if”, in code
- Using loops, like “while” and “for”, in code
- Using a map/dictionary data structure in code
- Representing a graph using a data structure in code
- Using git to push and pull commits
- Using git to resolve a merge conflict
- Using Node.js
Peer comparison
- Compared to your peers, how would you rate your knowledge of computer science? [Much less knowledgeable, Somewhat less knowledgeable, Average, Somewhat more knowledgeable, Very knowledgeable] (Likert scale, range: [1, 5])
Demographics
- How many years old are you? [Number drop down]
- What is your gender? [Female, Male, Non-binary, Free response]
- What is your nationality? [Free response]
- What is your racial/ethnic identity? [Free response]
- Do you identify as Armenian? [No, Yes, Free response]
- Who (if anyone) referred you to this opportunity? [Free response]

A.5.2 Questions asked only post-study.

Reflections on the study
- How important was it to you that the story focused on Armenia? Please explain. [Free response]
- Did it matter that the story was/wasn’t relevant to your personal experiences? Please explain. [Free response]
- What originally motivated you to sign up for this study? What kept you motivated to keep participating? Please elaborate. [Free response]
- I would choose to participate in an AI learning community again. [Yes, No]
- I feel that I am a member of the Apricot Stone City learning community. [Yes, No]
- The Apricot Stone City learning community is supportive of me. [Yes, No]
- Did you feel like there was a sense of community during the study? [Yes, No] Please explain. [Free response]
User Engagement Scale Short Form (UES-SF) [91] scoped to questions on focused attention and reward factors. [Strongly disagree, Disagree, Neutral, Agree, Strongly agree] (Likert scale, range: [1, 5])
- I was absorbed in this experience.
- My experience was rewarding.
- I felt interested in this experience.
Extended Unified Theory of Acceptance and Use of Technology (UTAUT2) [125] scoped to questions on performance expectancy, effort expectancy, social influence, facilitating conditions, hedonic motivation, habit, and behavioral intention. [Strongly disagree, Disagree, Neutral, Agree, Strongly agree] (Likert scale, range: [1, 5])
- This experience helped me learn.
- Learning how to use the Apricot Stone City platform was easy for me.
- People who are important to me would want me to engage with the platform.
- I have the resources necessary to use the platform.
- Using the platform was enjoyable.
- Using the platform would become a habit for me.
- I would intend to continue using the platform in the future if it’s available.
Self-reported learning per level
Please rate the following activities in terms of how much you feel you learned by engaging with them. (If you didn’t engage with a level, mark N/A.)
[I didn’t find this at all valuable for learning, It was a little valuable for learning, It was very valuable for learning, N/A (I didn’t engage with this)] (Likert scale, range: [1, 3] or N/A value)
- Level 1 (using existing content on the platform)
- Level 2 (creating content)
- Level 3 (coding mini-games)
- Level 4 (bringing the concepts into the real world, considering AI ethics)
- After level 4 (supporting other content creators)
- How far did you progress on the site (e.g., what level)? What helped you get that far? What prevented you from getting farther? Please explain. [Free response]

A.6 Post-scene feedback questions

A learner receives the following questions to provide feedback after completing a story scene.
Enjoyment and self-reported learning
- How enjoyable was this scene? [Animated smiley emoticon Likert scale]14
- How would you rate your learning for this scene? [Animated smiley emoticon Likert scale]
- What did you like about this scene? [Free response]
- What would you change about the scene? [Free response]

A.7 Focus group interview guide

Our semi-structured focus group interviews used the following questions as a guide, centered around our research questions.
Questions related to RQ1
- Tell me about a time when you felt like you were learning something new and interesting on the platform.
- What were some of the favorite new things you learned this past week? It could be AI concepts, or how to use some aspect of GitHub – or anything else – whatever you enjoyed learning!
- Now, how about a time you felt confused or found something hard to understand?
- How did this experience compare to regular school and learning in a classroom with a teacher?
Questions related to RQ2
- How enthusiastic were you about AI / computer science before this experience? Has that interest gone up or down?
- Do you think you’ll take more AI or computer science classes in school or after-school programs if you have the chance? Why or why not?
Questions related to RQ3
- Tell me about some of the things you liked the most about the platform/your experience?
- Anything you really disliked or would want to change?
- How did you feel about the stories on Apricot Stone City?
- What was your favorite scene? Why?
- Were there any stories you really disliked? How come?
- Did you get to the point of writing story content? How was that experience?
- Did you feel like you were part of a community on Apricot Stone City? Tell me about that.
- How much did you interact with other learners on the platform?
- What encouraged you or discouraged you from interacting with others?
- Did you feel ownership of the platform content?
- How did the bugs in the platform content make you feel? (e.g., empowered to fix them? / did you notice them?)
Questions related to scalability
- If you had one more month to continue working on this platform, what would you like to see happen?
- Do you imagine you’ll keep using the platform once the official study ends? Why or why not? It’s OK if the answer is no – we want to understand your honest reactions!
- Would you refer the platform to a friend?
- Could you see the platform being used in a classroom context? At home?

A.8 Participant information table

Table 6:
Participant ID | Age | Gender | Group | Participant ID | Age | Gender | Group
C1-1 | 13 | F | C1 | C2-29 | 16 | M | C2
C1-2 | 14 | F | C1 | C2-30 | 17 | M | C2
C1-3 | 15 | F | C1 | C2-31 | 18 | M | C2
C1-4 | 15 | F | C1 | C3-1 | 15 | F | C3
C1-5 | 16 | F | C1 | C3-2 | 16 | F | C3
C1-6 | 16 | F | C1 | C3-3 | 16 | F | C3
C1-7 | 16 | F | C1 | C3-4 | 16 | F | C3
C1-8 | 16 | F | C1 | C3-5 | 16 | F | C3
C1-9 | 17 | F | C1 | C3-6 | 16 | F | C3
C1-10 | 12 | M | C1 | C3-7 | 17 | F | C3
C1-11 | 14 | M | C1 | C3-8 | 17 | F | C3
C1-12 | 14 | M | C1 | C3-9 | 14 | M | C3
C1-13 | 14 | M | C1 | C3-10 | 15 | M | C3
C1-14 | 16 | M | C1 | C3-11 | 15 | M | C3
C1-15 | 16 | M | C1 | C3-12 | 16 | M | C3
C1-16 | 16 | M | C1 | C3-13 | 16 | M | C3
C1-17 | 17 | M | C1 | C3-14 | 16 | M | C3
C1-18 | 17 | M | C1 | C3-15 | 17 | M | C3
C2-1 | 16 | F | C2 | C3-16 | 17 | M | C3
C2-2 | 16 | F | C2 | P1 | 14 | F | Study
C2-3 | 16 | F | C2 | P2 | 17 | F | Study
C2-4 | 16 | F | C2 | P3 | 17 | F | Study
C2-5 | 16 | F | C2 | P4 | 17 | F | Study
C2-6 | 16 | F | C2 | P5 | 17 | F | Study
C2-7 | 16 | F | C2 | P6 | 18 | F | Study
C2-8 | 17 | F | C2 | P7 | 18 | F | Study
C2-9 | 17 | F | C2 | P8 | 19 | F | Study
C2-10 | 17 | F | C2 | P9 | 19 | F | Study
C2-11 | 17 | F | C2 | P10 | 20 | F | Study
C2-12 | 17 | F | C2 | P11 | 20 | F | Study
C2-13 | 17 | F | C2 | P12 | 20 | F | Study
C2-14 | 17 | F | C2 | P13 | 23 | F | Study
C2-15 | 17 | F | C2 | P14 | 16 | M | Study
C2-16 | 17 | F | C2 | P15 | 17 | M | Study
C2-17 | 17 | F | C2 | P16 | 18 | M | Study
C2-18 | 17 | F | C2 | P17 | 18 | M | Study
C2-19 | 18 | F | C2 | P18 | 18 | M | Study
C2-20 | 18 | F | C2 | P19 | 18 | M | Study
C2-21 | 19 | F | C2 | P20 | 18 | M | Study
C2-22 | 15 | M | C2 | P21 | 19 | M | Study
C2-23 | 15 | M | C2 | P22 | 19 | M | Study
C2-24 | 15 | M | C2 | P23 | 19 | M | Study
C2-25 | 16 | M | C2 | P24 | 21 | M | Study
C2-26 | 16 | M | C2 | P25 | 21 | M | Study
C2-27 | 16 | M | C2 | P26 | 22 | M | Study
C2-28 | 16 | M | C2 | P27 | 23 | M | Study
Table 6: Basic demographic information (age and gender) about study participants. The "Group" column indicates participation in one of the preliminary case studies ("C1", "C2", "C3") or the final system evaluation ("Study").

Footnotes

13
The AI topics covered by the platform are: The nature of learning (humans vs. machines), Finding patterns in data, Training a model, Constructing vs. using a reasoner, Adjusting internal representations, Learning from experience, Structure of a neural network, Feature sets, Large datasets, and Understanding bias in datasets [2].

Supplemental Material

MP4 File - Video Preview
MP4 File - Video Presentation
MP4 File - Video Figure: A video figure depicting our user journey.

References

[1]
Nancy E Adams. 2015. Bloom’s taxonomy of cognitive learning objectives. Journal of the Medical Library Association: JMLA 103, 3 (2015), 152.
[2]
AI4ALL. 2023. Open Learning Curriculum. https://ai-4-all.org/resources/.
[3]
Mary Ainley, Suzanne Hidi, and Dagmar Berndorff. 2002. Interest, learning, and the psychological processes that mediate their relationship. Journal of educational psychology 94, 3 (2002), 545.
[4]
Mohammed Abdullatif Almulla. 2020. The effectiveness of the project-based learning (PBL) approach as a way to engage students in learning. Sage Open 10, 3 (2020), 2158244020938702.
[5]
Lorin W Anderson and David R Krathwohl. 2001. A taxonomy for learning, teaching, and assessing: A revision of Bloom’s taxonomy of educational objectives: complete edition. Addison Wesley Longman, Inc., New York NY United States.
[6]
James M Applefield, Richard Huber, and Mahnaz Moallem. 2000. Constructivism in theory and practice: Toward a better understanding. The High School Journal 84, 2 (2000), 35–53.
[7]
Qiming Bao, Juho Leinonen, Alex Yuxuan Peng, Wanjun Zhong, Tim Pistotti, Alice Huang, Paul Denny, Michael Witbrock, and Jiamou Liu. 2023. Exploring Self-Reinforcement for Improving Learnersourced Multiple-Choice Question Explanations with Large Language Models.
[8]
Amna Basharat. 2016. Learnersourcing Thematic and Inter-Contextual Annotations from Islamic Texts. In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems. ACM, USA, 92–97.
[9]
Zeinab Bedri, Ruairí de Fréin, and Geraldine Dowling. 2017. Community-based learning: A Primer. Irish Journal of Academic Practice 6, 1 (2017), 5.
[10]
Matthew L Bernacki, Timothy J Nokes-Malach, and Vincent Aleven. 2015. Examining self-efficacy during learning: Variability and relations to behavior, performance, and learning. Metacognition and Learning 10 (2015), 99–117.
[11]
Kiran Bisra, Qing Liu, John C Nesbit, Farimah Salimi, and Philip H Winne. 2018. Inducing self-explanation: A meta-analysis. Educational Psychology Review 30 (2018), 703–725.
[12]
Courtney K Blackwell, Alexis R Lauricella, and Ellen Wartella. 2016. The influence of TPACK contextual factors on early childhood educators’ tablet computer use. Computers & Education 98 (2016), 57–69.
[13]
Benjamin S Bloom, Max D Engelhart, EJ Furst, Walker H Hill, and David R Krathwohl. 1956. Handbook I: cognitive domain. Addison-Wesley Longman Ltd, Saddle River, NJ.
[14]
Benjamin S Bloom and David R Krathwohl. 2020. Taxonomy of educational objectives: The classification of educational goals. Book 1, Cognitive domain. Addison-Wesley Longman Ltd, 1 Lake St Upper Saddle River, NJ 07458.
[15]
Phyllis C Blumenfeld, Elliot Soloway, Ronald W Marx, Joseph S Krajcik, Mark Guzdial, and Annemarie Palincsar. 1991. Motivating project-based learning: Sustaining the doing, supporting the learning. Educational psychologist 26, 3-4 (1991), 369–398.
[16]
Charles C Bonwell and James A Eison. 1991. Active learning: Creating excitement in the classroom. 1991 ASHE-ERIC higher education reports. ERIC Clearinghouse on Higher Education, Washington, DC.
[17]
Alexios Brailas. 2020. Rhizomatic Learning in action: a virtual exposition for demonstrating learning rhizomes. In Eighth international conference on technological ecosystems for enhancing multiculturality. Association for Computing Machinery, Salamanca, Spain, 309–314.
[18]
Nathan R Bromberg, Angsana A Techatassanasoontorn, and Antonio Díaz Andrade. 2013. Engaging students: Digital storytelling in information systems learning. Pacific Asia Journal of the Association for Information Systems 5, 1 (2013), 2.
[19]
Matthew Budman, Blythe Hurley, Nairita Gangopadhyay, and Anya Tharakan. 2020. Talent and workforce effects in the age of AI.
[20]
Nanci M Burk. 2000. Empowering at-risk students: storytelling as a pedagogical tool.
[21]
Andrea Capocci, Vito DP Servedio, Francesca Colaiori, Luciana S Buriol, Debora Donato, Stefano Leonardi, and Guido Caldarelli. 2006. Preferential attachment in the growth of social networks: The internet encyclopedia Wikipedia. Physical review E 74, 3 (2006), 036116.
[22]
Harun Cigdem and Mustafa Ozturk. 2016. Critical components of online learning readiness and their relationships with learner achievement. Turkish Online Journal of Distance Education 17, 2 (2016).
[23]
Donald Clark. 2020. Bloom (1913-1999) - Mastery learning. Taxonomy of learning: not a hierarchy. Donald Clark.
[24]
AnneMarie M Conley. 2012. Patterns of motivation beliefs: Combining achievement goal and expectancy-value perspectives. Journal of educational psychology 104, 1 (2012), 32.
[25]
Robert J Crutcher and Alice F Healy. 1989. Cognitive operations and the generation effect. Journal of Experimental Psychology: Learning, Memory, and Cognition 15, 4 (1989), 669.
[26]
Ali Darvishi, Hassan Khosravi, and Shazia Sadiq. 2021. Employing peer review to evaluate the quality of student generated content at scale: A trust propagation approach. In Proceedings of the eighth ACM conference on learning@ scale. Association for Computing Machinery, New York, NY, United States, 139–150.
[27]
Paul Denny, Hassan Khosravi, Arto Hellas, Juho Leinonen, and Sami Sarsa. 2023. Can We Trust AI-Generated Educational Content? Comparative Analysis of Human and AI-Generated Learning Resources.
[28]
Cheryl Diermyer and Chris Blakesley. 2009. Story-based teaching and learning: Practices and technologies. In 25Th Annual Conference on Distance Teaching and Learning. 25th Annual Conference on Distance Teaching & Learning, virtual.
[29]
Griffin Dietz, Jimmy K Le, Nadin Tamer, Jenny Han, Hyowon Gweon, Elizabeth L Murnane, and James A. Landay. 2021. StoryCoder: Teaching Computational Thinking Concepts Through Storytelling in a Voice-Guided App for Children. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, Article 54, 15 pages.
[30]
Griffin Dietz, Nadin Tamer, Carina Ly, Jimmy K Le, and James A. Landay. 2023. Visual StoryCoder: A Multimodal Programming Environment for Children’s Creation of Stories. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. ACM, New York, NY, USA, Article 96, 16 pages.
[31]
Betsy DiSalvo and Carl DiSalvo. 2014. Designing for democracy in education: Participatory design and the learning sciences. In Learning and Becoming in Practice: The International Conference of the Learning Sciences (ICLS). Vol. 2. International Society of the Learning Sciences, Colorado, CO, 793–799.
[32]
Kevin Doherty and Gavin Doherty. 2018. Engagement in HCI: conception, theory and measurement. ACM Computing Surveys (CSUR) 51, 5 (2018), 1–39.
[33]
Stefania Druga, Fee Lia Christoph, and Amy J Ko. 2022. Family as a Third Space for AI Literacies: How do children and parents learn about AI together?. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, USA, Article 225, 17 pages.
[34]
Stefania Druga, Sarah T. Vu, Eesh Likhith, and Tammy Qiu. 2019. Inclusive AI literacy for kids around the world. In Proceedings of FabLearn 2019. Association for Computing Machinery, New York, NY, USA, 104–111.
[35]
Stefania Druga, Jason Yip, Michael Preston, and Devin Dillon. 2021. The 4As: Ask, Adapt, Author, Analyze.
[36]
Mandi Dupain and Loréal L Maguire. 2007. Health digital storytelling projects. American Journal of Health Education 38, 1 (2007), 41–43.
[37]
Mariana Leyton Escobar, Piet AM Kommers, and Ardion Beldad. 2014. Using narratives as tools for channeling participation in online communities. Computers in Human Behavior 37 (2014), 64–72.
[38]
Enrique Estellés-Arolas and Fernando González-Ladrón-de Guevara. 2012. Towards an integrated crowdsourcing definition. Journal of Information science 38, 2 (2012), 189–200.
[39]
Valerie Farnsworth. 2010. Conceptualizing identity, learning and social justice in community-based learning. Teaching and teacher education 26, 7 (2010), 1481–1489.
[40]
Rebecca Ferguson. 2012. Learning analytics: drivers, developments and challenges. International Journal of Technology Enhanced Learning 4, 5-6 (2012), 304–317.
[41]
Catherine Twomey Fosnot. 2013. Constructivism: Theory, perspectives, and practice. Teachers College Press, 1234 Amsterdam Ave. New York NY 10027.
[42]
Jennifer A Fredricks, Phyllis C Blumenfeld, and Alison H Paris. 2004. School engagement: Potential of the concept, state of the evidence. Review of educational research 74, 1 (2004), 59–109.
[43]
Gallup. 2016. Diversity gaps in computer science: exploring the underrepresentation of girls, Blacks and Hispanics.
[44]
Stefania Giannini. 2023. Generative AI and the future of education.
[45]
Manuela Glaser, Bärbel Garsoffky, and Stephan Schwan. 2009. Narrative-based learning: Possible benefits and problems.
[46]
Google. 2014. Women Who Choose Computer Science: what Really Matters: The Critical Role of Encouragement and Exposure.
[47]
Charles R Graham. 2011. Theoretical considerations for understanding technological pedagogical content knowledge (TPACK). Computers & Education 57, 3 (2011), 1953–1960.
[48]
Philip J Guo, Julia M Markel, and Xiong Zhang. 2020. Learnersourcing at scale to overcome expert blind spots for introductory programming: A three-year deployment study on the python tutor website. In Proceedings of the Seventh ACM Conference on Learning@ Scale. ACM, Virtual, 301–304.
[49]
Mark Guzdial. 2015. Learner-centered design of computing education: Research on computing for everyone. Morgan & Claypool Publishers, Kentfield, CA 94914.
[50]
Didik Hariyanto, Sigit Yatmono, Moh Khairudin, and Thomas Köhler. 2022. Students e-learning readiness towards education 4.0: Instrument development and validation. Jurnal Pendidikan Vokasi 12, 3 (2022).
[51]
Suzanne Hidi and K Ann Renninger. 2006. The four-phase model of interest development. Educational psychologist 41, 2 (2006), 111–127.
[52]
Min Hu and Hao Li. 2017. Student engagement in online learning: A review. In 2017 International Symposium on Educational Technology (ISET). IEEE, Hong Kong, 39–43.
[53]
Min-Ling Hung, Chien Chou, Chao-Hsiu Chen, and Zang-Yuan Own. 2010. Learner readiness for online learning: Scale development and student perceptions. Computers & Education 55, 3 (2010), 1080–1090.
[54]
Hawoong Jeong, Zoltan Néda, and Albert-László Barabási. 2003. Measuring preferential attachment in evolving networks. Europhysics letters 61, 4 (2003), 567.
[55]
Martin Johnson and Dominika Majewska. 2022. Formal, non-formal, and informal learning: What are they, and how can we research them? Journal of Education, Society & Multiculturalism 1, 2 (2022).
[56]
Hassan Khosravi, Gianluca Demartini, Shazia Sadiq, and Dragan Gasevic. 2021. Charting the design and analytics agenda of learnersourcing systems. In LAK21: 11th international learning analytics and knowledge conference, Vol. 11. Association for Computing Machinery, virtual, 32–42.
[57]
Hassan Khosravi, Paul Denny, Steven Moore, and John Stamper. 2023. Learnersourcing in the age of AI: Student, educator and machine partnerships for content creation.
[58]
Juho Kim. 2015. Learnersourcing: improving learning with collective learner activity. Ph.D. Dissertation. Massachusetts Institute of Technology.
[59]
Aniket Kittur, Boris Smus, Susheel Khamkar, and Robert E Kraut. 2011. Crowdforge: Crowdsourcing complex work. In Proceedings of the 24th annual ACM symposium on User interface software and technology. ACM, USA, 43–52.
[60]
Matthew Koehler and Punya Mishra. 2009. What is technological pedagogical content knowledge (TPACK)? Contemporary issues in technology and teacher education 9, 1 (2009), 60–70.
[61]
Dimitra Kokotsaki, Victoria Menzies, and Andy Wiggins. 2016. Project-based learning: A review of the literature. Improving schools 19, 3 (2016), 267–277.
[62]
David A Kolb. 2014. Experiential learning: Experience as the source of learning and development. FT press, Upper Saddle River, NJ.
[63]
Maria Kordaki and Panagiotis Kakavas. 2017. Digital storytelling as an effective framework for the development of computational thinking skills. In Edulearn17 Proceedings. The International Academy of Technology, Education and Development (IATED), Pl. de la Legió Espanyola, 11, El Pla del Real, 46010 València, Valencia, Spain, 6325–6335.
[64]
Sarah L Ash. 2009. Generating, Deepening, and Documenting Learning: the Power of Critical Reflection in Applied Learning. Applied Learning in Higher Education 1 (2009), 25–48.
[65]
R Eric Landrum, Karen Brakke, and Maureen A McCarthy. 2019. The pedagogical power of storytelling. Scholarship of Teaching and Learning in Psychology 5, 3 (2019), 247.
[66]
Chan Jean Lee. 2019. The test taker’s fallacy: How students guess answers on multiple-choice tests. Journal of Behavioral Decision Making 32, 2 (2019), 140–151.
[67]
Raymond ST Lee, James NK Liu, Karo SY Yeung, Alan HL Sin, and Dennis TF Shum. 2009. Agent-based web content engagement time (wcet) analyzer on e-publication system. In 2009 Ninth International Conference on Intelligent Systems Design and Applications. IEEE, United States, 67–72.
[68]
Justin B Leibowitz, Charity Flener Lovitt, and Craig S Seager. 2020. Development and Validation of a Survey to Assess Belonging, Academic Engagement, and Self-Efficacy in STEM RLCs. Learning Communities: Research & Practice 8, 1 (2020), 3.
[69]
James C Lester, Hiller A Spires, John L Nietfeld, James Minogue, Bradford W Mott, and Eleni V Lobene. 2014. Designing game-based learning environments for elementary science education: A narrative-centered learning perspective. Information Sciences 264 (2014), 4–18.
[70]
Alex Lishinski and Joshua Rosenberg. 2021. All the pieces matter: The relationship of momentary self-efficacy and affective experiences with CS1 achievement and interest in computing. In Proceedings of the 17th ACM Conference on International Computing Education Research. ACM, Virtual, 252–265.
[71]
Breanne K Litts, Kristin A Searle, Bryan MJ Brayboy, and Yasmin B Kafai. 2021. Computing for all?: Examining critical biases in computational tools for learning. British Journal of Educational Technology 52, 2 (2021), 842–857.
[72]
Chung Kwan Lo. 2023. What is the impact of ChatGPT on education? A rapid review of the literature. Education Sciences 13, 4 (2023), 410.
[73]
Duri Long, Takeria Blunt, and Brian Magerko. 2021. Co-designing AI literacy exhibits for informal learning spaces. Proceedings of the ACM on Human-Computer Interaction 5, CSCW2 (2021), 1–35.
[74]
Duri Long and Brian Magerko. 2020. What is AI literacy? Competencies and design considerations. In Proceedings of the 2020 CHI conference on human factors in computing systems. Association for Computing Machinery, USA, 1–16.
[75]
Khalid Mahmood. 2016. Do people overestimate their information literacy skills? A systematic review of empirical evidence on the Dunning-Kruger effect. Communications in Information Literacy 10, 2 (2016), 3.
[76]
Lina Markauskaite, Rebecca Marrone, Oleksandra Poquet, Simon Knight, Roberto Martinez-Maldonado, Sarah Howard, Jo Tondeur, Maarten De Laat, Simon Buckingham Shum, Dragan Gašević, 2022. Rethinking the entwinement between artificial intelligence and human learning: What capabilities do learners need for a world with AI? Computers and Education: Artificial Intelligence 3 (2022), 100056.
[77]
Allison Master, Sapna Cheryan, Adriana Moscatelli, and Andrew N Meltzoff. 2017. Programming experience promotes higher STEM motivation among first-grade girls. Journal of experimental child psychology 160 (2017), 92–106.
[78]
Scott W McQuiggan, Jennifer L Robison, and James C Lester. 2010. Affective transitions in narrative-centered learning environments. Journal of Educational Technology & Society 13, 1 (2010), 40–53.
[79]
Scott W McQuiggan, Jonathan P Rowe, Sunyoung Lee, and James C Lester. 2008. Story-based learning: The impact of narrative on learning experiences and outcomes. In Intelligent Tutoring Systems: 9th International Conference. Springer, Montreal, Canada, 530–539.
[80]
M Millians. 2011. Learning readiness.
[81]
Marta Montenegro-Rueda, José Fernández-Cerero, José María Fernández-Batanero, and Eloy López-Meneses. 2023. Impact of the implementation of ChatGPT in education: A systematic review. Computers 12, 8 (2023), 153.
[82]
Anna Moutafidou and Tharrenos Bratitsis. 2018. Digital storytelling: Giving voice to socially excluded people in various contexts. In Proceedings of the 8th international conference on software development and technologies for enhancing accessibility and fighting info-exclusion. ACM, New York, 219–226.
[83]
Kasia Muldner, Winslow Burleson, and Kurt VanLehn. 2010. “Yes!”: Using tutor and sensor data to predict moments of delight during instructional activities. In User Modeling, Adaptation, and Personalization: 18th International Conference. Springer, Big Island, HI, USA, 159–170.
[84]
Mahin Naderifar, Hamideh Goli, and Fereshteh Ghaljaie. 2017. Snowball sampling: A purposeful method of sampling in qualitative research. Strides in development of medical education 14, 3 (2017).
[85]
Davy Tsz Kit Ng, Jac Ka Lok Leung, Samuel Kai Wah Chu, and Maggie Shen Qiao. 2021. Conceptualizing AI literacy: An exploratory review. Computers and Education: Artificial Intelligence 2 (2021), 100041.
[86]
Davy Tsz Kit Ng, Jac Ka Lok Leung, Maggie Jiahong Su, Iris Heung Yue Yim, Maggie Shen Qiao, and Samuel Kai Wah Chu. 2023. AI literacy in K-16 classrooms. Springer Cham, Gewerbestrasse 11, 6330 Cham, Switzerland.
[87]
Davy Tsz Kit Ng, Wanying Luo, Helen Man Yi Chan, and Samuel Kai Wah Chu. 2022. Using digital story writing as a pedagogy to develop AI literacy among primary students. Computers and Education: Artificial Intelligence 3 (2022), 100054.
[88]
Hannele Niemi. 2002. Active learning—a cultural change needed in teacher education and schools. Teaching and teacher education 18, 7 (2002), 763–780.
[89]
Arnel B Ocay. 2019. Investigating the Dunning-Kruger effect among students within the contexts of a narrative-centered game-based learning environment. In Proceedings of the 2019 2nd International Conference on Education Technology Management. ACM, New York, 8–13.
[90]
David L Olson and Kirsten Rosacker. 2013. Crowdsourcing and open source software participation. Service Business 7 (2013), 499–511.
[91]
Heather L O’Brien, Paul Cairns, and Mark Hall. 2018. A practical approach to measuring user engagement with the refined user engagement scale (UES) and new UES short form. International Journal of Human-Computer Studies 112 (2018), 28–39.
[92]
Zachary A Pardos, Ioannis Anastasopoulos, and Shreya K Sheel. 2023. Conducting Rapid Experimentation with an Open-Source Adaptive Tutoring System. In International Conference on Artificial Intelligence in Education. Springer, Germany, 38–43.
[93]
E Martin Pedersen. 1995. Storytelling and the art of teaching. In English Teaching Forum, Vol. 33. English Teaching Forum, Washington, D.C., 2–5.
[94]
Nancy E Perry. 1998. Young children’s self-regulated learning and contexts that support it. Journal of educational psychology 90, 4 (1998), 715.
[95]
Paul R Pintrich. 1991. A manual for the use of the Motivated Strategies for Learning Questionnaire (MSLQ).
[96]
Nea Pirttinen, Paul Denny, Arto Hellas, and Juho Leinonen. 2023. Lessons Learned From Four Computing Education Crowdsourcing Systems. IEEE Access 11 (2023), 22982–22992.
[97]
Stefan Popenici. 2022. Artificial Intelligence and Learning Futures: Critical Narratives of Technology and Imagination in Higher Education. Taylor & Francis, United Kingdom.
[98]
Deborah M Price, Linda Strodtman, Elizabeth Brough, Steven Lonn, and Airong Luo. 2015. Digital storytelling: an innovative technological approach to nursing education. Nurse educator 40, 2 (2015), 66–70.
[99]
Isaac Prilleltensky, Geoffrey Nelson, and Leslea Peirson. 2001. The role of power and control in children’s lives: An ecological analysis of pathways toward wellness, resilience and problems. Journal of Community & Applied Social Psychology 11, 2 (2001), 143–158.
[100]
Panagiotis Psomos and Maria Kordaki. 2012. Pedagogical analysis of educational digital storytelling environments of the last five years. Procedia-Social and Behavioral Sciences 46 (2012), 1213–1218.
[101]
Chia Yi Quah and Kher Hui Ng. 2022. A systematic literature review on digital storytelling authoring tool in education: January 2010 to January 2020. International Journal of Human–Computer Interaction 38, 9 (2022), 851–867.
[102]
Randstad. 2023. The Workmonitor.
[103]
Judy Robertson and Maurits Kaptein. 2016. An introduction to modern statistical methods in HCI. Springer, Germany.
[104]
Bernard Robin. 2006. The educational uses of digital storytelling. In Society for information technology & teacher education international conference. Association for the Advancement of Computing in Education (AACE), Asheville, NC, 709–716.
[105]
Eka Roivainen. 2011. Gender differences in processing speed: A review of recent research. Learning and Individual differences 21, 2 (2011), 145–149.
[106]
Joshua M Rosenberg and Matthew J Koehler. 2015. Context and technological pedagogical content knowledge (TPACK): A systematic review. Journal of Research on Technology in Education 47, 3 (2015), 186–210.
[107]
Sherry Ruan, Jiayu He, Rui Ying, Jonathan Burkle, Dunia Hakim, Anna Wang, Yufeng Yin, Lily Zhou, Qianyao Xu, Abdallah AbuHashem, 2020. Supporting children’s math learning with feedback-augmented narrative technology. In Proceedings of the interaction design and children conference. ACM, USA, 567–580.
[108]
Alaa Sadik. 2008. Digital storytelling: A meaningful technology-integrated approach for engaged student learning. Educational technology research and development 56 (2008), 487–506.
[109]
Hatice Çıralı Sarıca and Yasemin Koçak Usluel. 2016. The effect of digital storytelling on visual memory and writing skills. Computers & Education 94 (2016), 298–309.
[110]
John R Savery. 2015. Overview of problem-based learning: Definitions and distinctions. Essential readings in problem-based learning: Exploring and extending the legacy of Howard S. Barrows 9, 2 (2015), 5–15.
[111]
Nalin K Sharda. 2007. Authoring educational multimedia content using learning styles and story telling principles. Proceedings of the international workshop on Educational multimedia and multimedia education 15 (2007), 93–102.
[112]
Susan W Sherman. 1976. Multiple Choice Test Bias Uncovered by Use of an “I Don’t Know” Alternative.
[113]
Anjali Singh, Christopher Brooks, and Shayan Doroudi. 2022. Learnersourcing in theory and practice: synthesizing the literature and charting the future. In Proceedings of the Ninth ACM Conference On Learning@ Scale. Association for Computing Machinery, New York, NY, United States, 234–245.
[114]
Anjali Singh, Christopher Brooks, Xu Wang, Warren Li, Juho Kim, and Deepti Pandey. 2023. Bridging Learnersourcing and AI: Exploring the Dynamics of Student-AI Collaborative Feedback Generation.
[115]
Najat Smeda, Eva Dakich, and Nalin Sharda. 2010. Developing a framework for advancing e-learning through digital storytelling. In IADIS International Conference e-learning. IADIS, Freiburg, Germany, 169–176.
[116]
Jiahong Su, Davy Tsz Kit Ng, and Samuel Kai Wah Chu. 2023. Artificial intelligence (AI) literacy in early childhood education: The challenges and opportunities. Computers and Education: Artificial Intelligence 4 (2023), 100124.
[117]
Sangho Suh, Martinet Lee, Gracie Xia, 2020. Coding strip: A pedagogical tool for teaching and learning programming concepts through comics. In Visual Languages and Human-Centric Computing. IEEE, New Zealand, 1–10.
[118]
Ruth Sylvester and Wendy-lou Greenidge. 2009. Digital storytelling: Extending the potential for struggling writers. The reading teacher 63, 4 (2009), 284–295.
[119]
Joanna Szurmak and Mindy Thuna. 2013. Tell me a story: The use of narrative as tool for instruction. Educational Media International 54, 1 (2013), 20–33.
[120]
Karin Tengler, Oliver Kastner-Hauler, and Barbara Sabitzer. 2021. Enhancing Computational Thinking Skills using Robots and Digital Storytelling. In CSEDU, Vol. 1. Proceedings of the 13th International Conference on Computer Supported Education, Virtual, 157–164.
[121]
Danielle R Thomas, Shivang Gupta, and Kenneth R Koedinger. 2023. Comparative analysis of learnersourced human-graded and ai-generated responses for autograding online tutor lessons. In International Conference on Artificial Intelligence in Education. Springer, Tokyo, Japan, 714–719.
[122]
David Touretzky, Christina Gardner-McCune, Fred Martin, and Deborah Seehorn. 2019. Envisioning AI for K-12: What should every child know about AI? Proceedings of the AAAI conference on artificial intelligence 33, 01 (2019), 9795–9799.
[123]
Bernie Trilling and Charles Fadel. 2009. 21st century skills: Learning for life in our times. John Wiley & Sons, Hoboken, NJ.
[124]
Ester Van Laar, Alexander JAM Van Deursen, Jan AGM Van Dijk, and Jos De Haan. 2017. The relation between 21st-century skills and digital skills: A systematic literature review. Computers in human behavior 72 (2017), 577–588.
[125]
Viswanath Venkatesh, James YL Thong, and Xin Xu. 2012. Consumer acceptance and use of information technology: extending the unified theory of acceptance and use of technology. MIS quarterly 36, 1 (2012), 157–178.
[126]
Marianna Vivitsou and Hannele Niemi. 2017. 21st Century Digital Storytelling in Education - Developing Stories with Digital Technologies. In New Ways to Teach and to Learn in China and Finland. CICERO Symposium, Helsinki, Finland.
[127]
Jakob Voss. 2005. Measuring wikipedia. In Proceedings of The International Conference of the International Society for Scientometrics and Informetrics. International Society for Scientometrics and Informetrics, Stockholm, Sweden.
[128]
Gregory M Walton and Geoffrey L Cohen. 2007. A question of belonging: race, social fit, and achievement. Journal of personality and social psychology 92, 1 (2007), 82.
[129]
Tianchong Wang and Eric CK Cheng. 2021. Towards a tripartite research agenda: a scoping review of artificial intelligence in education research. In Proceedings of the International Conference on Artificial Intelligence in Education Technology. Springer, The International Conference on Artificial Intelligence in Education Technology, Wuhan, China, 3–24.
[130]
Kerri Wazny. 2017. “Crowdsourcing” ten years in: A review. Journal of global health 7, 2 (2017).
[131]
Jeremy Weinstein, Rob Reich, and Mehran Sahami. 2021. System error: Where big tech went wrong and how we can reboot. Harper, UK.
[132]
Sarah Weir, Juho Kim, Krzysztof Z Gajos, and Robert C Miller. 2015. Learnersourcing subgoal labels for how-to videos. In Proceedings of the 18th ACM conference on computer supported cooperative work & social computing. Association for Computing Machinery (ACM), Vancouver BC Canada, 405–416.
[133]
Eunike Wetzel, Jan R Böhnke, and Anna Brown. 2016. The ITC International Handbook of Testing and Assessment. Oxford University Press, New York. 349–363 pages.
[134]
Jeffrey L Whitten, Lonnie D Bentley, and Thomas IM Ho. 1986. Systems analysis & design methods. Times Mirror/Mosby College Publishing, Portland, OR.
[135]
Vicki Williams. 2001. Online Learning Readiness Questionnaire.
[136]
Scott D Wurdinger and Julie A Carlson. 2009. Teaching for experiential learning: Five approaches that work. R&L Education, Lanham, MD.
[137]
Ya-Ting C Yang and Wan-Chi I Wu. 2012. Digital storytelling for enhancing student academic achievement, critical thinking, and learning motivation: A year-long experimental study. Computers & education 59, 2 (2012), 339–352.
[138]
NP Yılmaz. 2017. Learning experiences of prospective teachers in a digital storytelling tool called “Toondoo”: a case study. Current Trends in Educational Sciences 6 (2017), 447–454.
[139]
Pelin Yuksel-Arslan, Soner Yildirim, and Bernard Ross Robin. 2016. A phenomenological study: teachers’ experiences of using digital storytelling in early childhood education. Educational Studies 42, 5 (2016), 427–445.
[140]
Xiaofei Zhou, Jessica Van Brummelen, and Phoebe Lin. 2020. Designing AI learning experiences for K-12: Emerging works, future opportunities and a design framework.
