A.1 Preliminary phases of system design
To iteratively design, implement, and gather feedback on our system, we undertook three case studies with learners in a variety of extracurricular programming formats offered by TUMO. Our first case study focused on exploring learnersourced creation of narratives, the second focused on supporting collaborative engagement, and the third focused on scalability and the potential for learnersourced narratives to be effective and engaging across different cultural contexts. Identifiers for case study participants use the format Cx-y (e.g., C1-1, C1-2, C1-3, C2-1, etc.) — see the table in
A.8 for each participant’s ID, age, gender, and case study number.
A.1.1 Case study 1 (C1): Creating an initial prototype, developing narrative-based learning content, and exploring learnersourcing-based scaffolds.
With our first case study, we aimed to build out our initial prototype, develop narrative-based learning content, and explore what sorts of scaffolds were needed to guide learners with little to no prior programming or AI knowledge through a successful learnersourcing experience.
C1 was conducted as an online, 3-week workshop in partnership with TUMO, which assisted in recruiting N=18 learners who ranged in age from 12–17 (9 female, 9 male). All participants had little to no prior coding experience. The first author served as the live facilitator and developed lesson plans about fundamental AI concepts including machine learning (supervised, unsupervised, and reinforcement learning), classification and prediction, model training, testing, internal representations, neural networks, feature sets, and bias. For each concept, we created a short lecture, an educational handout, and an activity. The first author had taught multiple other AI workshops with TUMO in prior years and drew on this familiarity and past content. The second author assisted in facilitation and drew on her professional experience with these AI topics.
The workshop began with an overview of narrative structure, storytelling techniques, and choose-your-own-adventure style games, where learners explored how a linear narrative could be translated into a graph structure with multiple different possible endings. Using this notion of a “story graph” to characterize a narrative as a set of nodes (story content) and edges (transitions among these story scenes and plot points), the authors introduced students to concepts of graph structure, first in plain language and then in terms of code.
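As a minimal illustration of this idea (not the platform's actual code), such a story graph can be typed in TypeScript roughly as follows; all type and field names here are hypothetical:

```typescript
// Illustrative sketch only: these type and field names are hypothetical,
// not the identifiers used in the platform's codebase.
interface Transition {
  label: string;    // e.g., a multiple-choice option shown to the learner
  targetId: string; // id of the scene this edge leads to
}

interface Scene {
  id: string;
  content: string;           // story text, media, and plot points
  imageUrl?: string;         // representative picture (used later in graph view)
  transitions: Transition[]; // outgoing edges to subsequent scenes
}

// A narrative is then a set of scenes keyed by id; multiple leaf scenes
// yield the different possible endings of a choose-your-own-adventure story.
type StoryGraph = Map<string, Scene>;
```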
The remainder of the workshop focused on participants devising their own stories to convey AI concepts. Specifically, the first author would deliver a lecture on an AI concept, drawing on the associated educational handout and activities. Participants were then instructed to complete worksheets created by the authors to develop character profiles and map out plot arcs that wove in the AI concept. This approach was based on Movement Oriented Design, which offers principles for developing educational multimedia narratives that are emotionally engaging and of high quality [111]. As an example, after the session on k-means clustering, C1-9 wrote a story where a wizard challenges the main character to sort the items in a massive bag. In another story, a tourist photographs a series of statues during a day of sightseeing and writes their names on the back of each photo. To demonstrate concepts related to training data and classification, the story then follows the character as she visits a new part of town with different statues of the same historical figures and tries to guess their names based on her photographs.
Next, participants were tasked with implementing their narrative content as interactive functionality using our initial system prototype. We saw that participants, particularly those with little prior coding experience, benefited from creating narrative-based content in stages (e.g., a graph-level summary, then the wiki, then code), which resonates with literature on digital storytelling in education [
126]. Participants were enthusiastic about the interactive stories they had made and displayed genuine delight with employing stories as a vehicle for learning. Notably, they actively shared the platform with their friends and family out of enthusiasm to show their work to others.
Regarding design implications for the narrative specifically, we found participants perceived the stories that blended fiction and personal experiences as the most interesting. Narratives that featured local contexts, such as places in or people from Yerevan, were especially exciting to them. We did observe that participants often struggled to generate story content from scratch, so prompts that invited them to source stories from everyday experiences were helpful. Further, participants appreciated that the story aspects of the learning experience pushed them in directions where they may have originally lacked self-confidence. That is, a number of learners who rated themselves at the start of the study as strong technically but weak in storytelling came to appreciate the story aspects by the end of the study, and vice versa for students who began confident in storytelling but less sure of their technical capabilities. Such findings reinforce ideas that narratives are an accessible, approachable vehicle for learning, including for concepts where learners initially feel intimidated.
After C1, our key open questions related to how learners might effectively collaborate and interact with one another on the platform, and whether it is viable for such a platform to enable not only virtual engagement from distributed learners but also in-person engagement from co-located sets of learners. We therefore undertook our next case study to explore these questions in preparation for our final evaluation.
A.1.2 Case study 2 (C2): Refining designs to provide opportunities for learners to collaborate and interact on the platform as well as engage in-person.
For C2, we recruited N=31 learners who ranged in age from 15–19 (21 female, 10 male). All participants had little to no prior coding experience. While C1 helped confirm that narrative-based learning content can be exciting for learners to utilize and create, it also exposed a need for additional scaffolding to help these learners better collaborate on these efforts. C2 therefore focused on these interpersonal considerations, along with whether our approach could remain inclusive to diverse extracurricular learning setups, particularly those with in-person components. Specifically, C2 was a 3-week workshop held in person in Yerevan, Armenia, and conducted in partnership with TUMO, which assisted in recruiting.
Before starting the workshop, we improved support for collaboration on our prototype through better ticket tracking and documentation in the story graph component (mimicking features and processes from popular Agile development tools like Jira). For instance, to manage synchronization between the wiki and codebase, we added a procedure for ownership tagging to the guidebook.
We found that participants were able to utilize all of these features with only minor confusion, which was easily resolved by the facilitator or a peer. Further, we observed that learners appreciated having structured, systematic processes to follow, and that these processes did not inhibit their creativity. The need for content moderation did come up several times during the workshop, such as when participants added age-inappropriate language or other content. We approached this issue by designating two volunteers who had reached the role of facilitator as content moderators, charging them with reading through all updates to the wiki at the end of each day and bringing to our attention anything they deemed problematic. If we requested changes, the moderators would create tickets accordingly; such roles could foreseeably be baked into the platform. In general, we responded to emerging issues like these by building out processes on the platform for responsibilities that learners could take up themselves, rather than by directly intervening, such as changing content ourselves.
A.1.3 Case study 3 (C3): Investigating the scalability of the platform, particularly across cultural contexts, and whether students can build on each other’s contributions even over shorter engagement timeframes.
For the third case study (C3), we recruited N=16 learners who ranged in age from 14–17 (8 female, 8 male). All had little to no prior coding experience. C3 focused on whether users could in fact build on each other’s contributions to continue expanding and sustaining the platform, including when learners were from different cultural contexts. The fact that C2 took place in Yerevan, Armenia, and that all C1 and C2 participants identified as Armenian shaped those workshops and the resulting content. Students centered their Armenian identity within the created narratives, including cultural history and points of Armenian pride.
Given this embedding of cultural identity into learners’ created content, C3 explored scalability to a different learner demographic, from a different geographical region, with potentially minimal cultural common ground. We were particularly motivated to examine this question given C1 and C2 participants’ expressions of enthusiasm not only for developing content with cultural significance but also for having that content broadly shared. For example, C2-8 shared that in her favorite part of one story, the plot described Armenians as being known as “warm and hospitable people” and how that made her feel good knowing that “people from other countries will read about it.” She emphasized that this “made the project more exciting.” C2-2 felt strongly that she didn’t “want Armenia to only be associated with Kim Kardashian” but instead wanted “it to be associated with the history, and the fun stuff too.” A few participants expressed that they were motivated by the cross-cultural possibilities of sharing the project with learners from other countries; C2-29 remarked it could be interesting for those learners who “might not get another opportunity to learn about Armenia.”
Further, while C1 and C2 lasted 3 weeks, we wanted to understand whether learning benefits could be observed after a briefer experience on the platform. C3 was therefore run as a 1-week workshop in Berlin, Germany. Through an onboarding survey, we verified that C3 participants had very little cultural awareness of or connection to Armenia, despite their mutual affiliation with TUMO.
This workshop began with a review of the existing prototype. Specifically, after C3 participants engaged with the platform, they wrote what they felt worked well and not well on sticky notes, which we then grouped using affinity diagramming. From resulting clusters, we created a list of aspects to keep as-is, along with suggested additions and other changes, which we then collaboratively ranked by importance. We had conducted the same exercise at the end of C2 (i.e., on the same prototype that C3 participants assessed), and our comparison of the recommendations from the C2 and C3 participants revealed few differences. Both groups focused on improvements related to the desire for style consistency across narratives, more options on the wiki for architecting out story structure, and better ways to integrate teaching lessons into the narrative content. These results were encouraging, as they demonstrated a consistency between the two groups of learners.
Regarding the narrative, we were additionally pleased to see that the local cultural references were well-received by the C3 learners. In interviews, these participants reported that they were excited to learn more about and participate in stories about Armenia, with this content seen as “more intentional” and “less random”. These findings indicate that drawing stories from real-world, localized, and cultural experiences tended to result in more compelling narrative content, even for learners from outside those regions and cultures.
In C3, learners’ baseline exposure to computing was much lower than in C1 and C2. Most had never taken a computer science course, and they found the wiki and its associated processes much more intuitive than working directly in the codebase. That said, within two days, participants were able to contribute to the content and began addressing some of the areas of improvement they had identified in the initial review. Given C3’s condensed timeframe, we observed that learners focused on augmenting existing narrative content (e.g., fixing plot holes, making stylistic improvements, and developing small spin-off stories), demonstrating their ability to build on other users’ contributions and support a cycle of improvement and growth on the platform.
A.2 Technical details of the narrative-based learnersourcing platform
As mentioned, our system consists of three main components: the story adventure, the story graph, and the story infrastructure. Here we provide technical details about these components.
A.2.1 Story adventure component.
The story adventure component provides a web application that enables a broad base of learners to engage with story content. Specifically, the component supports state-dependent concurrent sessions, a chatbot-esque choose-your-own-adventure story experience (where choices can be either free-response text or multiple choice), progress tracking, and an animated three-dimensional graph-based story navigation system. These capabilities are enabled by three web views: an “adventure view,” a “graph view,” and a “progress view.” A discussion of the backend can be found in
A.2.2 on the story graph component.
Adventure view presents an infinite scroll of multimedia, links, and text messages in an interactive display that accepts input via multiple-choice selections and free-text responses. Graph view displays each story scene as a representative picture (e.g., a church, if the scene relates to visiting a church) floating in three-dimensional space (positioned using a force-graph calculation). The pictures are connected according to parent-child relationships between the scenes. A learner can navigate the interface with standard mouse-based zoom, rotate, and translate controls (e.g., similar to Google Earth) and can select a scene by clicking on its picture to see an interactive pop-up summary. From this summary, the learner can return to scenes in adventure view (provided they have already completed the scene). Progress view displays information about the learner’s learning progress, including progress toward completing level requirements and statistics related to story scene creation and consumption.
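As a sketch of how graph view’s data might be assembled, reusing the illustrative Scene and StoryGraph types from the earlier sketch (the actual frontend code and rendering library are not specified here):

```typescript
// Hypothetical conversion of the scene map into the { nodes, links }
// shape that force-directed graph renderers commonly consume.
function toGraphData(scenes: StoryGraph) {
  const nodes = [...scenes.values()].map(s => ({
    id: s.id,
    img: s.imageUrl, // representative picture floated in 3D space
  }));
  const links = [...scenes.values()].flatMap(s =>
    s.transitions.map(t => ({ source: s.id, target: t.targetId })));
  return { nodes, links };
}

// The result could then be handed to an off-the-shelf force-directed
// renderer; for example, with the open-source 3d-force-graph package:
//   ForceGraph3D()(containerElement).graphData(toGraphData(scenes));
```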
A.2.2 Story graph component.
The story graph component has two objectives: supporting the story adventure and enabling new content creation. The first is accomplished via an API exposed to the story adventure frontend views. Powering this API is a Node.js server and Firebase database that we collectively refer to as “the storyteller.” The storyteller is a finite-state automaton that maintains a graph representation of story content and a history of every learner’s “story state” as determined by their story interactions. Given a learner’s latest interaction, it uses these to respond to “scene continuation” API requests. The second objective is a balancing act, as it involves keeping the barrier to content creation low for novice learners while supporting advanced learners’ technical creative expression. This is accomplished by dividing the content creation process into three discrete stages, each involving a progressively lower-level tool: the “graph view create GUI,” the “story wiki,” and the “storyteller content graph.”
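Before detailing these stages, a hedged sketch of the storyteller’s scene-continuation behavior follows, again using the illustrative types from above; the real module, field, and endpoint names are not given here, and the matching logic is simplified:

```typescript
// Hypothetical shape of a learner's story state.
interface StoryState {
  currentSceneId: string;
  variables: Record<string, string>; // e.g., facts or choices gathered so far
  history: string[];                 // ids of scenes the learner has completed
}

// Advance a learner's state machine given their latest interaction.
function continueScene(
  state: StoryState,
  interaction: string, // free text or a multiple-choice label
  scenes: StoryGraph,
): { state: StoryState; next: Scene } {
  const current = scenes.get(state.currentSceneId)!;
  // Match the interaction against the current scene's outgoing edges;
  // fall back to the first transition if nothing matches.
  const edge =
    current.transitions.find(t => t.label === interaction) ??
    current.transitions[0];
  const next = scenes.get(edge.targetId)!;
  return {
    state: {
      ...state,
      currentSceneId: next.id,
      history: [...state.history, current.id],
    },
    next,
  };
}
```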
The "graph view creates GUI" is where learners initiate the process of creating or modifying a scene. This entry point is accessible to learners who have just completed level 1 because it shares the same web interface (as an unlockable “architect mode”). The GUI flow asks users to fill out a template describing the change they are making, then directs them to the guidebook (discussed next in
A.2.3) for guidance on creating content and using the story wiki and the storyteller content graph.
The story wiki is a collection of Google Docs, where each doc corresponds to a particular scene and is used for drafting and testing story content. Each doc is drafted in two steps. First, a scene is drafted at a conceptual level using scene and character archetype templates; this draft is provided to ChatGPT as a contextual pre-prompt, following a process outlined in the guidebook. The learner is then guided through fleshing out the template into a non-linear story script via exchanges with ChatGPT. Steps in prompt formulation are structured (e.g., the GPT-3 and GPT-4 models both included AI4ALL and the “5 Big Ideas in AI” in their training data, so targeted references to this content by name generally result in reasonably well-crafted responses), though critical thinking is still required to check facts and keep narratives consistent. As story content is created, nodes are represented as bullet points, and edges (further-indented multiple-choice options and rules-based free-response interactions) are represented as hyperlinks. Internal links to a particular node on a page are created with header refs. More complex interactions, such as updates to state variables, mini-games, or real-world activities, are simply described in plain text.
The storyteller content graph is where content is implemented as code. Once a content change has been staged on the wiki, a first-time architect is directed to clone the storyteller Node.js server from the GitHub repository and run it locally to test her changes; the guidebook facilitates this via a video tutorial. While popular closed- and open-source options exist for finite-state automata, we chose to code the server from scratch to meet the particular integration demands and desired capabilities of our system. Specifically, our codebase compartmentalizes functionality into three modules: 1) the core state-based automaton functionality, 2) state management, and 3) story content. Completely abstracting this complexity behind library functions designed as wrappers for our wiki content was the key to helping learners with little to no prior coding experience write code. Using these custom library functions, moving content from the wiki to the codebase became simple; indeed, we repeatedly observed that setting up VSCode was the most challenging aspect of architect onboarding for complete novices. In particular, TypeScript Types helped learners identify and debug errors in their content formatting before compilation, and a custom set of pre-compilation tests we wrote (e.g., to ensure that every edge actually links to an existing node) was also useful for learners. For novice learners advancing to levels 3 and 4, our library provided simple approaches to the requirements of mini-game embedding and real-world engagement. For more advanced learners, representing story content as data objects wrapped in code (rather than as data in Firebase) had the advantage of allowing exposure to complexity on demand. For instance, the library enabled more advanced learners to access the full functionality of our finite-state automaton (e.g., to run NLP APIs on free-text input using user state variables as a reference) in native TypeScript with full Type support.
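For illustration, a pre-compilation check in the spirit of those tests might look as follows (using the illustrative types from earlier; the actual test suite differs):

```typescript
// Sketch of a content check: verify that every edge points at a scene
// that actually exists in the content graph.
function checkEdges(scenes: StoryGraph): string[] {
  const errors: string[] = [];
  for (const scene of scenes.values()) {
    for (const t of scene.transitions) {
      if (!scenes.has(t.targetId)) {
        errors.push(
          `Scene "${scene.id}": edge "${t.label}" targets missing node "${t.targetId}"`,
        );
      }
    }
  }
  return errors; // an empty array means all edges resolve
}
```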
A.2.3 Story infrastructure component.
The story infrastructure component is the underlying codebase and learner-driven design and maintenance processes that enable architects to create content that meets the needs of explorers. This component’s primary features are the frontend codebase, storyteller state management and core functionality, analytics view, and guidebook. We focus here on the analytics view and guidebook.
Analytics view is a page in the web app that becomes accessible once a learner completes level 4. It contains a dashboard with summary statistics about the community’s interaction activity, feedback, and quiz and assessment results for each story scene. The platform also supports tracking learners’ low-level actions, though we did not enable this for privacy reasons; the dashboard can display such data in aggregate and time-series views. To log events, we used popular Node packages like IdleTimer.
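As a sketch of the kind of aggregation behind such time-series views (the event schema below is hypothetical, not our actual logging format):

```typescript
// Hypothetical shape of a logged low-level action.
interface ActionEvent {
  learnerId: string;
  sceneId: string;
  kind: 'view' | 'choice' | 'feedback' | 'quiz';
  timestamp: number; // Unix epoch, in milliseconds
}

// Roll events up into per-day counts for a time-series dashboard panel.
function countPerDay(events: ActionEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of events) {
    const day = new Date(e.timestamp).toISOString().slice(0, 10); // YYYY-MM-DD
    counts.set(day, (counts.get(day) ?? 0) + 1);
  }
  return counts;
}
```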
The guidebook is a Google Doc and an accompanying six-video lecture series that we created. The guide is organized into sections describing the architect and facilitator roles in terms of objectives, processes, external resources, and the interfaces that should be utilized to meet these goals. Any learner can suggest edits to the guidebook to add content or correct errors.