1 Introduction
The appearance of free-to-use generative Artificial Intelligence (GenAI) tools such as ChatGPT, Microsoft Copilot or DALL-E has disrupted the way humans create new content in multiple sectors (e.g., programming [5], music composition [8]). GenAI tools can be defined as systems that generate or adapt human-like content (e.g., text, images, code) based on deep language models (e.g., those developed by OpenAI) in response to prompts (e.g., questions, instructions) [10]. Education is one of the sectors where GenAI is starting to have a strong impact, as it is used by both teachers and students for different purposes. Students use GenAI tools as a round-the-clock teacher, since these tools can provide information on a given topic and personalize their own learning path [2, 4]. Similarly, teachers use these systems to support task generation, improve the learning design, or augment their role [4]. A recent study conducted with 100 teachers and 1,000 tertiary students revealed that 82% of teachers are aware of ChatGPT and that more than 89% of students have used it to help with a homework assignment, confirming that GenAI is changing traditional educational models [15].
Yet, despite its potential, the use of GenAI for learning purposes is accompanied by several concerns. For instance, as these systems can generate human-like content, students might offload their assignments to them. Additionally, their reliability can be compromised, sometimes resulting in false responses (i.e., hallucinations) or in content copied from existing sources without the corresponding citation (i.e., plagiarism) [2, 11]. Another relevant concern regards teacher autonomy, defined as “the teachers’ willingness, capacity and freedom to take control of their own teaching and learning” [9]. A high sense of autonomy is necessary, although not sufficient, for effective teaching [17]. Additionally, fostering teacher autonomy is of great importance, since GenAI lacks the emotional and interpersonal skills and the pedagogical knowledge (e.g., of learning objectives) attributed to teachers’ expertise [7]. In this regard, when students use GenAI tools within a learning situation, teachers cannot be aware of the students’ prompts, the answers given, their adequacy with respect to the contents taught in the classroom, or whether classroom submissions are copy-pasted from such answers. The aforementioned study revealed that 72% of the surveyed teachers are concerned about the impact of ChatGPT on cheating [15]. Therefore, it seems relevant to explore the following research question:
How to increase teacher autonomy in learning situations where students might use GenAI tools?
One potential solution to increase teacher autonomy is the development of a middleware between the students’ interface and the GenAI tools, so that all the information passing through this software is configurable by and accessible to teachers. Middleware between users and GenAI tools is already being used for different purposes, such as chatbot interactions or extending their functionality (e.g., ChatGPT Next Web¹). However, to the best of our knowledge, there is no dedicated middleware for teaching purposes that provides educators with a tool to increase their autonomy when using GenAI tools in their learning situations. Accordingly, developing such a middleware is our proposal to answer the posed research question.
The rest of the paper is structured as follows. Section 2 presents the theoretical background of teacher autonomy with GenAI systems. Next, Section 3 introduces the developed system to support teacher autonomy within learning situations that might involve GenAI systems. Section 4 describes the methodology followed to evaluate a first prototype of the system. Then, the results of this study are presented in Section 5. Finally, the main conclusions are outlined and discussed in Section 6.
2 Teacher Autonomy with Intelligent Systems
Although the literature describes teacher autonomy in multiple forms, one of the most common definitions is the one provided by Huang (2005) [9]: “the willingness, capacity and freedom to take control of their own teaching and learning”. Teacher autonomy is therefore conceived as a subjective, personal perception of one’s ability to execute the necessary actions and exert control over different teaching domains, such as student assessment, teaching methods and/or curriculum development [17]. Teacher autonomy in the GenAI context is necessary for effective and contextualized teaching, as the learning aspects attributed to teachers’ expertise (e.g., interpersonal skills, learning objectives, emotions) are lacking in GenAI content generation [7].
In technology-enhanced learning situations, and especially when using intelligent systems such as those involving GenAI, teacher autonomy can be compromised, as teachers need to adapt to the features, capabilities and automatic reactions of these tools. Therefore, teachers’ capacity and freedom to take control over different teaching scenarios is limited. Recent research has called for a synergy between teachers and AI systems to support informed, learning-oriented decisions, thus permitting teachers to remain agentic when monitoring, evaluating or enhancing the recommendations given by AI [3, 7, 12].
Given this context, the Human-AI Automation Model proposed by Molenaar [12] (see Figure 1) can be used to describe the different positions of teacher autonomy within intelligent systems. This model frames the transition of control between teachers and intelligent systems into six sequential levels. These levels range from ‘teacher has full autonomy’ (left) to ‘technology has full autonomy’ (right) within a learning situation, passing through intermediate levels of autonomy where, for example, the teacher can monitor the students’ behaviours, be alerted to take control back, or select, approve and design reactions for the students.
According to this framework, when students use GenAI tools (regardless of whether the teacher supports their use), the technology operates at the sixth level of the automation framework, i.e., ‘Full AI Automation’. In this situation, teachers are not aware of the students’ prompts, of their appropriateness for answering the students’ questions, nor of the automatic answers provided by the system, which, apart from the risks already mentioned (e.g., offloading, reliability, plagiarism), might not take the students’ context into account and might not align with the pedagogical intentions or approaches of the teacher [3].
For instance, students might prompt GenAI tools with questions about a given topic and receive answers with excessive information for their current level, far beyond what is expected to be covered in the curriculum. This situation would not arise if the question were directed to the course teacher. Similarly, students might use GenAI tools to solve the exercises assigned by the teacher to practice the contents learnt in the classroom (e.g., writing an essay in a foreign language). In this case, students might use the answer provided by the GenAI tool without reflecting on it (or practising). Once again, if this question were directed to the teacher, the teacher would provide suggestions or ideas on how to succeed at the task, fostering the students’ own critical thinking and preventing students from offloading the work onto GenAI models.
In summary, it seems important that teachers can position themselves at their desired automation level when using GenAI tools, thus supporting their perceived sense of autonomy. As mentioned before, our proposal to address this need is the development of a middleware that supports the different levels of the Human-AI Automation Model, as presented in the next section.
3 GenAI System
The development of the envisioned system is currently ongoing and follows the Systems Development Research Methodology [13]. This methodology includes elements from the behavioural and engineering areas, guiding the research process through a set of sequential steps (see Figure 2). The iterative nature of this methodology and the fact that the system is intended for humans make it suitable for this research. This paper reports the first full iteration of this methodology, in which we constructed a conceptual framework (i.e., collected the system requirements, see Section 3.1), developed a system architecture (see Section 3.2), analysed and designed the system (i.e., designed the data model), built the system (see Section 3.3), and observed and evaluated it with potential users, i.e., teachers (see Section 4 onwards).
3.1 Design Requirements
The initial design requirements of the system were derived from the different levels of Molenaar’s human-AI automation model². Leaving aside the two extremes of the model, we can derive the following design requirements related to teacher autonomy in learning situations involving GenAI tools:
DR1. Monitor students’ use. Adapting from the second and third levels of the model, the teacher has control of the learning situation and the system provides supportive information about the students’ interactions with the GenAI tools, such as the prompts made by the students and/or the answers provided by the GenAI tool.
DR2. Alert teachers to take action. Adapting from the fourth level of the model, the teacher monitors the learning situation incidentally and the technology signals when teacher control is needed. Given this context, the system should alert teachers when attention is required (e.g., a topic is recurrently prompted to the GenAI tool by one or several students, or a student submission is similar to an answer provided by the GenAI system).
DR3. Automatic reactions through prior configuration. Adapting from the fifth level of the model, the technology controls most tasks automatically (following a prior teacher configuration aligned with the educational purposes of the learning situation). For instance, the teacher might configure how the GenAI system should reply to specific types of questions (e.g., provide guidance instead of giving the answer to a student question), adapt the students’ queries (e.g., appending ‘give an answer for a 15-year-old student’ to the prompts, i.e., prompt add-ons), or increase the percentage of hallucinations to force students to double-check the answers given by the GenAI tool (a minimal sketch of such a configuration is given at the end of this subsection).
Additionally, we deem it relevant for the system to be usable (DR4). To this end, we consider that the system should be integrated within learning management systems (LMSs) using a single sign-on process, thus avoiding registration in a new tool and making it accessible from the LMS regularly used by teachers. Additionally, as different teachers might use different LMSs (e.g., Moodle, Canvas) and GenAI tools (e.g., ChatGPT, Elicit), it would be desirable for the system to be multi-platform.
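To make DR3 more concrete, the following minimal sketch (in Python) shows one possible way of representing a teacher configuration and applying a prompt add-on before a student prompt is forwarded to a GenAI tool. The names used here (TeacherConfig, prompt_addon, reply_policy, hallucination_rate, apply_prompt_addon) are illustrative assumptions, not part of the implemented prototype.

```python
from dataclasses import dataclass

@dataclass
class TeacherConfig:
    """Hypothetical per-course configuration set by the teacher (DR3)."""
    prompt_addon: str = ""            # e.g., "Give an answer for a 15-year-old student."
    reply_policy: str = "guidance"    # "guidance" -> hints only; "direct" -> full answers
    hallucination_rate: float = 0.0   # fraction of replies intentionally perturbed

def apply_prompt_addon(student_prompt: str, config: TeacherConfig) -> str:
    """Append the teacher-defined add-on to the student's prompt (prompt add-on)."""
    if config.prompt_addon:
        return f"{student_prompt.rstrip()} {config.prompt_addon}"
    return student_prompt

# Example with fictional data:
config = TeacherConfig(prompt_addon="Give an answer for a 15-year-old student.")
print(apply_prompt_addon("Explain photosynthesis.", config))
```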
3.2 Architecture
The proposed architecture is depicted in Figure 3. As mentioned before, the system is expected to be embedded within different LMSs through single sign-on authentication, thus supporting its usability and adoption. To this end, the architecture integrates the Learning Tools Interoperability (LTI) specification³, allowing the provision of different interfaces (and functionalities) for teachers and students.
Within the teacher interface, teachers can configure which GenAI tool to use and how the “Middleware” box will behave in response to students’ prompts (e.g., prompt add-ons) and to GenAI replies (e.g., forced hallucinations). Additionally, the interface will display the students’ interactions with the GenAI tools (i.e., student analytics) and potential alerts that might arise from that information (e.g., students suspected of copy-pasting from the GenAI tool). To this end, the system will also retrieve the students’ submissions from the LMS and compare them with the answers provided by the GenAI tools.
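As an illustration of how such an alert could be raised, the sketch below compares an LMS submission against the GenAI answers stored in the database using a simple string-similarity ratio from Python’s standard library. The threshold, function names and detection logic are assumptions for illustration only; the prototype does not prescribe a particular similarity measure.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Rough similarity ratio between two texts (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def copy_paste_alerts(submission: str, genai_answers: list[str],
                      threshold: float = 0.8) -> list[int]:
    """Return the indices of stored GenAI answers suspiciously similar to a submission."""
    return [i for i, answer in enumerate(genai_answers)
            if similarity(submission, answer) >= threshold]

# Example with fictional data: flag a submission that mirrors a logged GenAI answer.
answers = ["Photosynthesis converts light energy into chemical energy in plants."]
print(copy_paste_alerts("Photosynthesis converts light into chemical energy in plants.", answers))
```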
Within the student interface, students can send prompts to the GenAI tool selected by the teacher through an interface that mimics that tool (e.g., ChatGPT) embedded inside the LMS course. These prompts will be received by the “Middleware” box, which will store them in the database (student analytics) and forward them to the GenAI tool after applying the configured add-ons. As the teacher might select different GenAI tools, the prompt will be forwarded using GenAI adapters so that the Application Programming Interface (API) of the selected tool is used adequately. The adapters will then receive the answer from the GenAI tool and forward it to the “Middleware” box, which will store the reply in the database (GenAI analytics), adapt it if previously configured by the teacher, and display it in the student interface as if it came directly from the original GenAI tool. Table 1 outlines how this architecture satisfies the aforementioned design requirements.
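To clarify the request flow through the “Middleware” box described above, the sketch below walks through the steps in order: log the prompt, apply the configured add-on, forward the prompt through a tool-specific adapter, adapt the reply if the teacher configured it, log the reply, and return it to the student interface. GenAIAdapter, AnalyticsStore and handle_student_prompt are hypothetical names used for illustration; in a real deployment the adapter would call the selected tool’s API.

```python
from typing import Protocol

class GenAIAdapter(Protocol):
    """Hypothetical adapter wrapping the API of one specific GenAI tool."""
    def send(self, prompt: str) -> str: ...

class AnalyticsStore:
    """Minimal in-memory stand-in for the analytics database."""
    def __init__(self) -> None:
        self.records: list[tuple[str, str]] = []
    def log(self, kind: str, text: str) -> None:
        self.records.append((kind, text))

def handle_student_prompt(prompt: str, adapter: GenAIAdapter, store: AnalyticsStore,
                          prompt_addon: str = "", guidance_only: bool = False) -> str:
    """Hypothetical middleware handler: log, augment, forward, adapt, log, return."""
    store.log("student_prompt", prompt)                      # student analytics
    augmented = f"{prompt.rstrip()} {prompt_addon}".strip()  # teacher-configured add-on
    reply = adapter.send(augmented)                          # forward via the tool adapter
    if guidance_only:                                        # adapt the reply if configured
        reply = "Hint instead of a direct answer: " + reply
    store.log("genai_reply", reply)                          # GenAI analytics
    return reply                                             # displayed in the student UI

# Example with a stub adapter standing in for a real GenAI API call.
class EchoAdapter:
    def send(self, prompt: str) -> str:
        return f"(stub answer to: {prompt})"

store = AnalyticsStore()
print(handle_student_prompt("What is photosynthesis?", EchoAdapter(), store,
                            prompt_addon="Give an answer for a 15-year-old student.",
                            guidance_only=True))
```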
3.3 System Prototype
A first version of the prototype has been developed using Balsamiq⁴, a low-fidelity wireframing tool for designing interactive digital mock-ups, which allows for rapid development of the interfaces for a first evaluation with target users. The prototype was fed with fictional data supporting the different design requirements described before (e.g., students who asked ChatGPT for more exercises, students who used it to copy and paste solutions). Figure 4 and Figure 5 show different interfaces of the system prototype that were used during its evaluation.
5 Results
5.1 Teacher Autonomy with the GenAI System
Results (see Figure 6) show that all participants agreed that configuring the GenAI tool according to their preferences keeps their autonomy high within learning situations involving GenAI tools (Scenario 2). Among the reasons provided by the participants, we can highlight that it allows them to “guide students in the thinking process” and that the system is “well-defined and controllable”, thus providing teachers with enough freedom to act on their preferences.
This was not the case for Scenario 1. Although most participants agreed that monitoring students’ GenAI use keeps their autonomy high (4 participants), 3 either neither agreed nor disagreed, or disagreed. The positive answers point out that the analytics “allow to tutor students” and “challenge students more”. On the other hand, the negative statements note that “[knowing students’] usage does not immediately say everything about autonomy” or that they “can also see in class how they are working”.
5.2 Usefulness of the GenAI System
Participants found the system useful in both scenarios (see Figure 6). Regarding monitoring students’ interactions with the GenAI tool (Scenario 1), two participants strongly agreed that the system is very useful, especially “to see how students arrive at an answer, because that is where the learning happens and so, you, as a teacher, can support them with that too”. Another participant pointed out the benefit of knowing more about “the closed ’cage’ where the students are” when using this kind of tool, for instance, “seeing what students ask, and whether they copy things blindly”. The two participants who did not provide positive answers argued that for them “controlling is not a priority” and that the important factor is the “work attitude and commitment of the student”.
Within the second scenario (configuring the students’ prompts and their answers), most participants (4 out of 5) agreed on the usefulness of this feature. Among the reasons, participants mentioned the usefulness of “giving the learner a constraint because otherwise they make up too much of it, often non-sense stuff”. Another participant pointed out an important factor to be considered when using the tool: “Yes (it is useful), but it has to be subtle. If the system locks up too much, there are plenty of other places where they can find a model without restrictions”.
5.3 System Usability and Adoption
The Net Promoter Score (NPS) is calculated as the percentage of system promoters (i.e., participants selecting 9 or 10 in the likelihood-to-recommend item) minus the percentage of system detractors (i.e., participants selecting 0 to 6). Positive scores indicate potential for user loyalty and adoption. The score obtained was 0 for the first scenario (2 promoters, 2 detractors) and -40 for the second scenario (0 promoters, 2 detractors). The features that participants liked least might explain these results: the additional time required to set everything up and monitor students (2), the visualization of the number of prompts per student (not informative, seems like a lot of fold-outs), and the lack of further options (e.g., generating exercises) for restricting the answer when students ask for the direct solution (1).
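For reference, these values follow the standard NPS formulation; the arithmetic below assumes five respondents per scenario, which is consistent with the counts reported in Section 5.2 but is our own reading rather than an explicitly stated figure:

\[ \mathrm{NPS} = 100 \times \frac{\#\,\text{promoters} - \#\,\text{detractors}}{\#\,\text{respondents}}, \]

so Scenario 1 yields 100 × (2 − 2)/5 = 0 and Scenario 2 yields 100 × (0 − 2)/5 = −40.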
Participants also identified the features that they liked most. While no single feature stood out, each participant pointed to different ones (related to the design requirements): the suspicious copy-paste alerts (1), the analytics overview (1), the individual student analytics, which allow providing personalized feedback (1), the configuration of forced hallucinations (1), the option of offering scaffolding instead of the direct answer (2), etc.
6 Discussion & Conclusions
Results showed the potential of the proposed system to increase teachers’ autonomy within the two provided learning scenarios, which involved the use of GenAI tools. Furthermore, most teachers highlighted its perceived usefulness for monitoring the students’ interactions and for fine-tuning the answers provided by these tools. Finally, the results also revealed some issues regarding its usability and adoption, leading to a new methodological cycle that will actively engage teachers as co-designers to help us collect design requirements and interface recommendations (i.e., a human-centred design approach [16]). In the next evaluation, we also aim to involve more teachers from different educational levels to understand whether there are any differences in their perceived autonomy. For that evaluation, we also plan to use construct-related validated questionnaires and other data-gathering techniques (e.g., focus groups) to triangulate teachers’ data with a broader sample.
One concern raised by some participants, and which we deem relevant to address, is how to better display analytics about GenAI interactions (i.e., GenAI analytics). While the learning analytics field is well developed for numerical indicators such as the number of submissions, posts, clicks or time spent in the course, further work is needed to understand how to visualize the students’ interactions with GenAI tools in a way that is meaningful and actionable for teachers. Consequently, we have added this concern to our roadmap and will explore it in the next evaluation with teachers.
Students are another important stakeholder group that should be involved in the development of the proposed system. As pointed out by one participant, engaging students with GenAI tools (e.g., ChatGPT) that monitor them could be challenging, as they might feel observed and prefer to use other alternatives available on the Internet. To prevent this behaviour, we could offer dedicated models performing better than general-purpose ones, or premium models (e.g., GPT-4o) that students would otherwise have to pay for themselves.
In this regard, we plan to develop our own large language model (LLM) to provide more accurate answers related to concrete curricula compared with general-purpose models. Another advantage of developing our own LLM is that all queried information is retained within the system itself. Data privacy plays a crucial role in the proposed system, as teachers will be aware of the students’ interactions with the GenAI tools. However, since this information might also include questions and prompts related not to the course topics but to students’ personal interests, further research should focus on how to adequately inform students about their privacy rights within these systems.