CONTRIBUTED ARTICLES

Exploring the Potential of Artificial Intelligence Program Generators in Computer Programming Education for Students
By Carrie Anne Philbin, King’s College London

acm Inroads 2023 September • Vol. 14 • No. 3

INTRODUCTION

The current discourse on Machine Learning (ML) and Artificial Intelligence (AI) in education revolves around two key themes. First, there is a need to equip young individuals with comprehensive knowledge and understanding of ML and AI, which have become ubiquitous in society and industry. This includes aspects such as social implications, ethical considerations, practical applications, models, and engines [35]. Second, there is a growing interest in exploring the impact of ML and AI tools on teaching and learning, both for educators and their students. This paper focuses on the latter theme, specifically examining the implications of recently developed ML and AI tools for computing education, with a particular emphasis on novice text-based programmers and their ability to learn conceptual knowledge in computer science (CS). The objective is to investigate the potential effects of these tools on students’ experiences of learning to program from a cognitive perspective.

Machine Learning refers to a broad range of computer systems that can learn new functions by generalizing from data [32]. Natural language engines, on the other hand, are machines trained on extensive bodies of text and programming code to generate new content and provide suggestions. The integration of natural language AI in programming has revolutionized the way programmers approach coding tasks. These AI agents, such as ‘Codex’ developed by OpenAI, can generate code from high-level task descriptions and even provide explanations for code snippets and programs. While Codex and its related applications, like the ‘ChatGPT’ chatbot, show promising potential, limited empirical research is available regarding their future role in education since they are relatively recent prototypes [21].

According to OpenAI, Codex is a versatile programming model applicable to various programming tasks, proficient in multiple languages including those commonly used for teaching beginners, like Python, JavaScript, and Ruby. Codex has also found utility in software applications, such as GitHub Copilot, assisting professional developers in code reviews. However, there is a growing need to understand the implications of employing these tools in computer programming education [36], including concerns of potential misuse by students to complete assessments. Recent research has explored topics such as developing educational content and assessments with AI program generators [9] and investigating the effectiveness of co-creation with AI in programming education for fostering creativity and design skills [13]. While some studies have examined Codex, others have focused on tools like Copilot. For the purposes of this review, the term “AI program generators” encompasses both Codex and Copilot.

This review has several aims. Firstly, it aims to evaluate the opportunities and risks associated with instructional approaches to learning computer programming, with a focus on the implications of using AI tools. Secondly, it aims to discuss the potential challenges in assessing student performance when AI tools are involved in the learning process. Lastly, it seeks to assess the
practical application of AI trained on public data in educational settings and examine the potential adverse effects that may arise. By addressing these objectives, the study aims to contribute to the existing literature and provide valuable insights into the implications of machine learning (ML) and artificial intelligence (AI) tools in the field of computer science education.

The current literature on AI tools in education is still in its early stages. Although a limited number of studies focus specifically on AI tools for computer programming, most of them have concentrated on older populations rather than secondary or high school students. Little empirical research is available on novice programming instruction using AI tools to draw on.

Significant efforts have been made to enhance programming education, prompted by research that revealed the limited access to computer science education in both the US and the UK, particularly among low-income students, students of color, and young women [16]. It became evident that despite technology’s pervasive influence in our daily lives, only those with significant wealth and privilege had access to computing education. The reintroduction of Computer Science into the national curriculum in England in 2014, the invention of the Raspberry Pi computer, the “CS for All” movement and the launch of Code.org’s Hour of Code activities are just a few examples of initiatives aiming to teach computer science and programming in ways that are culturally responsive, engaging for diverse students and communities, and more inclusive.

However, computer programming is recognized as one of the most challenging aspects of learning computer science [15,18], and there has been a growing interest in teaching programming to young learners over the past decade. Researchers and educators have been dedicated to better understanding the difficulties that students face in programming. Research in this area covers a broad range of topics, including contextual barriers, poor perceptions of computer science, bugs, and misconceptions in program construction. One area of agreement among scholars is the high mental effort required by learners when they begin to learn programming [12,18,31]. Scholars have explored a number of approaches that draw on individual as well as social learning theories. Cognitive load theory provides a valuable framework for examining the challenges faced by novice programmers and offers opportunities for improving instructional approaches. It has been selected as the underpinning theory for this study to provide a focus for theorizing the impact of AI program generators on learning to program. Cognitive load theory builds on the suggestion that human memory has two distinct areas, short-term working memory and long-term memory. Working memory is limited, and humans give most of their cognitive resources to this activity. Over time, disparate information elements connect with current understanding into collections of related knowledge called schemas. Stresses build on working memory, mainly when new learning is presented [2]. Sweller’s [14] research suggests that during a learning episode, there are two critical stresses or cognitive loads acting on the learner: (1) intrinsic load, which relates to the complexity of the learning task and the learner’s existing understanding, and (2) extraneous load, the additional stress placed on the learner by, for example, external conditions in the learning environment.

There are several ways an educator can impact students’ cognitive load whilst learning to program. First, they might consider how to present information related to a learning task to avoid additional load. Second, they might use worked examples in the activity design [2] to help scaffold the learning. Another beneficial technique to reduce cognitive load could be collaborative, such as pair programming, which may distribute the cognitive load among learners [31]. Using these perspectives for teaching computer programming and reducing cognitive load emerging from the literature, the following teaching and learning theories have been selected to explore the impact AI tools may have on these approaches: (1) Program comprehension and Schulte’s ‘Block Model’; (2) Vygotsky’s concepts of ‘the zone of proximal development’ (ZPD), and the ‘more knowledgeable other’ (MKO) [34]; and (3) Papert’s constructionism [24].

OPPORTUNITIES AND RISKS CONCERNING INSTRUCTIONAL APPROACHES

PROGRAM COMPREHENSION AND THE BLOCK MODEL

Learning to program is concept-rich, leading to cognitive overload, which requires novices to have a secure mental model of computation [12,31]. Program comprehension is essential in learning to program to overcome difficulties novices face connecting conceptual knowledge with programming practice [28]. The hypothesis presented asks if the use of AI generators in programming instruction aids learning through program comprehension.

There are several code comprehension pedagogical models. For example, Izu et al. [12] identified more than 60 activities to support novices in developing their program comprehension. They include the ‘block model’ developed by Schulte [27], describing how beginners understand a program through reading. The block model is helpful for understanding and categorizing aspects of program comprehension, and it describes the core sections of understanding a program. The model is expressed as a table (Figure 1) covering three dimensions across four levels, where columns represent the aspects of program comprehension and rows the hierarchical levels [28].

acm Inroads • inroads.acm.org 31



Figure 1: The Block Model Matrix (Izu, Cruz and Schulte, et al. 2019)
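
As a minimal sketch, the matrix in Figure 1 can be written down as a small data structure, which makes the "three dimensions across four levels" arithmetic explicit. The three dimensions are the ones named in this article. The four level names (atoms, blocks, relations, macro structure) are not spelled out in the text here; they are taken from Schulte's published Block Model and should be treated as an assumption of this sketch. The function names and the sample activities are hypothetical and only illustrate how an educator might check which zones a lesson plan touches.

```python
from itertools import product

# Dimensions as named in this article.
DIMENSIONS = ("text surface", "program execution", "function")
# Level names are an assumption: they follow Schulte's Block Model,
# not anything stated in the article text itself.
LEVELS = ("atoms", "blocks", "relations", "macro structure")


def block_model_zones():
    """Return every (level, dimension) cell of the matrix."""
    return list(product(LEVELS, DIMENSIONS))


def uncovered_zones(activities):
    """List the zones that no planned activity addresses.

    `activities` maps a (hypothetical) activity name to the zone it targets.
    """
    covered = set(activities.values())
    return [zone for zone in block_model_zones() if zone not in covered]


if __name__ == "__main__":
    # Hypothetical lesson plan, for illustration only.
    plan = {
        "label the variables in a snippet": ("atoms", "text surface"),
        "trace a loop by hand": ("blocks", "program execution"),
        "explain what the whole program is for": ("macro structure", "function"),
    }
    print(len(block_model_zones()))    # 12 zones: 4 levels x 3 dimensions
    print(len(uncovered_zones(plan)))  # 9 zones not yet addressed
```

A structure like this is only a planning aid; deciding which zone an activity really exercises remains a pedagogical judgement.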

The three dimensions of comprehension in the block model are text surface, program execution and function. A computer program is a piece of text made of letters, symbols, and numbers; this is called the text surface and is concerned with the grammar and syntax of a program. When a program is executed, it becomes dynamic and may behave in different ways depending on its inputs; this is known as program execution. Finally, the purpose of the program is defined by its function [20]. The Block Model provides twelve zones of program comprehension in all. Sentance et al. [31] assert that if educators can devise activities or ask questions about a program that develop computational knowledge in each of the twelve zones of the Block Model, it would better support students’ understanding of computer programs, reducing the cognitive load.

Research is required to understand and test whether AI program generators support or disrupt the block model as a method of instruction; however, their use in code comprehension could aid the three dimensions of the block model. For example, a grasp of the text surface dimension requires novices to discern the meaning of text with unfamiliar terms, structures, and syntax. A popular instructional strategy is identifying code aspects within the text, in some cases by creating a sorting activity, where learners need to construct the program using text snippets [23]. Copilot can produce programs quickly, providing novices with experience in different ways to solve a programming problem and, crucially, providing opportunities to identify pieces of code. The AI program generator can help novices make sense of the text by identifying examples of variables, conditions, and functions. Similarly, novices could investigate the effect of swapping two lines of code or introduce bugs to see how the AI program generator resolves them, developing mental models of program execution.

To explore the function dimension, educators could ask novices to explain or act out a program generated by the AI application. In addition, educators could instruct students to compare multiple programs developed by the AI application to solve a particular set of problems to find which are functionally equivalent. Finally, from experience using Copilot, it is clear that it has been trained in a way that results in some inconsistencies in the code it generates, providing further opportunities for learning as students have to evaluate and interrogate the code.

These instruction methods for learning to program that focus on reading code could lower cognitive load [31]. By acting out programs, students engage in a kinesthetic and embodied learning experience. They physically simulate the execution of code, step by step, which helps them develop a better understanding of how the program functions. This approach allows students to visualize the flow of instructions and identify potential errors or areas of improvement in their code. By physically enacting the program, students can experience a more intuitive grasp of programming concepts, leading to a reduction in cognitive load. By comparing AI-generated computer programs, students can examine and analyze the code generated by AI program generators, alongside their own code. This comparison allows them to gain insights into alternative approaches, different programming styles, and more efficient solutions. Through this process, students can identify patterns, best practices, and innovative techniques, which can enhance their own programming skills and reduce the cognitive effort required for problem-solving.

However, using an AI code generator for program comprehension may be superfluous to lowering the cognitive load of novices, as educators can achieve the same results through preparation and pedagogy without needing a computing device, for example through an unplugged activity [3]. Moreover, using
an AI program generator only works if the programs returned are simplistic enough for novices to engage with and understand. This seems unlikely, as the training data set used to train AI program generators currently is derived from thousands of GitHub repositories of expert developer-generated code. Scaffolded worked examples that educators have carefully curated reduce cognitive load. Gujberova and Kalas [11] recommended a sequence of carefully graded learning activities where learners read and interpret each line of code gradually. Off-the-shelf code solutions provided by experts, on the other hand, which have not been scaffolded for novices, may be more likely to exacerbate the cognitive load on novice programmers by introducing misconceptions [31]. Similarly, in the early stages of re-introducing computer programming to the curriculum, in my experience, students would often use the website ‘StackOverflow.com’ to find code. The website acts as a database of questions and answers for expert programmers. Asking for a generic and simple solution often results in several different responses, with different approaches and verbose descriptions. In turn, this adds to the cognitive load. Chatterjee et al. [5] found, in their study of novice software engineers’ use of Stack Overflow to find needed information, that only 27% of the code and 16–21% of the natural language text could be useful in helping them to read and apply the information to their programming problem. Carefully scaffolded instruction is therefore required to reduce the cognitive load of creating program solutions through both reading and applying code, something that requires a skilled educator.

AI AS A MORE KNOWLEDGEABLE OTHER

This section explores how students learn and apply computer science knowledge using AI program generators and ChatGPT. It also considers how practice might be informed by the work of Soviet psychologist and social constructivist Lev Vygotsky through the concepts of the zone of proximal development (ZPD) and the more knowledgeable other (MKO) [34].

The zone of proximal development (ZPD) is defined as: “the distance between the actual developmental level as determined by independent problem solving and the level of potential development as determined through problem-solving under adult guidance or in collaboration with more capable peers.” [34] When a student is in the zone of proximal development for a particular task, assistance is provided through social interaction and dialogue between students and educators. In applying this concept, educators are encouraged to ensure that a more knowledgeable other (MKO) is present whose skills or knowledge are beyond the student’s, and then to provide opportunities for interaction with the MKO to allow the learner to observe and practice skills [19]. For example, teachers support students by explaining concepts through responses to questions and modelling what they require students to do. Here teachers were explicitly using their own expertise as a “more knowledgeable other” (MKO) [34]. By continuously providing students with tasks in their Zone of Proximal Development (ZPD), their ability to independently perform a wider range of tasks can be consistently expanded. This approach ensures that students are consistently challenged at a cognitive level, enabling them to enhance their skills and confidently undertake tasks without the need for constant supervision [38]. An AI program generator could be an MKO for students learning computer science concepts and skills through social interaction, which therefore may lower the cognitive load as described by Zulu et al. [38]. In theory, this seems plausible, with tools like Copilot in their current incarnation. For example, the novice can describe to the AI application what they want to achieve through a kind of conversation, with the AI application outputting suggestions. It could be a dynamic similar to pair programming, where observation and practice can make complex concepts of learning to program concrete.

However, the AI program generator is not another human. It is trained on millions of lines of code shared by experts. If used for programming instruction in secondary schools, it may be more akin to a ‘super knowledgeable other,’ generating expert-level code and providing the solution without requiring any cognitive load. Avoiding the cognitive load in this way may not engender any learning. Furthermore, the AI will inherently lack an understanding of cultural practice and emotional engagement that is part of learning, and part of the sociocultural context of learning that Vygotsky emphasizes [34]. Additionally, AI program generators always try to directly answer the question posed, while a human who is attempting to support learning might use questioning or provide partial solutions in order to guide the student. For example, when a human is learning with another human, they are able to ask questions of one another and joke with each other in ways that are culturally contextualized and that, in turn, can support deeper learning or help them connect in new ways with the material. This may not be achievable when interacting with AI. Finally, a possible mindset that the AI application is always correct, combined with a lack of understanding of how it works, may lead to adverse learning outcomes and experiences [13] and disrupt the confidence of secondary school students. Research analysis by Webb et al. [35] suggests that designers of AI applications built on machine learning should include the capability for the system to explain its decisions to users. From an ethical and accountability point of view, this would be wise, but moreover, it could help dispel the myth that AI applications are like humans [17].

Since the launch of ChatGPT in November 2022, there are anecdotal reports that, due to its question-and-answer nature, it is a more useful tool than Codex or Copilot as an MKO in generating programs. A user can ask ChatGPT why it chose to present a particular solution, and it will return a natural language answer. There are examples of ChatGPT acting as a tool to help a user debug their code (Figure 2). It checks the code, explains what the bug is, and provides a solution in order to fix it. If the explanation appears to be too complex for a user, ChatGPT can be instructed to explain concepts to a particular age group, and it will return an age-appropriate response. These are examples of modelling and dialogue between a novice and an MKO which could result in better attainment of knowledge about computer programming and reduce the cognitive load on learners.

Figure 2: ChatGPT identifying a bug in code along with a natural language description (https://chat.openai.com)

CONSTRUCTIONISM

A well-established approach to learning computer programming is through student-centered discovery and project-based learning, which involves students creating and implementing their ideas by integrating various knowledge [24]. In a study on coding activities based on constructionism, Papavlasopoulou et al. [22] discovered that students who exhibited high levels of engagement and motivation while constructing their artefacts demonstrated gaze behavior that indicated reduced cognitive overload. The coding process requires significant effort and presents challenges, necessitating intrinsic motivation and perseverance from children as they engage in learning. Hence, the design of programming activities plays a critical role in novices’ cognitive load. If an activity is too complex, students may experience cognitive overload, which can negatively impact their attitudes towards learning to program and lead to difficulties [22]. Papert [24] describes a qualitative problem-solving approach where individuals continuously experiment and refine their artefacts until they are completed, which, if implemented and supported by educators, can reduce cognitive load. Implementing this approach can broaden skills such as collaboration, design, and prototyping, as well as being an effective way to create an engaging learning environment [30]. Resnick [26] argues that while solving puzzles and problems may help develop the cognitive processes to learn to program, creating projects takes learners further, developing their voice and identity and helping students develop as creative thinkers. Using an AI code generator and a constructionist approach, students can theoretically be more experimental in their learning than when starting from a blank slate, argues Resnick [26], echoing Papert’s vision that young learners should be seen as ‘active constructors’ rather than ‘passive recipients’ of knowledge. Using worked examples or chunks of code to construct programs to achieve their desired outcome can empower novices, keep them engaged in learning and reduce the cognitive load by removing difficulties concerning the text surface, like using the correct syntax.

A consistent risk for learning described by studies that use constructionist approaches is student resilience. As described earlier, programming requires high mental effort, even if the experience is positive and engaging [31]. Interviews with students participating in physical computing activities show that they found the tasks hard, describing them as “a steep learning curve” [30, p. 6]. Could the use of AI program generators promote creativity and construct knowledge by taking care of the blockers of programming, keeping students engaged, which is a precursor to reducing cognitive load [22]? Jonsson and Tholander [13] found that some participants in their study approached the AI application as a tool to speed up the process of programming, whilst others treated it in a more open-ended fashion. The research showed that the generative and open-ended approach created a collaborative and conversational process between the participant and the AI application. Codex, in effect, worked as a co-participant in the creative design process [13]. This evidence would suggest that using an AI program generator makes learning to program more enjoyable and offers a way to avoid being blocked by the text surface described in the block model. It also relates to Vygotsky’s social learning theory, with the AI acting as an MKO. This approach also better reflects how expert programmers work. The creators of Codex, OpenAI [21], suggest that professionals who write computer code often break problems down into smaller problems and then find existing code libraries or APIs to use to solve those problems. It is likely that the cognitive load of writing computer code, even at the expert developer level, is high. OpenAI believes this activity is a high barrier to entry into the profession, as it is time-consuming and not a particularly creative endeavor. Writing code for novices in secondary school can also be time-consuming. However, project-based learning coding activities based on constructionism
may offer students more creative agency and better learning outcomes. Resnick [26] suggests that as students work on projects, they gain an understanding and experience of the creative process, whilst also connecting with their own interests.

Working independently on programming has been suggested to have a higher cognitive load than working collaboratively through pair programming [31]. Thinking of the AI program generator or ChatGPT as a collaborator in conversation in a tangible creative project could support novices in their learning. Creativity can often be limited by an individual’s access to knowledge or materials. However, by working alongside AI applications, creativity can be promoted by removing the limitations and through the joint actions of participants and the system [13].

Using AI program generators through a project-based and more open-ended approach may also create new challenges that impact a novice’s learning to program. Allowing the AI to do the heavy lifting in constructing a program requires the human collaborator, our novice programmer, to instruct the system to achieve the desired results. The study presented by Jonsson and Tholander [13] found that participants had to put significant effort into formulating their query for Codex to generate valid code. Therefore, younger learners and novices may not yet have developed a particular “economy of language” [13] to speak to the AI program generator. Lacking this “economy of language” may negatively impact the learning experience and be a barrier to developing the concepts and skills of programming a system. Again, ChatGPT has the potential to overcome this challenge as it can be instructed to provide age-appropriate responses. Similarly to the web searching and research skills that have been added to curriculums over the past several decades, techniques to get the most from AI program generators could be taught in future courses.

OPPORTUNITIES AND RISKS REGARDING ASSESSMENT

AI program generators such as Codex and Copilot may aid computer science education. However, many educators will likely have more severe and immediate concerns with these tools in the hands of their students [9] relating to plagiarism and cheating in summative assessments. A common way to assess computer programming concepts and skills is through lab-based tasks and unit tests where software “judges” programs submitted by users, by compiling, executing, and evaluating the code. It is also common practice for software engineering job interviews. The following sections look at the implications for student integrity and educator approaches to assessment by first summarizing the research relating to plagiarism with AI, the moral and ethical implications that may arise from its use, and the possible over-reliance on a tool that may have been trained with insufficient data.

PLAGIARISM AND CHEATING

Academics cite plagiarism as having a higher occurrence in computer science than in many other disciplines, with an estimated 80% of computing students involved [9]. AI program generators such as Codex and, more recently, Copilot are freely available for students and educators [10] and, hypothetically, are already being used by CS students at the undergraduate level. Finnie-Ansley et al. [9] assessed the accuracy of the Codex AI engine when applied to introductory programming tests used at the authors’ institution, the University of Auckland, New Zealand, to assess undergraduates in the early stages of their CS course. The “results show that Codex performs better than most students on code-writing questions in typical first-year programming exams”, and “solutions generated by Codex appear to include quite a lot of variation, which is likely to make it difficult for instructors to detect” [9, p.16]. Therefore, to mitigate this risk, educators need training in basic AI and ML literacies and to rethink how these kinds of items can be used in assessment, before deciding on how they teach and assess learning using AI and ML tools.

A description of basic AI and ML literacies that could be incorporated into teacher professional development is suggested by Webb et al. [35]. They include understanding the machine learning processes that support learning, knowing how to act responsibly within society, and critically assessing the ethical implications raised by AI systems trained on data. Yang [36] argues that ideas concerning the ethics of AI can be developed by participating in learning activities as part of an AI and ML literacy curriculum. However, Tedre et al. [32] claim that participation is insufficient and that educators must rethink conceptual knowledge through a new ML and AI paradigm.

ACCOUNTABILITY FOR LEARNING ACHIEVEMENTS

AI applications trained on open data may pose accountability challenges for learning achievements [35]. For example, it may need to be clarified in advance who is driving the decision-making in formulating a computer program and taking responsibility for those decisions. Webb et al. [35] highlight the differences between artificial intelligence decision-making and human decision-making, particularly from a moral and ethical point of view. Measuring student achievement of CS concepts through programming exercises and deciphering accountability for aspects of the program may be problematic and complex for educators. Human decision-making involves a complex set of judgements and feelings relating to the context or scenario in
front of them. AI decision-making, based on ML, does not have the same flexibility or empathy.

Intellectual property rights pose another consideration for use in secondary school contexts. For example, tools trained on open data may violate the legal rights of creators. Codex is trained on code from a large open-source repository [10]. The code creator or contributor has not necessarily provided permission under the terms of an open license for their code to be used to train an AI engine or be used to create new programs. A class-action lawsuit has been filed against GitHub, Microsoft, and OpenAI in US federal court on behalf of millions of GitHub users on 17 October 2022. The filed suit claims to be the first class-action
case in the US challenging the training and output of AI systems cause harm using a tool, they may inadvertently do just that. On
[4], which is likely to have far-reaching implications for the de- the other hand, knowing outputted code is likely to be inaccu-
velopment and use of AI technologies. It may take some time rate could aid learning by emphasizing the importance of read-
for a judgement to come. In the meantime, if secondary school ing code and applying a code comprehension approach which
students and educators are using AI program generators to learn investigates code snippets [31].
to program, then consideration should be given to who owns the Being a responsible citizen means having integrity. A social
code, if its use is permissible under open-source licensing, and construct that implies that cheating on tests is a negative attri-
who is accountable for the source of the code output. bute. However, society may rethink what cheating is, with the
Consider accountability in terms of assessment, where stu- introduction of AI applications and code generators. Further-
dents may use artificial intelligence applications collaboratively more, once aware of the risks AI applications pose, educators
to create an output. For example, how can an educator set mark- could think more about assessment methods and place more
ing criteria, such as assessment rubrics for collaborative projects value on program comprehension and students’ explanations of
with an AI application? However, the same could be asked of two the code to demonstrate their computer science competency.
students collaborating. Resnick [26] argues that education as- However, more than basic AI literacy training may be required
sesses students’ progress based on evidence, typically expressed if educators and students already have constructed a mental
as numbers and statistics. Instead, education should also value model that an AI application is always correct [13].
documenting what students learn by encouraging reflective prac-
tice illustrating what they have created, what others have input,
how they created it and why. This reflective and evidence-based RISKS CONCERNING SAFEGUARDING THE
approach may support learning and mitigate student achieve- USE OF AI PROGRAM GENERATORS
ment accountability if AI program generators are used in cre- Selwyn [29] argues that using AI products can result in social harm
ating computer programs, however recent AI applications such in educational contexts, especially for minority groups, because
as ChatGPT are able to provide text that students could use to AI models are trained on already discriminatory data, further
provide evaluation and reflection for the purposes of assessment. amplifying social harm. Facial recognition systems, for example,
This is akin to Papert’s [24] qualitative problem-solving approach regularly fail to recognize students of color [29]. This idea is also
where individuals continuously experiment and refine their arte- shared by Chen et al.’s [6] evaluation of Codex, which found
facts until they are completed, which reduces the cognitive load that code is generated with a structure that reflects stereotypes
of a learner and increases their motivation. However, this short- about gender, race, class, and other protected characteristics.
cut to complete a task that is designed to take time, and build on Therefore, its use in secondary CS education could exacerbate
the experiences of the learning, may be detrimental to the learn- inequality experienced by marginalized groups and individuals.
ing experience described by Resnick [26]. More concerning was Chen et al.’s. [6] discovery that Codex
“can be prompted in ways that generate racist, denigratory,
ACCURACY OF OUTPUTTED CODE and otherwise harmful outputs as code comments, meriting
Another consideration for novice programmers using AI pro- interventions.” In educational contexts, it is the responsibility
gram generators is a possible over-reliance on the output code of schools to safeguard young and vulnerable individuals;
[6]. Codex, for example, is trained on crowd-sourced, publicly therefore, a tool that can generate harmful comments could not
available code generated by humans and posted to the website and should not be used. Additionally, a student experiencing
Github. Therefore, solutions developed by the AI application discrimination would feel discouraged to learn, which directly
may use poor style, tackle programming problems in a sub-op- impacts on their cognitive load, underpinned by engagement
timum way for building mental models of computation, and and intrinsic motivation [22].
introduce misconceptions that prevail. Finne-Ansley et al. [9] Nevertheless, AI applications are still learning, and products
compared Codex in the hands of students to a power tool in the like AI code generators are still evolving. For example, Copi-
hands of a novice. Whilst novices and students may not wish to lot’s creator Github has included filters to block language which

could offend. They also provide a mechanism to report offensive outputs directly to them, as they state they are committed to addressing the challenge. Whether these interventions safeguard the young and marginalized groups needs investigating.

Initial evaluations of AI products trained with code have additionally found that the programs produced are only sometimes reliable or accurate. Furthermore, they can even suggest insecure code [6]. Therefore, their use in education systems could present challenges for infrastructure if a program is executed that causes damage to computer systems and networks, although this risk is also present if relying on code found on the internet or code written directly by learners. Chen et al. [6] suggest that “human oversight and vigilance is required for safe use of code generation systems.” This assumption requires educators to have significant knowledge and experience, which may not be the case for schoolteachers of an emerging subject like CS. GitHub confirms that Copilot “may contain insecure coding patterns, bugs or references to outdated APIs or idioms.” [10] However, GitHub suggests that combining a professional developer’s knowledge and judgement with established testing and code review practices alongside security tools should mitigate this risk. Novice programmers and their instructors, though, may lack the understanding, training, or experience to know that code is unsafe. Again, the cognitive load of novice programmers may increase if code output by an AI is continually incorrect, lowering a student’s engagement and motivation.

Although research shows that output code from AI code generators is often inaccurate and unreliable, and can cause social harm and safety concerns [6], it does not seem to deter professional programmers and software engineers from using them. One user is quoted on the landing webpage for GitHub’s Copilot, saying it “works shockingly well. I will never develop software without it again.” [10] Studying the impact on professionals would be an interesting comparison for research. How are mental models of computation affected, if at all? How do experts converse with AI applications to achieve a desired result? What could be learned from a professional’s approach that could support learners?

Mitigating the risks that AI presents to student wellbeing has begun with engaging young people in discussions relating to AI. Incorporating children’s perspectives and values is crucial for the advancement and implementation of AI technology. Their insights can inform ethical practices and ensure that AI development aligns with their needs and values [1,35]. UNICEF [33] has created a set of nine principles that prioritize children’s perspectives and have established the basis for child-centered artificial intelligence (AI). This approach places children’s voices at the forefront and emphasizes their active participation in all stages of the AI lifecycle. Child-centered AI is not solely focused on minimizing harm but also aims to foster innovative and beneficial approaches in the design, development, and implementation of AI technologies [8].

Firstly, the cognitive load of students learning to program has the potential to be reduced by utilizing AI programming tools alongside well-established teaching practices such as code reading and comprehension, social learning, and constructionism. While this study focuses its analysis specifically on the cognitive load of novice programmers in the act of learning, it is hard not to also ask what exactly students learn when using AI program generators. Are they simply learning how to navigate an AI tool, or are they learning the critical thinking skills of problem solving and key aspects of algorithmic thinking as well? While cognitive load may be decreased, what are students ultimately gaining intellectually and academically when using AI program generators? Additionally, does the use of AI tools diminish the value of computing as a subject for further study by students? These themes would be important to explore in greater depth in future research to better understand the impact on novice programmers’ ability to learn conceptual knowledge of CS as well as their level of engagement.

It is evident that AI program generators were not originally developed with educational contexts in mind. They primarily aimed to support software developers in tackling complex tasks with the assistance of an intelligent collaborator. However, given the launch of tools like ChatGPT and the recent introduction of Ghostwriter AI by Repl.it [25], the impact on education is becoming increasingly apparent. Secondary school educators face significant challenges in terms of instructional practices, assessment methods, and student safeguarding across various subjects. Developers of educational AI code generators should create dedicated versions trained on data curated by educators themselves. This would provide educators with more confidence in the quality of the generated code, ensuring it aligns with the development of mental models of computing. Additionally, the involvement of policymakers in formulating a ‘Code of Conduct’ for users and developers is recommended to establish accountability in the use of these tools in educational settings. Furthermore, ethical considerations should be prioritized by developers from the outset.

The discourse on AI program generators should swiftly transition into policy and practice. Educators and policymakers must raise awareness of the potential impact and establish supportive practices to guide teaching and assessment.

Whilst the focus of this study has not been on developing understanding of AI within computing education contexts, the

risks highlighted in this paper demonstrate that issues of ethics and social responsibility are increasingly important. Basic AI literacy should become an essential component of educators’ professional development, allowing them to stay informed about new developments, ensure student safety, understand accountability, and enhance their pedagogical practices. Curriculum content should also be expanded to teach basic AI literacy in schools to better prepare young people for work.

Acknowledgements
The author would like to acknowledge the Centre for Research in Education in Science, Technology, Engineering & Mathematics (CRESTEM) at King’s College London, which supported the author in preparing and submitting this article.

References
1. Aitken, M. and Briggs, M. In AI, data science, and young people. Understanding computing education (Vol 3). Proceedings of the Raspberry Pi Foundation Research Seminars, (2022). http://rpf.io/seminar-proceedings-vol-3-aitken-briggs; accessed 2023 Jan 10.
2. Bannert, M. (2002). Managing cognitive load—recent trends in cognitive load theory, Learning and Instruction, 12, 1 (2002), 139–146; https://doi.org/10.1016/S0959-4752(01)00021-4.
3. Bell, T., Alexander, J., Freeman, I., and Grimley, M. Computer science unplugged: School students doing real computing without computers. The New Zealand Journal of Applied Computing and Information Technology, 13, 1 (2009), 20–29.
4. Butterick, M. and Joseph Saveri Law Firm (2022). ‘GitHub Copilot litigation’ https://githubcopilotlitigation.com; accessed 2022 Nov 30.
5. Chatterjee, P., Kong, M. and Pollock, L. Finding help with programming errors: An exploratory study of novice software engineers’ focus in stack overflow posts, Journal of Systems and Software, 159, 110454 (2020); https://doi.org/10.1016/j.jss.2019.110454.
6. Chen, M. et al. Evaluating Large Language Models Trained on Code, arXiv, 2021. http://arxiv.org/abs/2107.03374; accessed 2022 Oct 25.
7. Cooper, G. Examining science education in chatgpt: An exploratory study of generative artificial intelligence. Journal of Science Education and Technology, 32, 3 (2023), 444–452; https://doi.org/10.1007/s10956-023-10039-y.
8. Data Protection Working Party (2009, February 11). Opinion 2/2009 on the protection of children’s personal data (General Guidelines and the special case of schools). https://ec.europa.eu/justice/article-29/documentation/opinion-recommendation/files/2009/wp160_en.pdf; accessed 2023 Jan 10.
9. Finnie-Ansley, J., Denny, P., Becker, B.A., Luxton-Reilly, A. and Prather, J. (2022). The Robots Are Coming: Exploring the Implications of OpenAI Codex on Introductory Programming, Australasian Computing Education Conference. ACE ’22: Australasian Computing Education Conference, Virtual Event Australia: ACM, 10–19; https://doi.org/10.1145/3511861.3511863.
10. GitHub Copilot. (2022, August). Your AI pair programmer, https://github.com/features/copilot; accessed 21 Nov 2022.
11. Gujberova, M., and Kalas, I. Designing productive gradations of tasks in primary programming education, in Proceedings of the 8th Workshop in Primary and Secondary Computing Education, WiPSCE ’13, (New York, NY, USA, ACM, 2013), 108–117; https://doi.org/10.1145/2532748.2532750.
12. Izu, Cruz, and Schulte, Carsten and Aggarwal, Ashish and Cutts, Quintin and Duran, Rodrigo and Gutica, Mirela and Heinemann, Birte and Kraemer, Eileen and Lonati, Violetta and Mirolo, Claudio and Weeda, Renske. Fostering Program Comprehension in Novice Programmers - Learning Activities and Learning Trajectories, Proceedings of the Working Group Reports on Innovation and Technology in Computer Science Education. ITiCSE ’19: Innovation and Technology in Computer Science Education, (Aberdeen Scotland UK: ACM, 2019), 27–52; https://doi.org/10.1145/3344429.3372501. Scientific Figure on ResearchGate; https://www.researchgate.net/figure/The-Block-Model-Matrix_fig1_339040166; accessed 2023 Jan 13.
13. Jonsson, M. and Tholander, J. Cracking the code: Co-coding with AI in creative programming education, in Creativity and Cognition. C&C ’22: Creativity and Cognition, (Venice Italy, ACM, 2022), 5–14; https://doi.org/10.1145/3527927.3532801.
14. Kirschner, P.A., Sweller, J. and Clark, R.E. (2006). Why Minimal Guidance During Instruction Does Not Work: An Analysis of the Failure of Constructivist, Discovery, Problem-Based, Experiential, and Inquiry-Based Teaching, Educational Psychologist, 41 (2), pp. 75–86; https://doi.org/10.1207/s15326985ep4102.
15. Kuljis, J., and Baldwin, L. P. (2000). Visualisation techniques for learning and teaching programming. Journal of Computing and Information Technology, 8, 4 (2000), 285–291; https://doi.org/10.2498/cit.2000.04.03.
16. Margolis, J. Stuck in the Shallow End, updated edition: Education, Race, and Computing. MIT Press (2017).
17. Marx, E., Leonhardt, T. and Bergner, N. Brief Summary of Existing Research on Students’ Conceptions of AI, Proceedings of the 17th Workshop in Primary and Secondary Computing Education. WiPSCE ’22: The 17th Workshop in Primary and Secondary Computing Education, (Morschach Switzerland: ACM, 2022), 1–2; https://doi.org/10.1145/3556787.3556872.
18. Mayer, R.E. The psychology of how novices learn computer programming. ACM Computing Surveys (CSUR), 13, 1 (1981), 121–141; https://doi.org/10.1145/356835.356841.
19. McLeod, S.A. What Is the zone of proximal development? Simply Psychology (2019). www.simplypsychology.org/Zone-of-Proximal-Development.html; accessed 2022 Nov 4.
20. National Centre for Computing Education (NCCE) (2020). Pedagogy Quick Reads: Understanding program comprehension using the Block Model. National Centre for Computing Education & Raspberry Pi Foundation, https://ncce.io/qr12; accessed 2022 Nov 4.
21. OpenAI ChatGPT: Optimizing Language Models for Dialogue, OpenAI.com (2022, 30 November) https://openai.com/blog/chatgpt/; accessed 2023 Jan 2.
22. Papavlasopoulou, S., Giannakos, M.N., and Jaccheri, L. Exploring children’s learning experience in constructionism-based coding activities through design-based research. Computers in Human Behavior, 99, (2019), 415–427; https://doi.org/10.1016/j.chb.2019.01.008.
23. Parsons, D., and Haden, P. Parson’s programming puzzles: a fun and effective learning tool for first programming courses, Proceedings of the 8th Australasian Conference on Computing Education-Volume 52 (2006), 157–163.
24. Papert, S. Mindstorms: Children, Computers, and Powerful Ideas (2nd Ed. 1993). Basic Books, A Member of The Perseus Books Group.
25. Repl.it Ghostwriter AI FAQ. Repl.it.com (2022, 7 November) https://docs.replit.com/ghostwriter/faq; accessed 2022 Nov 9.
26. Resnick, M. Lifelong Kindergarten: Cultivating Creativity through Projects, Passion, Peers and Play. MIT Press, 2017.
27. Schulte, C. Block Model: an educational model of program comprehension as a tool for a scholarly approach to teaching, Proceedings of the 4th International Workshop on Computing Education Research - ICER ’08. Proceeding of the 4th International Workshop, (Sydney, Australia: ACM Press, 2008), 149–160. https://doi.org/10.1145/1404520.1404535.
28. Schulte, C., Clear, T., Taherkhani, A., Busjahn, T. and Paterson, J.H. (2010). An introduction to program comprehension for computer science educators, Proceedings of the 2010 ITiCSE working group reports on Working group reports - ITiCSE-WGR ’10. The 2010 ITiCSE working group reports (Ankara, Turkey, ACM Press), 65; https://doi.org/10.1145/1971681.1971687.
29. Selwyn, N. (2022). The future of AI and education: Some cautionary notes. European Journal of Education, 2022, pp. 1–12; https://doi.org/10.1111/ejed.12532.
30. Sentance, S. and Schwiderski-Grosche, S. Challenge and creativity: using .NET gadgeteer in schools, Proceedings of the 7th Workshop in Primary and Secondary Computing Education on - WiPSCE ’12. The 7th Workshop in Primary and Secondary Computing Education, (Hamburg, Germany: ACM Press, 2012), 90; https://doi.org/10.1145/2481449.2481473.
31. Sentance, S., Waite, J. and Kallia, M. Teaching computer programming with PRIMM: a sociocultural perspective, Computer Science Education, 29, (2019), 2–3, 136–176; https://doi.org/10.1080/08993408.2019.1608781.
32. Tedre, M., Denning, P. and Toivonen, T. CT 2.0, 21st Koli Calling International Conference on Computing Education Research. Koli Calling ’21: 21st Koli Calling International Conference on Computing Education Research, (Joensuu Finland: ACM, 2021), 1–8; https://doi.org/10.1145/3488042.3488053.
33. UNICEF (2020, September). Policy guidance on AI for children. UNICEF and the Ministry of Foreign Affairs of Finland. https://www.unicef.org/globalinsight/media/1171/file/UNICEF-Global-Insight-policyguidance-AI-children-draft-1.0-2020.pdf; accessed 2023 Jan 10.
34. Vygotsky, L. S., and Cole, M. Mind in Society: Development of higher psychological processes. Harvard University Press, (1978).
35. Webb, M.E., Fluck, A., Magenheim, J. et al. (2021). Machine learning for human learners: opportunities, issues, tensions and threats, Educational Technology Research and Development, 69 (4), 2109–2130; https://doi.org/10.1007/s11423-020-09858-2.
36. Yang, W. Artificial Intelligence education for young children: Why, what, and how in curriculum design and implementation, Computers and Education: Artificial Intelligence, 3 (2022), 100061; https://doi.org/10.1016/j.caeai.2022.100061.
37. Zaremba, Brockman, and Open AI (2021, August 10th). ‘Open AI Codex’, Open AI.com; https://openai.com/blog/openai-codex/; accessed 2021 Oct 13.
38. Zulu, E., Haupt, T., and Tramontin, V. Cognitive loading due to self-directed learning, complex questions and tasks in the zone of proximal development of students. Problems of Education in the 21st Century, 76, 6 (2018), 864; https://doi.org/10.33225/pec/18.76.864.

Carrie Anne Philbin
School of Education, Communication & Policy
King’s College London
Waterloo Road
London, SE1 9NH
carrie.anne.philbin@kcl.ac.uk

DOI: 10.1145/3610406 Copyright held by authors. Publication rights licensed to ACM.

38 acm Inroads 2023 September • Vol. 14 • No. 3
