DOI: 10.1145/3631802.3631806
Research article, Koli Calling conference proceedings
How Novices Use LLM-based Code Generators to Solve CS1 Coding Tasks in a Self-Paced Learning Environment

Published: 06 February 2024

Abstract

As Large Language Models (LLMs) gain popularity, it is important to understand how novice programmers use them and how they affect learning to code. We present the results of a thematic analysis of data from 33 learners, aged 10–17, as they independently learned Python by working on 45 code-authoring tasks with access to an AI code generator based on OpenAI Codex. We explore several questions about how learners used the LLM-based code generator and analyze the properties of their written prompts and of the resulting AI-generated code. Specifically, we examine (A) the context in which learners use Codex, (B) what learners ask of Codex in terms of syntax and logic, (C) properties of learner-written prompts in terms of their relation to the task description, language, clarity, and prompt-crafting patterns, (D) properties of the AI-generated code in terms of correctness, complexity, and accuracy, and (E) how learners utilize AI-generated code in terms of placement, verification, and manual modification. Our analysis further reveals four distinct coding approaches when writing code with an AI code generator: AI Single Prompt, where learners prompted Codex once to generate an entire solution to a task; AI Step-by-Step, where learners divided the problem into parts and used Codex to generate each part; Hybrid, where learners wrote some of the code themselves and used Codex to generate the rest; and Manual coding, where learners wrote all of the code themselves. Our findings show consistently positive associations between learners' use of the Hybrid approach and their post-test evaluation scores, and consistently negative associations between use of the AI Single Prompt approach and those scores. Finally, we offer insights into novice learners' use of AI code generators in a self-paced learning environment, highlighting signs of over-reliance and self-regulation, and identifying opportunities for enhancing AI-assisted learning tools.


Cited By

  • (2024) Demystifying Machine Learning: Applications in African Environmental Science and Engineering. European Journal of Theoretical and Applied Sciences 2(3), 688–705. DOI: 10.59324/ejtas.2024.2(3).53. Online publication date: 1-May-2024
  • (2024) Cognitive Apprenticeship and Artificial Intelligence Coding Assistants. In Navigating Computer Science Education in the 21st Century, 261–281. DOI: 10.4018/979-8-3693-1066-3.ch013. Online publication date: 26-Feb-2024
  • (2024) Guidelines for Effective Use of ChatGPT in Introductory Programming Education. 2024 IST-Africa Conference (IST-Africa), 1–8. DOI: 10.23919/IST-Africa63983.2024.10569684. Online publication date: 20-May-2024
  • (2024) Chain of Targeted Verification Questions to Improve the Reliability of Code Generated by LLMs. Proceedings of the 1st ACM International Conference on AI-Powered Software, 122–130. DOI: 10.1145/3664646.3664772. Online publication date: 10-Jul-2024
  • (2024) Large Language Models Can Connect the Dots: Exploring Model Optimization Bugs with Domain Knowledge-Aware Prompts. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1579–1591. DOI: 10.1145/3650212.3680383. Online publication date: 11-Sep-2024
  • (2024) Insights from Social Shaping Theory: The Appropriation of Large Language Models in an Undergraduate Programming Course. Proceedings of the 2024 ACM Conference on International Computing Education Research - Volume 1, 114–130. DOI: 10.1145/3632620.3671098. Online publication date: 12-Aug-2024
  • (2024) CodeAid: Evaluating a Classroom Deployment of an LLM-based Programming Assistant that Balances Student and Educator Needs. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–20. DOI: 10.1145/3613904.3642773. Online publication date: 11-May-2024
  • (2024) ChatGPT in Data Visualization Education: A Student Perspective. 2024 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), 109–120. DOI: 10.1109/VL/HCC60511.2024.00022. Online publication date: 2-Sep-2024
  • (2024) Developer Behaviors in Validating and Repairing LLM-Generated Code Using IDE and Eye Tracking. 2024 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), 40–46. DOI: 10.1109/VL/HCC60511.2024.00015. Online publication date: 2-Sep-2024
  • (2024) Anchor Your Embeddings Through the Storm: Mitigating Instance-to-Document Semantic Gap. 2024 International Joint Conference on Neural Networks (IJCNN), 1–8. DOI: 10.1109/IJCNN60899.2024.10650518. Online publication date: 30-Jun-2024


Published In

Koli Calling '23: Proceedings of the 23rd Koli Calling International Conference on Computing Education Research
November 2023, 361 pages
ISBN: 9798400716539
DOI: 10.1145/3631802

Publisher

Association for Computing Machinery, New York, NY, United States
      Author Tags

      1. ChatGPT
      2. Copilot
      3. Introductory Programming
      4. Large Language Models
      5. OpenAI Codex
      6. Self-paced Learning
      7. Self-regulation

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      Koli Calling '23

      Acceptance Rates

Overall acceptance rate: 80 of 182 submissions (44%)

Article Metrics

      • Downloads (Last 12 months)786
      • Downloads (Last 6 weeks)159
      Reflects downloads up to 16 Oct 2024
