DOI: 10.1145/3545945.3569785 · SIGCSE Conference Proceedings · research-article

Experiences from Using Code Explanations Generated by Large Language Models in a Web Software Development E-Book

Published: 03 March 2023

Abstract

Advances in natural language processing have resulted in large language models (LLMs) that can generate code and code explanations. In this paper, we report on our experiences generating multiple code explanation types using LLMs and integrating them into an interactive e-book on web software development. Three different types of explanations -- a line-by-line explanation, a list of important concepts, and a high-level summary of the code -- were created. Students could view explanations by clicking a button next to code snippets, which showed the explanation and asked about its utility. Our results show that all explanation types were viewed by students and that the majority of students perceived the code explanations as helpful to them. However, student engagement varied by code snippet complexity, explanation type, and code snippet length. Drawing on our experiences, we discuss future directions for integrating explanations generated by LLMs into CS classrooms.



      Published In

      SIGCSE 2023: Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 1
      March 2023
      1481 pages
      ISBN:9781450394314
      DOI:10.1145/3545945

Publisher: Association for Computing Machinery, New York, NY, United States


Conference

SIGCSE 2023

Acceptance Rates

Overall acceptance rate: 1,595 of 4,542 submissions (35%)


Article Metrics

• Downloads (last 12 months): 643
• Downloads (last 6 weeks): 42

Reflects downloads up to 25 Jan 2025.


      Cited By

• (2025) Beyond the Hype: A Comprehensive Review of Current Trends in Generative AI Research, Teaching Practices, and Tools. In 2024 Working Group Reports on Innovation and Technology in Computer Science Education, 300-338. DOI: 10.1145/3689187.3709614. Online publication date: 22-Jan-2025.
• (2025) A comparative study of AI-generated and human-crafted learning objectives in computing education. Journal of Computer Assisted Learning, 41:1. DOI: 10.1111/jcal.13092. Online publication date: 5-Jan-2025.
• (2025) Balancing Security and Correctness in Code Generation: An Empirical Study on Commercial Large Language Models. IEEE Transactions on Emerging Topics in Computational Intelligence, 9:1, 419-430. DOI: 10.1109/TETCI.2024.3446695. Online publication date: Feb-2025.
• (2025) "Ok Pal, we have to code that now": interaction patterns of programming beginners with a conversational chatbot. Empirical Software Engineering, 30:1. DOI: 10.1007/s10664-024-10561-6. Online publication date: 1-Feb-2025.
• (2024) Cognitive Apprenticeship and Artificial Intelligence Coding Assistants. In Navigating Computer Science Education in the 21st Century, 261-281. DOI: 10.4018/979-8-3693-1066-3.ch013. Online publication date: 26-Feb-2024.
• (2024) GPTs or Grim Position Threats? The Potential Impacts of Large Language Models on Non-Managerial Jobs and Certifications in Cybersecurity. Informatics, 11:3, article 45. DOI: 10.3390/informatics11030045. Online publication date: 11-Jul-2024.
• (2024) Risk management strategy for generative AI in computing education: how to handle the strengths, weaknesses, opportunities, and threats? International Journal of Educational Technology in Higher Education, 21:1. DOI: 10.1186/s41239-024-00494-x. Online publication date: 11-Dec-2024.
• (2024) Novice Learners of Programming and Generative AI - Prior Knowledge Matters. In Proceedings of the 24th Koli Calling International Conference on Computing Education Research, 1-2. DOI: 10.1145/3699538.3699580. Online publication date: 12-Nov-2024.
• (2024) Localized Explanations for Automatically Synthesized Network Configurations. In Proceedings of the 23rd ACM Workshop on Hot Topics in Networks, 52-59. DOI: 10.1145/3696348.3696888. Online publication date: 18-Nov-2024.
• (2024) Decoding Debugging Instruction: A Systematic Literature Review of Debugging Interventions. ACM Transactions on Computing Education, 24:4, 1-44. DOI: 10.1145/3690652. Online publication date: 5-Sep-2024.
