DOI: 10.1145/3658271.3658342
Research article

Generating and Reviewing Programming Codes with Large Language Models: A Systematic Mapping Study

Published: 23 May 2024

Abstract

Context: The proliferation of technologies based on Large Language Models (LLM) is reshaping various domains, also impacting programming code creation and review. Problem: The decision to adopt LLM in software development demands an understanding of the associated challenges and the diverse application possibilities. Solution: This study addresses the challenges associated with the use of LLM in programming code processes. It explores models, utilization strategies, challenges, and coping mechanisms, focusing on the perspectives of researchers in software development. IS Theory: Drawing on Task-Technology Fit (TTF) theory, the research examines the alignment between task characteristics in code generation and review and LLM technology attributes, in order to discern performance impacts and utilization patterns. Method: Employing the Systematic Mapping of the Literature method, the research analyzes 19 selected studies, out of 1,257 retrieved results, from the digital databases IEEE Digital Library, Compendex Engineering Village, and Scopus. Summary of Results: The research reveals 23 models, 13 utilization strategies, 15 challenges, and 14 coping mechanisms associated with LLM in programming code processes, offering a comprehensive understanding of the application landscape. Contributions to IS: Contributing to the Information Systems (IS) field, this study provides valuable insights into the utilization of LLM in programming code generation and review. The identified models, strategies, challenges, and coping mechanisms offer practical guidance for decision-making processes related to LLM technology adoption. The research aims to support the IS community in effectively navigating the complexities of integrating large language models into the dynamic software development lifecycle.
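As a purely illustrative aside on the Method sentence above: the screening step of a systematic mapping (narrowing 1,257 retrieved records from three databases down to 19 primary studies) can be pictured as a small deduplicate-and-filter pipeline. The sketch below assumes hypothetical inclusion criteria, record fields (title, abstract, year, source), and keyword lists; it is not the protocol, criteria, or tooling used by the authors.

# Minimal sketch of a mapping-study screening step (hypothetical criteria
# and field names; not the authors' actual selection protocol).
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    title: str
    abstract: str
    year: int
    source: str  # e.g., "IEEE", "Compendex", "Scopus"

# Hypothetical inclusion keywords for LLM-assisted code generation/review.
KEYWORDS = ("large language model", "llm", "code generation", "code review")

def is_relevant(rec: Record) -> bool:
    # Hypothetical criteria: recent study mentioning LLMs applied to code.
    text = f"{rec.title} {rec.abstract}".lower()
    return rec.year >= 2020 and any(k in text for k in KEYWORDS)

def screen(records: list[Record]) -> list[Record]:
    # Deduplicate by normalized title across databases, then filter.
    seen: set[str] = set()
    selected: list[Record] = []
    for rec in records:
        key = rec.title.strip().lower()
        if key in seen:
            continue
        seen.add(key)
        if is_relevant(rec):
            selected.append(rec)
    return selected

if __name__ == "__main__":
    retrieved = [
        Record("Evaluating LLMs for code review", "...", 2023, "Scopus"),
        Record("A survey of compilers", "...", 2019, "IEEE"),
    ]
    print(f"{len(screen(retrieved))} of {len(retrieved)} records selected")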



      Published In

      SBSI '24: Proceedings of the 20th Brazilian Symposium on Information Systems
      May 2024
      708 pages

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. Code Generation
      2. LLM
      3. automatic refactoring
      4. code auto-suggestion
      5. code completion
      6. natural language models
      7. neural network
      8. systematic mapping study
      9. transformer architecture

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      SBSI '24
      SBSI '24: XX Brazilian Symposium on Information Systems
      May 20 - 23, 2024
      Juiz de Fora, Brazil

      Acceptance Rates

      Overall Acceptance Rate 181 of 557 submissions, 32%
