DOI: 10.1145/3524459.3527350

Towards JavaScript program repair with generative pre-trained transformer (GPT-2)

Published: 26 October 2022

Abstract

The goal of Automated Program Repair (APR) is to find fixes for software bugs without human intervention. The so-called Generate-and-Validate (G&V) approach has been the most popular method in recent years: the APR tool creates a patch, which is then validated against an oracle. Recent years have also been remarkable for Natural Language Processing (NLP), with new pre-trained models shattering records on tasks ranging from sentiment analysis to question answering. These deep learning models usually inspire the APR community as well. Such approaches typically require a large dataset on which the model can be trained (or fine-tuned) and evaluated. The criterion for accepting a patch depends on the underlying dataset, but usually the generated patch must be exactly the same as the one created by a human developer. Just as NLP models have become better and better at forming sentences that build into coherent paragraphs, APR tools have become better and better at generating syntactically and semantically correct source code. Since the Generative Pre-trained Transformer (GPT) model is now available to everyone thanks to the NLP and AI research community, it can be fine-tuned for specific tasks (not necessarily on natural language). In this work we use the GPT-2 model to generate source code; to the best of our knowledge, GPT-2 has not been used for Automated Program Repair so far. The model is fine-tuned for a specific task: it is taught to fix JavaScript bugs automatically. To do so, we trained the model on 16,863 JS code snippets, from which it could learn the nature of the observed programming language. In our experiments we observed that the GPT-2 model learned to write syntactically correct source code on almost every attempt, although it failed to learn good bug fixes in some cases. Nonetheless, it was able to generate the correct fix in most cases, resulting in an overall accuracy of up to 17.25%.
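
The abstract describes fine-tuning GPT-2 on pairs of buggy and fixed JavaScript snippets and letting the model generate candidate patches. The sketch below shows, under stated assumptions, what such a fine-tuning and inference loop could look like with the HuggingFace transformers and datasets libraries; the `<|fix|>` separator, the data layout, and the hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch (not from the paper): fine-tuning GPT-2 on buggy -> fixed
# JavaScript pairs with HuggingFace transformers. The data format, the
# <|fix|> separator and the hyperparameters are illustrative assumptions.
from datasets import Dataset
from transformers import (DataCollatorForLanguageModeling, GPT2LMHeadModel,
                          GPT2TokenizerFast, Trainer, TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 defines no padding token
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical training examples: buggy snippet, separator, developer fix.
pairs = [
    {"text": "if (x = 1) { run(); } <|fix|> if (x === 1) { run(); }"},
    # ...one entry per (buggy, fixed) snippet pair in the training set
]
dataset = Dataset.from_list(pairs).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=256),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gpt2-js-repair",
        num_train_epochs=3,
        per_device_train_batch_size=4,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# At inference time, feed a buggy snippet plus the separator and let the
# model continue the sequence; the continuation is the candidate patch.
prompt = tokenizer("if (y = 2) { stop(); } <|fix|>", return_tensors="pt")
candidate = model.generate(**prompt, max_new_tokens=64,
                           pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(candidate[0], skip_special_tokens=True))
```

In a G&V pipeline, each generated candidate would then be validated against an oracle (e.g., compared with the developer's fix or run against tests) before it is accepted.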

Published In

APR '22: Proceedings of the Third International Workshop on Automated Program Repair
May 2022
83 pages
ISBN: 9781450392853
DOI: 10.1145/3524459

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. GPT
  2. JavaScript
  3. automated program repair
  4. code refinement
  5. machine learning

Qualifiers

  • Research-article

Conference

ICSE '22