DOI: 10.1145/3671016.3674813
research-article

Optimizing Search-Based Unit Test Generation with Large Language Models: An Empirical Study

Published: 24 July 2024 Publication History

Abstract

Search-based unit test generation is widely applied and considered effective, and Large Language Models (LLMs) have demonstrated strong generative ability. Some researchers have therefore proposed using LLMs to enhance search-based unit test generation, and preliminary results confirm that LLMs can help alleviate test coverage plateaus. However, it remains unclear when and how LLMs should intervene in the time-consuming test generation process. This paper explores the application of LLMs at various stages of search-based test generation (SBTG), namely the initial stage, the test generation period, and test coverage plateaus, as well as strategies for controlling the frequency of LLM intervention. A comprehensive empirical study was conducted on 486 Python benchmark modules from 27 projects. The experimental results show that 1) LLM intervention has a positive effect at any stage, whether the goal is to improve coverage within a fixed period or to reduce the time needed to reach a specific coverage level; and 2) a reasonable intervention frequency is crucial for LLMs to have a positive effect on SBTG. This work helps clarify when and how LLMs should be applied in SBTG and offers practical suggestions for developers.
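
The three intervention points named in the abstract (initial stage, generation period, coverage plateau) and the idea of capping intervention frequency can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the paper's implementation: `run_search`, `fake_llm`, and every parameter name are hypothetical, coverage targets are plain integers, the "search" is a random step that can only reach the easier half of the targets, and the LLM is replaced by a deterministic stub.

```python
import random
from typing import Callable, Set, Tuple


def run_search(
    n_targets: int,
    iterations: int,
    llm_suggest: Callable[[Set[int]], Set[int]],
    *,
    seed_with_llm: bool = False,   # intervene once before the search starts
    every_n: int = 0,              # intervene every N iterations (0 disables)
    plateau_patience: int = 0,     # intervene after this many stalled iterations (0 disables)
    max_llm_calls: int = 3,        # hypothetical cap on intervention frequency
    rng_seed: int = 0,
) -> Tuple[float, int]:
    """Toy search loop; returns (coverage ratio, number of LLM calls made)."""
    rng = random.Random(rng_seed)
    covered: Set[int] = set()
    llm_calls = 0
    stalled = 0

    def ask_llm() -> None:
        nonlocal llm_calls, stalled
        if llm_calls >= max_llm_calls:
            return  # budget exhausted: skip the intervention
        llm_calls += 1
        uncovered = set(range(n_targets)) - covered
        # Accept only suggestions that hit targets still uncovered.
        covered.update(llm_suggest(uncovered) & uncovered)
        stalled = 0

    if seed_with_llm:
        ask_llm()

    for i in range(1, iterations + 1):
        before = len(covered)
        # Assumption: the toy search step reaches only the lower half of targets.
        covered.add(rng.randrange(max(1, n_targets // 2)))
        stalled = 0 if len(covered) > before else stalled + 1
        if every_n and i % every_n == 0:
            ask_llm()
        elif plateau_patience and stalled >= plateau_patience:
            ask_llm()

    return len(covered) / n_targets, llm_calls


def fake_llm(uncovered: Set[int]) -> Set[int]:
    """Stand-in for a model call: 'covers' the two highest uncovered targets."""
    return set(sorted(uncovered)[-2:])
```

Calling `run_search(10, 200, fake_llm, plateau_patience=5)` exercises the plateau trigger, while `every_n=50` or `seed_with_llm=True` exercise the other two. In each case `max_llm_calls` bounds how often the stub is consulted, which is the kind of frequency control the abstract refers to.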

    Published In

    Internetware '24: Proceedings of the 15th Asia-Pacific Symposium on Internetware
    July 2024
    518 pages
    ISBN:9798400707056
    DOI:10.1145/3671016
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Large Language Model
    2. Search-based Testing
    3. Unit Test

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    Internetware 2024

    Acceptance Rates

    Overall Acceptance Rate 55 of 111 submissions, 50%
