DOI: 10.1145/3691620.3696020
research-article

MR-Adopt: Automatic Deduction of Input Transformation Function for Metamorphic Testing

Published: 27 October 2024

Abstract

While a recent study reveals that many developer-written test cases encode a reusable Metamorphic Relation (MR), over 70% of them hard-code the source input and follow-up input in the encoded relation. Such encoded MRs lack an explicit input transformation that maps source inputs to their corresponding follow-up inputs, so they cannot be reused with new source inputs to enhance test adequacy.
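As a minimal illustration of this distinction, the Python sketch below contrasts a test that hard-codes both inputs with one that derives the follow-up input through an explicit transformation. The program under test (`sorted`), the relation, and all function names are hypothetical choices made for illustration; they are not taken from the paper.

```python
# Hypothetical example: the MR "permuting the input does not change the
# sorted output", with Python's built-in sorted() as the program under test.

def output_relation(source_output, followup_output):
    """Output relation of the MR: both outputs must be equal."""
    return source_output == followup_output

def test_hard_coded():
    """MR-encoded test with hard-coded inputs; it only ever checks one pair."""
    source_input = [3, 1, 2]
    followup_input = [2, 1, 3]  # written by hand, not derived from the source
    return output_relation(sorted(source_input), sorted(followup_input))

def input_transformation(source_input):
    """Explicit input transformation (assumed here: reverse the list)."""
    return list(reversed(source_input))

def check_mr(source_input):
    """Reusable MR: works for any new source input."""
    followup_input = input_transformation(source_input)
    return output_relation(sorted(source_input), sorted(followup_input))

if __name__ == "__main__":
    print(test_hard_coded())
    print(all(check_mr(xs) for xs in ([], [5], [9, 7, 8, 7])))
```

Once the transformation is explicit, the same relation can be exercised with automatically generated source inputs instead of the single hand-written pair.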
In this paper, we propose MR-Adopt (Automatic Deduction Of inPut Transformation) to automatically deduce the input transformation from the hard-coded source and follow-up inputs, so that the encoded MRs can be reused with new source inputs. Because an MR-encoded test case typically provides only one pair of source and follow-up inputs as an example, we leverage LLMs to understand the intention of the test case and generate additional examples of source-followup input pairs. These examples guide the generation of input transformations that generalize to multiple source inputs. In addition, to mitigate the issue that LLMs generate erroneous code, we refine LLM-generated transformations by removing MR-irrelevant code elements with data-flow analysis. Finally, we assess candidate transformations against the encoded output relations and select the best transformation as the result. Evaluation results show that MR-Adopt can generate input transformations applicable to all experimental source inputs for 72.00% of encoded MRs, which is 33.33% more than using vanilla GPT-3.5. By incorporating MR-Adopt-generated input transformations, encoded MR-based test cases can effectively enhance the test adequacy, increasing the line coverage and mutation score by 10.62% and 18.91%, respectively.
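The final selection step can be pictured with a small Python sketch. It is an assumption-laden illustration and not MR-Adopt's implementation: the program under test, the output relation, the three candidate transformations, and the scoring rule (count the source inputs on which the output relation holds) are all hypothetical, and the data-flow refinement step is not shown.

```python
# Hypothetical setup: the program under test sums a list, and the encoded
# output relation says the follow-up output is twice the source output.

def program_under_test(xs):
    return sum(xs)

def output_relation(source_output, followup_output):
    return followup_output == 2 * source_output

# Candidate transformations, standing in for LLM-generated code.
# All three reproduce the single example pair [1, 2] -> [2, 4].
def double_each(xs):
    return [2 * x for x in xs]

def add_index(xs):
    return [x + i + 1 for i, x in enumerate(xs)]

def constant_followup(xs):
    return [2, 4]  # merely replays the hard-coded follow-up input

def score(candidate, source_inputs):
    """Count the source inputs on which the output relation holds."""
    passed = 0
    for source_input in source_inputs:
        try:
            followup_input = candidate(list(source_input))
            if output_relation(program_under_test(source_input),
                               program_under_test(followup_input)):
                passed += 1
        except Exception:
            pass  # a crashing candidate earns no credit for this input
    return passed

# The original example input plus additional source inputs.
source_inputs = [[1, 2], [5, 5], [0], [3, -1, 5]]
candidates = [double_each, add_index, constant_followup]
scores = {c.__name__: score(c, source_inputs) for c in candidates}
best = max(candidates, key=lambda c: scores[c.__name__])

if __name__ == "__main__":
    print(scores, best.__name__)
```

In this sketch every candidate agrees with the single example pair, so only the additional source inputs separate the transformation that generalizes from the two that overfit.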

    Published In

    ASE '24: Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering
    October 2024
    2587 pages
    ISBN:9798400712487
    DOI:10.1145/3691620
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. software testing
    2. metamorphic testing
    3. metamorphic relation
    4. input transformation
    5. code generation
    6. large language models

    Qualifiers

    • Research-article

    Funding Sources

    • National Science Foundation of China
    • Hong Kong Research Grant Council/General Research Fund

    Conference

    ASE '24
    Acceptance Rates

    Overall Acceptance Rate 82 of 337 submissions, 24%
