Research article
DOI: 10.1145/3643916.3644421

Exploring and Improving Code Completion for Test Code

Published: 13 June 2024

Abstract

Code completion is an important feature of Integrated Development Environments (IDEs). In recent years, researchers have devoted considerable effort to intelligent code completion. However, existing work either considers only production code or does not distinguish between production code and test code. It therefore remains unclear how effective existing completion models are for test code, or whether test code completion can be improved further. In this work, we focus on the completion of test code. Through experiments, we first find that completion models built for production code are suboptimal for test code. We then analyze the specific characteristics of test code and observe that it exhibits inter- and intra-project similarities and has a strong relationship with its focal class and the other production classes that depend on the focal class (i.e., focal-related code). By fine-tuning existing models on test code from other projects, we exploit the inter-project similarity of test code to improve the completion of test-specific tokens. By introducing a local component that uses the project's existing test code and focal-related code as references, we further enhance existing completion models with the intra-project similarity and the focal-related code of test code. Experiments show that each characteristic we exploit brings substantial improvement to test code completion and that our integrated framework outperforms the baseline frameworks. Compared to the base completion model, our best model improves all-token and identifier completion accuracy by a relative 7.68% and 19.96% on token-level completion, and improves edit-distance similarity and exact match by a relative 8.89% and 22.82% on line-level completion. Finally, we perform an error analysis and point out potential directions for future work.
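The "local component" described above consults in-project material at prediction time. As an illustration only (the paper's actual architecture is not reproduced here), the sketch below shows one common way such a component can be realized, in the style of kNN-augmented language models: a datastore maps context vectors extracted from in-project test code and focal-related code to the tokens that followed them, and its nearest-neighbor distribution is interpolated with the base model's distribution. All names (Datastore, knn_distribution, interpolate) and the interpolation weight LAMBDA are hypothetical.

```python
# Minimal sketch of a kNN-style local component for retrieval-augmented
# completion. Not the authors' implementation; names and weights are
# illustrative.

import numpy as np

LAMBDA = 0.4  # hypothetical weight given to the local (retrieval) component


class Datastore:
    """Maps context vectors to the id of the token that followed each context."""

    def __init__(self) -> None:
        self.keys: list[np.ndarray] = []   # context representations
        self.values: list[int] = []        # ids of the tokens that followed

    def add(self, context_vec: np.ndarray, next_token_id: int) -> None:
        self.keys.append(context_vec)
        self.values.append(next_token_id)

    def knn_distribution(self, query: np.ndarray, vocab_size: int,
                         k: int = 8, temperature: float = 1.0) -> np.ndarray:
        """Softmax over negative distances to the k nearest stored contexts."""
        dists = np.array([np.linalg.norm(query - key) for key in self.keys])
        nearest = np.argsort(dists)[:k]
        weights = np.exp(-dists[nearest] / temperature)
        weights /= weights.sum()
        p = np.zeros(vocab_size)
        for idx, w in zip(nearest, weights):
            p[self.values[idx]] += w   # mass goes to tokens seen in-project
        return p


def interpolate(p_model: np.ndarray, p_knn: np.ndarray,
                lam: float = LAMBDA) -> np.ndarray:
    """Final distribution: (1 - lam) * base model + lam * local component."""
    return (1.0 - lam) * p_model + lam * p_knn


if __name__ == "__main__":
    store = Datastore()
    # Index two toy contexts from in-project test code; token id 3 might
    # stand for an identifier such as "assertEquals" in a real vocabulary.
    store.add(np.array([0.1, 0.9]), next_token_id=3)
    store.add(np.array([0.2, 0.8]), next_token_id=3)
    p_model = np.array([0.25, 0.25, 0.25, 0.25])   # toy base-model output
    p_knn = store.knn_distribution(np.array([0.1, 0.85]), vocab_size=4, k=2)
    print(interpolate(p_model, p_knn))             # token 3 is boosted
```

In a production-scale system the context vectors would be hidden states of the completion model itself, and the linear scan would be replaced by an approximate-nearest-neighbor index such as FAISS.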

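For the line-level metrics named in the abstract, the sketch below uses the standard definitions: exact match is the fraction of predicted lines identical to their references, and edit-distance similarity is one minus the Levenshtein distance normalized by the longer string. The paper's precise metric implementations may differ; the function names here are illustrative.

```python
# Hedged sketch of the two line-level metrics, under common definitions.

def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming Levenshtein (edit) distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def edit_similarity(pred: str, ref: str) -> float:
    """1 - normalized edit distance; 1.0 means the lines are identical."""
    if not pred and not ref:
        return 1.0
    return 1.0 - levenshtein(pred, ref) / max(len(pred), len(ref))


def exact_match(preds: list[str], refs: list[str]) -> float:
    """Fraction of predicted lines that exactly match their references."""
    return sum(p.strip() == r.strip() for p, r in zip(preds, refs)) / len(refs)


if __name__ == "__main__":
    print(edit_similarity("assertEquals(expected, actual);",
                          "assertEquals(expected, result);"))
    print(exact_match(["x = 1;"], ["x = 1;"]))  # 1.0
```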


Published In

ICPC '24: Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension
April 2024, 487 pages
ISBN: 9798400705861
DOI: 10.1145/3643916

Publisher

Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. code completion
    2. test code
    3. retrieval augmentation

Conference

ICPC '24

