DOI: 10.1145/2970276.2970342
Research-article · Public Access

Towards automatically generating descriptive names for unit tests

Published: 25 August 2016

Abstract

During maintenance, developers often need to understand the purpose of a test. One of the most potentially useful sources of information for understanding a test is its name. Ideally, test names are descriptive in that they accurately summarize both the scenario and the expected outcome of the test. Despite the benefits of being descriptive, test names often fall short of this goal. In this paper we present a new approach for automatically generating descriptive names for existing test bodies. Using a combination of natural-language program analysis and text generation, the technique creates names that summarize the test's scenario and the expected outcome. The results of our evaluation show that (1) compared to alternative approaches, the names generated by our technique are significantly more similar to human-generated names and are nearly always preferred by developers, (2) the names generated by our technique are preferred over or are equivalent to the original test names in 83% of cases, and (3) our technique is several orders of magnitude faster than manually writing test names.
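To make "descriptive" concrete, here is a minimal illustration (not taken from the paper; the stack example and the names are hypothetical). Both tests have identical bodies, but only the second name summarizes the scenario and the expected outcome in the way the abstract describes:

```python
import unittest

class StackTest(unittest.TestCase):
    # Non-descriptive: the name reveals neither the scenario nor the
    # expected outcome, so a maintainer must read the body to understand it.
    def test1(self):
        stack = []
        stack.append(5)
        self.assertEqual(stack.pop(), 5)

    # Descriptive, in the style the paper targets: the name summarizes the
    # scenario (pop after pushing onto an empty stack) and the expected
    # outcome (the pushed value is returned).
    def test_pop_after_push_on_empty_stack_returns_pushed_value(self):
        stack = []
        stack.append(5)
        self.assertEqual(stack.pop(), 5)

if __name__ == "__main__":
    unittest.main()
```

The paper's technique aims to produce names like the second one automatically from bodies like the first.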




Published In

ASE '16: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering
August 2016
899 pages
ISBN:9781450338455
DOI:10.1145/2970276
  • General Chair:
  • David Lo,
  • Program Chairs:
  • Sven Apel,
  • Sarfraz Khurshid

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Descriptive names
  2. Maintenance
  3. Unit testing

Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%


Article Metrics

  • Downloads (Last 12 months)76
  • Downloads (Last 6 weeks)11
Reflects downloads up to 13 Jan 2025


Cited By

  • (2024) An Empirical Study on Focal Methods in Deep-Learning-Based Approaches for Assertion Generation. Proceedings of the ACM on Software Engineering, 1(FSE), 1750–1771. DOI: 10.1145/3660785. Online publication date: 12-Jul-2024
  • (2024) Understandable Test Generation Through Capture/Replay and LLMs. Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, 261–263. DOI: 10.1145/3639478.3639789. Online publication date: 14-Apr-2024
  • (2024) Investigating the readability of test code. Empirical Software Engineering, 29(2). DOI: 10.1007/s10664-023-10390-z. Online publication date: 26-Feb-2024
  • (2023) Automated Identification of Uniqueness in JUnit Tests. ACM Transactions on Software Engineering and Methodology, 32(1), 1–32. DOI: 10.1145/3533313. Online publication date: 13-Feb-2023
  • (2023) Function Call Graph Context Encoding for Neural Source Code Summarization. IEEE Transactions on Software Engineering, 49(9), 4268–4281. DOI: 10.1109/TSE.2023.3279774. Online publication date: Sep-2023
  • (2023) Generating Understandable Unit Tests through End-to-End Test Scenario Carving. 2023 IEEE 23rd International Working Conference on Source Code Analysis and Manipulation (SCAM), 107–118. DOI: 10.1109/SCAM59687.2023.00021. Online publication date: 2-Oct-2023
  • (2023) An Exploratory Study on the Usage and Readability of Messages Within Assertion Methods of Test Cases. 2023 IEEE/ACM 2nd International Workshop on Natural Language-Based Software Engineering (NLBSE), 32–39. DOI: 10.1109/NLBSE59153.2023.00015. Online publication date: May-2023
  • (2023) Label Smoothing Improves Neural Source Code Summarization. 2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC), 101–112. DOI: 10.1109/ICPC58990.2023.00025. Online publication date: May-2023
  • (2022) How Does This New Developer Test Fit In? A Visualization to Understand Amplified Test Cases. 2022 Working Conference on Software Visualization (VISSOFT), 17–28. DOI: 10.1109/VISSOFT55257.2022.00011. Online publication date: Oct-2022
  • (2022) An Ensemble Approach for Annotating Source Code Identifiers With Part-of-Speech Tags. IEEE Transactions on Software Engineering, 48(9), 3506–3522. DOI: 10.1109/TSE.2021.3098242. Online publication date: 1-Sep-2022
  • (additional citing works not shown)
