DOI: 10.1145/1858996.1859005
research-article

Automatically documenting program changes

Published: 20 September 2010

Abstract

Source code modifications are often documented with log messages. Such messages are a key component of software maintenance: they can help developers validate changes, locate and triage defects, and understand modifications. However, this documentation can be burdensome to create and can be incomplete or inaccurate.
We present an automatic technique for synthesizing succinct human-readable documentation for arbitrary program differences. Our algorithm is based on a combination of symbolic execution and a novel approach to code summarization. The documentation it produces describes the effect of a change on the runtime behavior of a program, including the conditions under which program behavior changes and what the new behavior is.
We compare our documentation to 250 human-written log messages from 5 popular open source projects. Employing a human study, we find that our generated documentation is suitable for supplementing or replacing 89% of existing log messages that directly describe a code change.
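
To make the shape of such documentation concrete, here is a minimal sketch, not the authors' algorithm. It assumes each program version has already been summarized as a mapping from a path predicate to the observable effect on that path. A real tool would derive those pairs through symbolic execution; here they are hand-written strings. The names `PathSummary` and `document_change` are hypothetical and do not come from the paper.

```python
from typing import Dict, List

# Hypothetical summary of one program version: each path predicate (a plain
# string) maps to the observable effect on that path. These pairs are
# hand-written stand-ins for what symbolic execution would produce.
PathSummary = Dict[str, str]


def document_change(old: PathSummary, new: PathSummary) -> List[str]:
    """Describe when behavior changes and what the new behavior is."""
    lines = []
    for cond in sorted(set(old) | set(new)):
        before, after = old.get(cond), new.get(cond)
        if before == after:
            continue  # behavior on this path is unchanged; say nothing
        if before is None:
            lines.append(f"When {cond}, now {after}")
        elif after is None:
            lines.append(f"When {cond}, no longer {before}")
        else:
            lines.append(f"When {cond}, {after} instead of {before}")
    return lines


# Illustrative input only: two versions of an imaginary function.
old_version = {"x < 0": "return -1", "x >= 0": "return x"}
new_version = {"x < 0": "throw IllegalArgumentException", "x >= 0": "return x"}

for line in document_change(old_version, new_version):
    print(line)
```

Paths whose effect is identical in both versions are omitted, which keeps the output short in the spirit of the succinct documentation the abstract describes. Comparing predicates as exact strings is a simplification: logically equivalent conditions written differently would be reported as a change.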

Cited By

  • (2024) Vision Paper: Proof-Carrying Code Completions. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering Workshops, pages 35-42. DOI: 10.1145/3691621.3694932. Online publication date: 27-Oct-2024
  • (2024) An Empirical Study on Learning-based Techniques for Explicit and Implicit Commit Messages Generation. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pages 544-556. DOI: 10.1145/3691620.3695025. Online publication date: 27-Oct-2024
  • (2024) Understanding Code Changes Practically with Small-Scale Language Models. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pages 216-228. DOI: 10.1145/3691620.3694999. Online publication date: 27-Oct-2024

      Published In

      ASE '10: Proceedings of the 25th IEEE/ACM International Conference on Automated Software Engineering
      September 2010
      534 pages
      ISBN:9781450301169
      DOI:10.1145/1858996
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

      Sponsors

      In-Cooperation

      • IEEE CS

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. code summarization
      2. commit messages
      3. differencing

      Qualifiers

      • Research-article

      Conference

      ASE10

      Acceptance Rates

      Overall Acceptance Rate 82 of 337 submissions, 24%

      Article Metrics

      • Downloads (Last 12 months): 75
      • Downloads (Last 6 weeks): 14
      Reflects downloads up to 09 Nov 2024

      Citations

      • (2024) Commit Message Generation via ChatGPT: How Far Are We? Proceedings of the 2024 IEEE/ACM First International Conference on AI Foundation Models and Software Engineering, pages 124-129. DOI: 10.1145/3650105.3652300. Online publication date: 14-Apr-2024
      • (2024) ESGen: Commit Message Generation Based on Edit Sequence of Code Change. Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension, pages 112-124. DOI: 10.1145/3643916.3644414. Online publication date: 15-Apr-2024
      • (2024) Only diff Is Not Enough: Generating Commit Messages Leveraging Reasoning and Action of Large Language Model. Proceedings of the ACM on Software Engineering, 1(FSE), pages 745-766. DOI: 10.1145/3643760. Online publication date: 12-Jul-2024
      • (2024) KADEL: Knowledge-Aware Denoising Learning for Commit Message Generation. ACM Transactions on Software Engineering and Methodology, 33(5), pages 1-32. DOI: 10.1145/3643675. Online publication date: 4-Jun-2024
      • (2024) Barriers for Students During Code Change Comprehension. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, pages 1-13. DOI: 10.1145/3597503.3639227. Online publication date: 20-May-2024
      • (2024) Automatic Commit Message Generation: A Critical Review and Directions for Future Work. IEEE Transactions on Software Engineering, 50(4), pages 816-835. DOI: 10.1109/TSE.2024.3364675. Online publication date: 12-Feb-2024
      • (2024) Richen: Automated enrichment of Git documentation with usage examples and scenarios. Journal of Software: Evolution and Process. DOI: 10.1002/smr.2662. Online publication date: 13-Mar-2024
