Context-aware Retrieval-based Deep Commit Message Generation

Published: 23 July 2021

Abstract

Commit messages recorded in version control systems contain valuable information for software development, maintenance, and comprehension. Unfortunately, developers often commit code with empty or poor-quality commit messages. To address this issue, several studies have proposed approaches that generate commit messages from commit diffs. Recent studies use neural machine translation algorithms to translate git diffs into commit messages and have achieved promising results. However, these learning-based methods tend to generate high-frequency words while ignoring low-frequency ones. In addition, they suffer from exposure bias, which creates a gap between the training phase and the testing phase.
In this article, we propose CoRec to address these two limitations. Specifically, we first train a context-aware encoder-decoder model that randomly selects either the previous output of the decoder or the embedding vector of a ground-truth word as context, so that the model gradually becomes aware of previous alignment choices. Given a diff for testing, the trained model is reused to retrieve the most similar diff from the training set. Finally, we use the retrieved diff to guide the probability distribution over the vocabulary when generating the final message. Our method combines the advantages of both information retrieval and neural machine translation. We evaluate CoRec on a dataset from Liu et al. and on a large-scale dataset crawled from 10K popular Java repositories on GitHub. Our experimental results show that CoRec significantly outperforms the state-of-the-art method NNGen by 19% on average in terms of BLEU.
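
The three steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure or the rule for combining distributions, so cosine similarity and a weighted sum are assumptions made here, and every function and parameter name (`choose_context`, `gold_prob`, `weight`, and so on) is hypothetical.

```python
import math
import random


def choose_context(gold_embedding, predicted_embedding, gold_prob, rng=random):
    """Pick the context fed to the decoder at the next training step.

    With probability gold_prob the ground-truth word embedding is used;
    otherwise the decoder's own previous output is used.
    """
    return gold_embedding if rng.random() < gold_prob else predicted_embedding


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_most_similar(query_vec, train_vecs):
    """Return the index of the training diff whose encoding is closest."""
    return max(range(len(train_vecs)),
               key=lambda i: cosine(query_vec, train_vecs[i]))


def combine_distributions(p_input, p_retrieved, weight):
    """Blend two next-token distributions and renormalise.

    weight (hypothetical) scales the contribution of the retrieved diff.
    """
    mixed = [p + weight * q for p, q in zip(p_input, p_retrieved)]
    total = sum(mixed)
    return [m / total for m in mixed]


if __name__ == "__main__":
    # Toy encodings of three training diffs and one test diff.
    train_vecs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
    best = retrieve_most_similar([0.5, 0.9], train_vecs)
    print("most similar training diff:", best)

    # Toy next-token distributions over a three-word vocabulary.
    print(combine_distributions([0.7, 0.2, 0.1], [0.1, 0.8, 0.1], 0.5))
```

In this toy setup, lowering `gold_prob` over the course of training would expose the decoder to more of its own predictions, which is the general idea behind mitigating exposure bias; the schedule itself is left out because the abstract does not describe one.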

References

[1]
2019. Git. Retrieved from https://git-scm.com/.
[2]
Edward Loper and Steven Bird. 2002. NLTK: The natural language toolkit. arXiv preprint cs/0205028.
[3]
2020. College English Test. Retrieved from https://en.wikipedia.org/wiki/College_English_Test.
[4]
Hervé Abdi. 2007. Bonferroni and Šidák corrections for multiple comparisons. Encyc. Meas. Statist. 3 (2007), 103–107.
[5]
John Aldrich et al. 1997. R. A. Fisher and the making of maximum likelihood 1912–1922. Statist. Sci. 12, 3 (1997), 162–176.
[6]
Philip Arthur, Graham Neubig, and Satoshi Nakamura. 2016. Incorporating discrete translation lexicons into neural machine translation. arXiv preprint arXiv:1606.02006 (2016).
[7]
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014).
[8]
Satanjeev Banerjee and Alon Lavie. 2005. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. 65–72.
[9]
Samy Bengio, Oriol Vinyals, Navdeep Jaitly, and Noam Shazeer. 2015. Scheduled sampling for sequence prediction with recurrent neural networks. In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 1171–1179.
[10]
Raymond P. L. Buse and Westley R. Weimer. 2010. Automatically documenting program changes. In Proceedings of the IEEE/ACM International Conference on Automated Software Engineering. 33–42.
[11]
Jiachi Chen, Xin Xia, David Lo, John Grundy, Xiapu Luo, and Ting Chen. 2020. Defining smart contract defects on Ethereum. IEEE Trans. Softw. Eng. (2020), 1–1.
[12]
Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014).
[13]
Luis Fernando Cortés-Coy, Mario Linares-Vásquez, Jairo Aponte, and Denys Poshyvanyk. 2014. On automatically generating commit messages via summarization of source code changes. In Proceedings of the IEEE 14th International Working Conference on Source Code Analysis and Manipulation. IEEE, 275–284.
[14]
Brian De Alwis and Jonathan Sillito. 2009. Why are software projects moving from centralized to decentralized version control systems? In Proceedings of the ICSE Workshop on Cooperative and Human Aspects on Software Engineering. IEEE, 36–39.
[15]
Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, and Tien N. Nguyen. 2013. Boa: A language and infrastructure for analyzing ultra-large-scale software repositories. In Proceedings of the 35th International Conference on Software Engineering (ICSE). IEEE, 422–431.
[16]
M. Amin Farajian, Marco Turchi, Matteo Negri, and Marcello Federico. 2017. Multi-domain neural machine translation through unsupervised adaptation. In Proceedings of the 2nd Conference on Machine Translation. 127–137.
[17]
Joseph L. Fleiss. 1971. Measuring nominal scale agreement among many raters. Psychol. Bull. 76, 5 (1971), 378.
[18]
Kartik Goyal, Chris Dyer, and Taylor Berg-Kirkpatrick. 2017. Differentiable scheduled sampling for credit assignment. arXiv preprint arXiv:1704.06970 (2017).
[19]
Jiatao Gu, Yong Wang, Kyunghyun Cho, and Victor O. K. Li. 2018. Search engine guided neural machine translation. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence.
[20]
Xiaodong Gu, Hongyu Zhang, Dongmei Zhang, and Sunghun Kim. 2016. Deep API learning. In Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering. 631–642.
[21]
Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computat. 9, 8 (1997), 1735–1780.
[22]
Xing Hu, Ge Li, Xin Xia, David Lo, and Zhi Jin. 2018. Deep code comment generation. In Proceedings of the IEEE/ACM 26th International Conference on Program Comprehension (ICPC). IEEE, 200–210.
[23]
Xing Hu, Ge Li, Xin Xia, David Lo, and Zhi Jin. 2020. Deep code comment generation with hybrid lexical and syntactical information. Empir. Softw. Eng. 25, 3 (2020), 2179–2217.
[24]
Yuan Huang, Qiaoyang Zheng, Xiangping Chen, Yingfei Xiong, Zhiyong Liu, and Xiaonan Luo. 2017. Mining version control system for automatically generating commit comment. In Proceedings of the ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). IEEE, 414–423.
[25]
Siyuan Jiang, Ameer Armaly, and Collin McMillan. 2017. Automatically generating commit messages from diffs using neural machine translation. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 135–146.
[26]
Siyuan Jiang and Collin McMillan. 2017. Towards automatic generation of short summaries of commits. In Proceedings of the IEEE/ACM 25th International Conference on Program Comprehension (ICPC). IEEE, 320–323.
[27]
Łukasz Kaiser and Ilya Sutskever. 2015. Neural GPUs learn algorithms. arXiv preprint arXiv:1511.08228 (2015).
[28]
Mira Kajko-Mattsson. 2005. A survey of documentation practice within corrective maintenance. Empir. Softw. Eng. 10, 1 (2005), 31–55.
[29]
Yoon Kim. 2014. Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014).
[30]
Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
[31]
Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, and Alexander M. Rush. 2017. OpenNMT: Open-source toolkit for neural machine translation. arXiv preprint arXiv:1701.02810 (2017).
[32]
Alexander LeClair, Siyuan Jiang, and Collin McMillan. 2019. A neural model for generating natural language summaries of program subroutines. In Proceedings of the IEEE/ACM 41st International Conference on Software Engineering (ICSE). IEEE, 795–806.
[33]
Alexander LeClair and Collin McMillan. 2019. Recommendations for datasets for source code summarization. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 3931–3937.
[34]
Xiaoqing Li, Jiajun Zhang, and Chengqing Zong. 2016. One sentence one model for neural machine translation. arXiv preprint arXiv:1609.06490 (2016).
[35]
Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Proceedings of the Workshop on Text Summarization Branches Out, Post-Conference Workshop of ACL.
[36]
Mario Linares-Vásquez, Luis Fernando Cortés-Coy, Jairo Aponte, and Denys Poshyvanyk. 2015. ChangeScribe: A tool for automatically generating commit messages. In Proceedings of the IEEE/ACM 37th IEEE International Conference on Software Engineering, Vol. 2. IEEE, 709–712.
[37]
Qin Liu, Zihe Liu, Hongming Zhu, Hongfei Fan, Bowen Du, and Yu Qian. 2019. Generating commit messages from diffs using pointer-generator network. In Proceedings of the IEEE/ACM 16th International Conference on Mining Software Repositories (MSR). IEEE, 299–309.
[38]
Shangqing Liu, Cuiyun Gao, Sen Chen, Lun Yiu Nie, and Yang Liu. 2019. ATOM: Commit message generation based on abstract syntax tree and hybrid ranking. arXiv preprint arXiv:1912.02972 (2019).
[39]
Zhongxin Liu, Xin Xia, Ahmed E. Hassan, David Lo, Zhenchang Xing, and Xinyu Wang. 2018. Neural-machine-translation-based commit message generation: How far are we? In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. 373–384.
[40]
Zhongxin Liu, Xin Xia, Christoph Treude, David Lo, and Shanping Li. 2019. Automatic generation of pull request descriptions. In Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 176–188.
[41]
Pablo Loyola, Edison Marrese-Taylor, Jorge Balazs, Yutaka Matsuo, and Fumiko Satoh. 2018. Content aware source code change description generation. In Proceedings of the 11th International Conference on Natural Language Generation. 119–128.
[42]
Pablo Loyola, Edison Marrese-Taylor, and Yutaka Matsuo. 2017. A neural architecture for generating natural language descriptions from source code changes. arXiv preprint arXiv:1704.04856 (2017).
[43]
Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025 (2015).
[44]
Walid Maalej and Hans-Jörg Happel. 2010. Can development work describe itself? In Proceedings of the 7th IEEE Working Conference on Mining Software Repositories (MSR). IEEE, 191–200.
[45]
Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Rose Finkel, Steven Bethard, and David McClosky. 2014. The Stanford CoreNLP natural language processing toolkit. In Proceedings of the 52nd Meeting of the Association for Computational Linguistics: System Demonstrations. 55–60.
[46]
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).
[47]
Audris Mockus and Lawrence G. Votta. 2000. Identifying reasons for software changes using historic databases. In Proceedings of the International Conference on Software Maintenance. 120–130.
[48]
Graham Neubig, Zi-Yi Dou, Junjie Hu, Paul Michel, Danish Pruthi, Xinyi Wang, and John Wieting. 2019. compare-mt: A tool for holistic comparison of language generation systems. arXiv preprint arXiv:1903.07926 (2019).
[49]
Yusuf Sulistyo Nugroho, Hideaki Hata, and Kenichi Matsumoto. 2020. How different are different diff algorithms in git? Empir. Softw. Eng. 25, 1 (2020), 790–823.
[50]
Yusuke Oda, Hiroyuki Fudaba, Graham Neubig, Hideaki Hata, Sakriani Sakti, Tomoki Toda, and Satoshi Nakamura. 2015. Learning to generate pseudo-code from source code using statistical machine translation (t). In Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 574–584.
[51]
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 311–318.
[52]
Marc’Aurelio Ranzato, Sumit Chopra, Michael Auli, and Wojciech Zaremba. 2015. Sequence level training with recurrent neural networks. arXiv preprint arXiv:1511.06732 (2015).
[53]
Mike Schuster and Kuldip K. Paliwal. 1997. Bidirectional recurrent neural networks. IEEE Trans. Sig. Process. 45, 11 (1997), 2673–2681.
[54]
Shikhar Sharma, Layla El Asri, Hannes Schulz, and Jeremie Zumer. 2017. Relevance of unsupervised metrics in task-oriented dialogue for evaluating natural language generation. CoRR abs/1706.09799 (2017).
[55]
Jinfeng Shen, Xiaobing Sun, Bin Li, Hui Yang, and Jiajun Hu. 2016. On automatic summarization of what and why information in source code changes. In Proceedings of the IEEE 40th Computer Software and Applications Conference (COMPSAC), Vol. 1. IEEE, 103–112.
[56]
Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu. 2015. Minimum risk training for neural machine translation. arXiv preprint arXiv:1512.02433 (2015).
[57]
Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 3104–3112.
[58]
Christoph Tillmann and Hermann Ney. 2003. Word reordering and a dynamic programming beam search algorithm for statistical machine translation. Computat. Ling. 29, 1 (2003), 97–133.
[59]
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 5998–6008.
[60]
Zhiyuan Wan, Xin Xia, Ahmed E. Hassan, David Lo, Jianwei Yin, and Xiaohu Yang. 2020. Perceptions, expectations, and challenges in defect prediction. IEEE Trans. Softw. Eng. 46, 11 (2020), 1241–1266.
[61]
Zhiyuan Wan, Xin Xia, David Lo, and Gail C. Murphy. 2019. How does machine learning change software development practices? IEEE Trans. Softw. Eng. (2019), 1–1.
[62]
Frank Wilcoxon. 1992. Individual comparisons by ranking methods. In Breakthroughs in Statistics. Springer, 196–202.
[63]
Ronald J. Williams. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 3-4 (1992), 229–256.
[64]
Ronald J. Williams and David Zipser. 1989. A learning algorithm for continually running fully recurrent neural networks. Neur. Computat. 1, 2 (1989), 270–280.
[65]
Sam Wiseman and Alexander M. Rush. 2016. Sequence-to-sequence learning as beam-search optimization. arXiv preprint arXiv:1606.02960 (2016).
[66]
Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey et al. 2016. Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016).
[67]
Bowen Xu, Zhenchang Xing, Xin Xia, and David Lo. 2017. AnswerBot: Automated generation of answer summary to developers’ technical questions. 706–716.
[68]
Shengbin Xu, Yuan Yao, Feng Xu, Tianxiao Gu, Hanghang Tong, and Jian Lu. 2019. Commit message generation for source code changes. In Proceedings of the 28th International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence Organization, Vol. 7. 3975–3981.
[69]
Jingyi Zhang, Masao Utiyama, Eiichiro Sumita, Graham Neubig, and Satoshi Nakamura. 2018. Guiding neural machine translation with retrieved translation pieces. arXiv preprint arXiv:1804.02559 (2018).
[70]
Jian Zhang, Xu Wang, Hongyu Zhang, Hailong Sun, and Xudong Liu. 2020. Retrieval-based neural source code summarization. In Proceedings of the 42nd International Conference on Software Engineering.
[71]
Wen Zhang, Yang Feng, Fandong Meng, Di You, and Qun Liu. 2019. Bridging the gap between training and inference for neural machine translation. arXiv preprint arXiv:1906.02448 (2019).

    Published In

    ACM Transactions on Software Engineering and Methodology, Volume 30, Issue 4
    Continuous Special Section: AI and SE
    October 2021
    613 pages
    ISSN:1049-331X
    EISSN:1557-7392
    DOI:10.1145/3461694
    Editor: Mauro Pezzè
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 23 July 2021
    Accepted: 01 May 2021
    Revised: 01 March 2021
    Received: 01 May 2020
    Published in TOSEM Volume 30, Issue 4

    Author Tags

    1. Commit message generation
    2. information retrieval
    3. neural machine translation

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Discovery Early Career Researcher Award (DECRA)
    • ARC Discovery
    • Key-Area Research and Development Program of Guangdong Province
    • ARC Laureate Fellowship
    • Key Research and Development Program of Zhejiang Province

    Cited By

    • (2024) CCAF: Learning Code Change via AdapterFusion. Proceedings of the 15th Asia-Pacific Symposium on Internetware, 219-228. DOI: 10.1145/3671016.3671399. Online publication date: 24-Jul-2024
    • (2024) Unveiling ChatGPT's Usage in Open Source Projects: A Mining-based Study. Proceedings of the 21st International Conference on Mining Software Repositories, 571-583. DOI: 10.1145/3643991.3644918. Online publication date: 2-Jul-2024
    • (2024) Only diff Is Not Enough: Generating Commit Messages Leveraging Reasoning and Action of Large Language Model. Proceedings of the ACM on Software Engineering 1:FSE, 745-766. DOI: 10.1145/3643760. Online publication date: 12-Jul-2024
    • (2024) KADEL: Knowledge-Aware Denoising Learning for Commit Message Generation. ACM Transactions on Software Engineering and Methodology 33:5, 1-32. DOI: 10.1145/3643675. Online publication date: 4-Jun-2024
    • (2024) Automatic Commit Message Generation: A Critical Review and Directions for Future Work. IEEE Transactions on Software Engineering 50:4, 816-835. DOI: 10.1109/TSE.2024.3364675. Online publication date: 12-Feb-2024
    • (2024) Code-centric learning-based just-in-time vulnerability detection. Journal of Systems and Software 214, 112014. DOI: 10.1016/j.jss.2024.112014. Online publication date: Aug-2024
    • (2024) A survey on machine learning techniques applied to source code. Journal of Systems and Software 209:C. DOI: 10.1016/j.jss.2023.111934. Online publication date: 14-Mar-2024
    • (2024) An exploratory study of software artifacts on GitHub from the lens of documentation. Information and Software Technology 169:C. DOI: 10.1016/j.infsof.2024.107425. Online publication date: 2-Jul-2024
    • (2024) Multilingual code refactoring detection based on deep learning. Expert Systems with Applications 258, 125164. DOI: 10.1016/j.eswa.2024.125164. Online publication date: Dec-2024
    • (2023) LoGenText-Plus: Improving Neural Machine Translation Based Logging Texts Generation with Syntactic Templates. ACM Transactions on Software Engineering and Methodology 33:2, 1-45. DOI: 10.1145/3624740. Online publication date: 22-Dec-2023