
KADEL: Knowledge-Aware Denoising Learning for Commit Message Generation

Published: 04 June 2024

Abstract

Commit messages are natural language descriptions of code changes and are important for software evolution tasks such as code understanding and maintenance. However, previous methods are trained on the entire dataset without considering that only a portion of commit messages adhere to good practice (i.e., good-practice commits) while the rest do not. Our empirical study shows that training on good-practice commits contributes significantly to commit message generation. Motivated by this finding, we propose KADEL, a novel knowledge-aware denoising learning method. Because good-practice commits constitute only a small proportion of the dataset, we align the remaining training samples with them. To achieve this, we propose a model that learns commit knowledge by training on good-practice commits; this knowledge model supplies additional information for the training samples that do not conform to good practice. Since this supplementary information may contain noise or prediction errors, we further propose a dynamic denoising training method that combines a distribution-aware confidence function with a dynamic distribution list, enhancing the effectiveness of the training process. Experimental results on the whole MCMD dataset demonstrate that our method achieves overall state-of-the-art performance compared with previous methods.
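The denoising idea sketched in the abstract, weighting each sample's knowledge-model supplement by a confidence derived from where its loss falls in the batch's loss distribution, can be illustrated roughly as follows. This is a minimal sketch under our own assumptions, not the authors' implementation: the function names, the sigmoid confidence form, and the blending rule are all hypothetical.

```python
import math

def distribution_aware_confidence(losses):
    """Map each sample's loss to a confidence weight in [0, 1], based on
    its standardized position in the batch's loss distribution
    (hypothetical form: lower loss relative to the batch -> higher
    confidence in the knowledge model's supplementary target)."""
    mean = sum(losses) / len(losses)
    std = math.sqrt(sum((l - mean) ** 2 for l in losses) / len(losses)) or 1.0
    # Squash the standardized loss through a sigmoid so high-loss
    # (likely noisy) supplements contribute less to the training signal.
    return [1.0 / (1.0 + math.exp((l - mean) / std)) for l in losses]

def denoised_batch_loss(gold_losses, knowledge_losses):
    """Blend the loss on the original commit message with the loss on the
    knowledge model's supplementary target, weighted per sample by the
    confidence assigned to that supplement."""
    conf = distribution_aware_confidence(knowledge_losses)
    per_sample = [
        (1.0 - c) * g + c * k
        for g, k, c in zip(gold_losses, knowledge_losses, conf)
    ]
    return sum(per_sample) / len(per_sample)
```

Under this sketch, a supplement whose loss sits well below the batch mean receives a weight near 1 and dominates that sample's training signal, while an outlier supplement is down-weighted toward the original (possibly low-quality) message.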


Cited By

  • (2025) Automated description generation for software patches. Information and Software Technology 177 (2025), 107543. DOI: 10.1016/j.infsof.2024.107543. Online publication date: Jan 2025.


    Published In

    ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 5
    June 2024, 952 pages
    EISSN: 1557-7392
    DOI: 10.1145/3618079
    Editor: Mauro Pezzè

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 04 June 2024
    Online AM: 29 January 2024
    Accepted: 15 January 2024
    Revised: 29 November 2023
    Received: 20 July 2023
    Published in TOSEM Volume 33, Issue 5


    Author Tags

    1. Commit message generation
    2. knowledge introducing
    3. denoising training

    Qualifiers

    • Research-article

