
On the Significance of Category Prediction for Code-Comment Synchronization

Published: 29 March 2023

Editorial Notes

The authors have requested minor, non-substantive changes to the VoR and, in accordance with ACM policies, a Corrected Version of Record was published on May 11, 2023. For reference purposes, the VoR may still be accessed via the Supplemental Material section on this citation page.

Abstract

Software comments are not always updated promptly when the associated code changes. The resulting inconsistency between code and comments may mislead developers and lead to future bugs. Code-comment synchronization, which aims to automatically update comments in step with code changes, has therefore become an important research topic. Existing code-comment synchronization approaches fall into two types: (1) deep learning-based (e.g., CUP) and (2) heuristic-based (e.g., HebCUP). The former builds a semantic model structured as neural machine translation, which generalizes better when synchronizing comments as software evolves and grows. The latter, in contrast, applies a series of rules that perform token-level replacements on old comments, and can generate completely correct comments for the samples fully covered by its carefully designed heuristic rules. In this article, we propose a composite approach named CBS (Classifying Before Synchronizing) that further improves code-comment synchronization performance by combining the advantages of CUP and HebCUP with the help of inferred categories of Code-Comment Inconsistent (CCI) samples. Specifically, we first define two categories of CCI samples (heuristic-prone and non-heuristic-prone) and propose five features to assist category prediction. Samples whose comments can be correctly synchronized by HebCUP are heuristic-prone; all others are non-heuristic-prone. CBS then employs our proposed Multi-Subsets Ensemble Learning (MSEL) classification algorithm to alleviate the class imbalance problem and construct the category prediction model. Next, CBS uses the trained MSEL model to predict the category of a new sample. If the predicted category is heuristic-prone, CBS employs HebCUP to synchronize the sample's comment; otherwise, CBS assigns the sample to CUP.
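The classify-then-dispatch flow described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `CCISample` fields, the function names, and the three toy stand-ins (for the MSEL category predictor, HebCUP, and CUP) are hypothetical and exist only to show the routing logic.

```python
from dataclasses import dataclass
from typing import Callable

HEURISTIC_PRONE = "heuristic-prone"
NON_HEURISTIC_PRONE = "non-heuristic-prone"


@dataclass(frozen=True)
class CCISample:
    """A code-comment inconsistent sample (field names are assumptions)."""
    old_code: str
    new_code: str
    old_comment: str


Predictor = Callable[[CCISample], str]
Synchronizer = Callable[[CCISample], str]


def cbs_synchronize(
    sample: CCISample,
    predict_category: Predictor,
    heuristic_sync: Synchronizer,
    neural_sync: Synchronizer,
) -> str:
    """Route the sample to the heuristic or the neural synchronizer."""
    if predict_category(sample) == HEURISTIC_PRONE:
        return heuristic_sync(sample)
    return neural_sync(sample)


# --- Toy stand-ins, purely illustrative -------------------------------
def toy_predict(sample: CCISample) -> str:
    # Hypothetical rule: heuristic-prone if the comment shares a token
    # with the old code. The real predictor is a trained classifier.
    code_tokens = set(sample.old_code.split())
    shared = any(tok in code_tokens for tok in sample.old_comment.split())
    return HEURISTIC_PRONE if shared else NON_HEURISTIC_PRONE


def toy_heuristic(sample: CCISample) -> str:
    return "heuristic:" + sample.old_comment


def toy_neural(sample: CCISample) -> str:
    return "neural:" + sample.old_comment


s1 = CCISample("int count = 0", "int total = 0", "reset count")
s2 = CCISample("return a + b", "return a * b", "adds two numbers")
out1 = cbs_synchronize(s1, toy_predict, toy_heuristic, toy_neural)
out2 = cbs_synchronize(s2, toy_predict, toy_heuristic, toy_neural)
```

Passing the predictor and both synchronizers as callables keeps the dispatch step independent of how each component is trained or implemented.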
Our extensive experiments demonstrate that CBS outperforms CUP and HebCUP with statistical significance, obtaining average improvements of 23.47%, 22.84%, 3.04%, 3.04%, 1.64%, and 19.39% in terms of Accuracy, Recall@5, Average Edit Distance (AED), Relative Edit Distance (RED), BLEU-4, and Effective Synchronized Sample (ESS) ratio, respectively. These results highlight that category prediction for CCI samples can boost code-comment synchronization performance.
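As a rough illustration of how some of these metrics might be computed, the sketch below implements Accuracy, Recall@k, AED, and RED in Python. The abstract does not define them, so the definitions here are assumptions based on common usage in comment-update work: Accuracy counts exact matches with the reference comment, Recall@k checks whether the reference appears among the top k candidates, AED averages the word-level edit distance to the reference, and RED divides that distance by the distance from the old comment to the reference. All function names are hypothetical.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute
        prev = curr
    return prev[-1]


def accuracy(preds: Sequence[str], refs: Sequence[str]) -> float:
    """Share of predictions identical to the reference comment."""
    return sum(p == r for p, r in zip(preds, refs)) / len(refs)


def recall_at_k(candidates: Sequence[Sequence[str]],
                refs: Sequence[str], k: int = 5) -> float:
    """Share of samples whose reference is among the top-k candidates."""
    return sum(r in c[:k] for c, r in zip(candidates, refs)) / len(refs)


def average_edit_distance(preds: Sequence[str],
                          refs: Sequence[str]) -> float:
    """Mean word-level edit distance from prediction to reference."""
    total = sum(edit_distance(p.split(), r.split())
                for p, r in zip(preds, refs))
    return total / len(refs)


def relative_edit_distance(preds: Sequence[str], olds: Sequence[str],
                           refs: Sequence[str]) -> float:
    """Mean ratio of ED(pred, ref) to ED(old, ref) (assumed definition)."""
    ratios = []
    for p, o, r in zip(preds, olds, refs):
        base = edit_distance(o.split(), r.split())
        if base:  # skip samples whose old comment already matches
            ratios.append(edit_distance(p.split(), r.split()) / base)
    return sum(ratios) / len(ratios)


olds = ["returns the count", "adds a and b"]
refs = ["returns the total", "multiplies a and b"]
preds = ["returns the total", "adds a and b"]
top_k = [["returns the total"], ["adds a and b", "multiplies a and b"]]

acc = accuracy(preds, refs)
rec = recall_at_k(top_k, refs, k=5)
aed = average_edit_distance(preds, refs)
red = relative_edit_distance(preds, olds, refs)
```

Under these assumed definitions, lower AED and RED are better, and a RED below 1 means the generated comment is closer to the reference than the old comment was.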

Supplementary Material

3534117-vor (3534117-vor.pdf)
Version of Record for "On the Significance of Category Prediction for Code-Comment Synchronization" by Yang et al., ACM Transactions on Software Engineering and Methodology, Volume 32, No. 2 (TOSEM 32:2).



    Published In

    ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 2
    March 2023
    946 pages
    ISSN: 1049-331X
    EISSN: 1557-7392
    DOI: 10.1145/3586025
    Editor: Mauro Pezzè

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 29 March 2023
    Online AM: 24 May 2022
    Accepted: 27 April 2022
    Revised: 24 March 2022
    Received: 07 December 2021
    Published in TOSEM Volume 32, Issue 2

    Author Tags

    1. Code-comment synchronization
    2. category classification
    3. deep learning
    4. heuristic rules

    Qualifiers

    • Research-article

    Funding Sources

    • General Research Fund (GRF)
    • Research Grants Council of Hong Kong
    • City University of Hong Kong
    • National Natural Science Foundation of China
    • Singapore National Research Foundation and National University of Singapore
    • National Satellite of Excellence in Trustworthy Software Systems (NSOE-TSS)
    • Trustworthy Software Systems Core Technologies Grant (TSSCTG)
    • The Natural Science Foundation of Chongqing City
