
What Makes a Good TODO Comment?

Published: 28 June 2024

Abstract

Software development is a collaborative process that involves various interactions among individuals and teams. TODO comments in source code play a critical role in managing and coordinating diverse tasks during this process. However, this study finds that a large proportion of TODO comments in open-source projects are left unresolved or take a long time to resolve. About 46.7% of TODO comments in open-source repositories are of low quality (e.g., TODOs that are ambiguous, lack information, or are useless to developers), highlighting the need for better TODO practices. In this study, we investigate four aspects of TODO comment quality in open-source projects: (1) the prevalence of low-quality TODO comments; (2) the key characteristics of high-quality TODO comments; (3) how TODO comments of different quality are managed in practice; and (4) the feasibility of automatically assessing TODO comment quality. Examining 2,863 TODO comments from the top 100 GitHub Java repositories, we propose criteria to identify high-quality TODO comments and provide insights into their optimal composition. We discuss the lifecycle of TODO comments of varying quality. To assist developers, we construct deep learning-based methods that show promising performance in identifying the quality of TODO comments, potentially enhancing development efficiency and code quality.
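
The abstract refers to deep learning-based methods for assessing TODO comment quality; the paper's actual models and data pipeline are not reproduced on this page. As a rough, hypothetical sketch only, the snippet below extracts single-line // TODO comments from Java source with a simple regular expression and scores each one with a fine-tuned binary sequence classifier through the Hugging Face transformers API. The checkpoint name my-org/todo-quality-classifier, the regex, and the label convention (1 = high quality) are illustrative assumptions, not artifacts of the study.

```python
# Illustrative sketch only -- NOT the authors' published pipeline.
# Assumes `torch` and `transformers` are installed and that a fine-tuned
# binary classifier exists under the hypothetical checkpoint name below.
import re
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Matches single-line Java comments such as "// TODO: handle overflow".
TODO_PATTERN = re.compile(r"//\s*TODO[:\s](.*)", re.IGNORECASE)

def extract_todos(java_source: str) -> list[str]:
    """Collect the text of single-line // TODO comments from Java source."""
    return [m.group(1).strip() for m in TODO_PATTERN.finditer(java_source)]

def classify_quality(todos, model_name="my-org/todo-quality-classifier"):
    """Label each TODO as 'high' or 'low' quality (label 1 assumed to mean high)."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()
    labels = []
    with torch.no_grad():
        for todo in todos:
            inputs = tokenizer(todo, return_tensors="pt", truncation=True)
            logits = model(**inputs).logits
            labels.append("high" if logits.argmax(dim=-1).item() == 1 else "low")
    return labels

if __name__ == "__main__":
    source = 'int x = 0; // TODO: handle overflow when x exceeds Integer.MAX_VALUE'
    todos = extract_todos(source)
    print(list(zip(todos, classify_quality(todos))))
```

Classifying the comment text alone is a deliberate simplification; in practice the surrounding code context and the comment's history would likely also inform a quality judgment.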


Cited By

  • (2024) Just-In-Time TODO-Missed Commits Detection. IEEE Transactions on Software Engineering 50, 11, 2732–2752. https://doi.org/10.1109/TSE.2024.3405005. Online publication date: 1 November 2024.


Information

Published In

ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 6
July 2024
951 pages
EISSN: 1557-7392
DOI: 10.1145/3613693
Editor: Mauro Pezzé

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 June 2024
Online AM: 13 May 2024
Accepted: 03 May 2024
Revised: 20 March 2024
Received: 21 November 2023
Published in TOSEM Volume 33, Issue 6

Author Tags

  1. Documentation
  2. comment quality
  3. comment lifecycle

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Zhejiang Provincial Natural Science Foundation of China
  • ARC Laureate Fellowship
  • Zhejiang Province “JianBingLingYan+X” Research and Development Plan
  • Joint Funds of the Zhejiang Provincial Natural Science Foundation of China
  • Starry Night Science Fund of Zhejiang University Shanghai Institute for Advanced Study
  • Shanghai Sailing Program
  • Zhejiang Provincial Engineering Research Center for Real-time SmartTech in Urban Security Governance

Bibliometrics & Citations

Article Metrics

  • Downloads (Last 12 months): 290
  • Downloads (Last 6 weeks): 25
Reflects downloads up to 19 Feb 2025

