DOI:10.1145/3540250.3549088
research-article
Open access

23 shades of self-admitted technical debt: an empirical study on machine learning software

Published: 09 November 2022

Abstract

In software development, the term “technical debt” (TD) characterizes short-term solutions and workarounds implemented in source code that may incur a long-term cost. Technical debt takes a variety of forms and can thus affect multiple qualities of software, including but not limited to its legibility, performance, and structure. In this paper, we conduct a comprehensive study of technical debt in machine learning (ML) based software. TD can appear differently in ML software by infecting the data that ML models are trained on, thus affecting the functional behavior of ML systems. The growing inclusion of ML components in modern software systems has introduced a new set of TDs. Does ML software have TDs similar to those of traditional software? If not, what are the new types of ML-specific TDs? In which ML pipeline stages do these debts appear? Do these debts differ between ML tools and applications, and when do they get removed? Currently, we do not know the state of ML TDs in the wild. To address these questions, we mined 68,820 self-admitted technical debts (SATD) from all the revisions of a curated dataset consisting of 2,641 popular ML repositories from GitHub, along with their introduction and removal. By applying an open-coding scheme and building on prior work, we provide a comprehensive taxonomy of ML SATDs. Our study analyzes how ML SATD types are organized, how frequently they occur within the stages of ML software, and how they differ between applications and tools, and it quantifies the removal of ML SATDs. Our findings suggest implications for ML developers and researchers seeking to create maintainable ML systems.
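
To make the idea of mining SATD "along with their introduction and removal" concrete, the following is a minimal Python sketch, not the study's actual tooling. It assumes that SATD can be approximated by comments containing a few common markers (the list `SATD_MARKERS` is hypothetical; the abstract does not state which patterns were used) and that introduction and removal can be approximated by comparing the SATD comments of two revisions of a file. The helper names `find_satd_comments` and `diff_satd` are illustrative only.

```python
from __future__ import annotations

import io
import tokenize

# Hypothetical marker list; the abstract does not say which patterns the study used.
SATD_MARKERS = ("todo", "fixme", "hack", "xxx", "workaround")


def find_satd_comments(source: str) -> set[str]:
    """Return the text of comments in Python source that contain a marker."""
    found = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            text = tok.string.lstrip("#").strip()
            # Naive case-insensitive substring match; a real detector would be stricter.
            if any(marker in text.lower() for marker in SATD_MARKERS):
                found.add(text)
    return found


def diff_satd(old_source: str, new_source: str) -> tuple[set[str], set[str]]:
    """Compare two revisions of a file; return (introduced, removed) SATD comments."""
    old = find_satd_comments(old_source)
    new = find_satd_comments(new_source)
    return new - old, old - new


# Two toy revisions of the same file (only tokenized, never executed).
rev1 = "x = load()  # TODO: normalize features before training\ny = fit(x)\n"
rev2 = "x = normalize(load())\ny = fit(x)  # HACK: hard-coded learning rate\n"

introduced, removed = diff_satd(rev1, rev2)
print("introduced:", introduced)
print("removed:", removed)
```

Applied pairwise over consecutive revisions of each file, a comparison like this would yield the points at which each debt comment first appears and later disappears, which is the kind of data needed to ask when SATDs get removed.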

Information & Contributors

Information

Published In

ESEC/FSE 2022: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2022
1822 pages
ISBN:9781450394130
DOI:10.1145/3540250
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data science
  2. machine learning
  3. open-source
  4. technical debt

Qualifiers

  • Research-article

Conference

ESEC/FSE '22
Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Cited By

  • (2025) Bridging the language gap: an empirical study of bindings for open source machine learning libraries across software package ecosystems. Empirical Software Engineering 30:1. DOI:10.1007/s10664-024-10570-5. Online publication date: 1-Feb-2025
  • (2024) An Exploratory Study on Machine Learning Model Management. ACM Transactions on Software Engineering and Methodology 34:1 (1-31). DOI:10.1145/3688841. Online publication date: 16-Aug-2024
  • (2024) Contract-based Validation of Conceptual Design Bugs for Engineering Complex Machine Learning Software. Proceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems (155-161). DOI:10.1145/3652620.3688201. Online publication date: 22-Sep-2024
  • (2024) Self-Admitted Technical Debts Identification: How Far Are We? 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (804-815). DOI:10.1109/SANER60148.2024.00087. Online publication date: 12-Mar-2024
  • (2024) A Taxonomy of Self-Admitted Technical Debt in Deep Learning Systems. 2024 IEEE International Conference on Software Maintenance and Evolution (ICSME) (388-399). DOI:10.1109/ICSME58944.2024.00043. Online publication date: 6-Oct-2024
  • (2024) Why and how bug blocking relations are breakable. Information and Software Technology 166:C. DOI:10.1016/j.infsof.2023.107354. Online publication date: 4-Mar-2024
  • (2024) An empirical study on the effectiveness of large language models for SATD identification and classification. Empirical Software Engineering 29:6. DOI:10.1007/s10664-024-10548-3. Online publication date: 1-Oct-2024
  • (2023) An Exploratory Study on the Occurrence of Self-Admitted Technical Debt in Android Apps. 2023 ACM/IEEE International Conference on Technical Debt (TechDebt) (1-10). DOI:10.1109/TechDebt59074.2023.00007. Online publication date: May-2023
  • (2023) Unboxing Default Argument Breaking Changes in Scikit Learn. 2023 IEEE 23rd International Working Conference on Source Code Analysis and Manipulation (SCAM) (209-219). DOI:10.1109/SCAM59687.2023.00030. Online publication date: 2-Oct-2023
