DOI: 10.1145/3377811.3380378 — ICSE Conference Proceedings
Research article (Open access)

Repairing deep neural networks: fix patterns and challenges

Published: 01 October 2020

Abstract

Significant interest in applying Deep Neural Networks (DNNs) has fueled the need to support the engineering of software that uses them. Repairing software that uses DNNs is one such unmistakable SE need where automated tools could be beneficial; however, we do not fully understand the challenges of repairing DNNs or the patterns used in manual repairs. What challenges should automated repair tools address? What are the repair patterns whose automation could help developers? Which repair patterns should be assigned a higher priority for building automated bug repair tools? This work presents a comprehensive study of bug fix patterns to address these questions. We have studied 415 repairs from Stack Overflow and 555 repairs from GitHub for five popular deep learning libraries (Caffe, Keras, TensorFlow, Theano, and Torch) to understand the challenges in repairs and the bug repair patterns. Our key findings reveal that DNN bug fix patterns are distinctive compared to traditional bug fix patterns; the most common fix patterns are fixing data dimensions and neural network connectivity; DNN bug fixes have the potential to introduce adversarial vulnerabilities; DNN bug fixes frequently introduce new bugs; and DNN bug localization, reuse of trained models, and coping with frequent releases are the major challenges developers face when fixing bugs. We also contribute a benchmark of 667 DNN (bug, repair) instances.
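The most common repair pattern the abstract reports, fixing data dimensions, can be illustrated with a minimal sketch. This is a hypothetical NumPy example (not code from the paper): a stand-in layer expects 4-D image input of shape (batch, 28, 28, 1), the data arrives flattened as (batch, 784), and the repair reshapes it before the call.

```python
# Hypothetical sketch of the "fix data dimension" repair pattern.
import numpy as np

def forward(images):
    """Stand-in for a DNN layer that requires (batch, 28, 28, 1) input."""
    if images.ndim != 4 or images.shape[1:] != (28, 28, 1):
        raise ValueError(f"expected (batch, 28, 28, 1), got {images.shape}")
    return images.mean(axis=(1, 2, 3))  # dummy per-image computation

batch = np.zeros((32, 784))  # flattened MNIST-style batch

# Buggy call: forward(batch) raises ValueError because of the 2-D shape.
# Repaired call: reshape the data to the layer's expected dimensions.
fixed = batch.reshape(-1, 28, 28, 1)
out = forward(fixed)
print(out.shape)  # (32,)
```

In real bug reports this same repair typically appears as a `reshape`, `expand_dims`, or `input_shape` change at the boundary between data loading and the first layer of the network.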



Published In

ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering
June 2020
1640 pages
ISBN:9781450371216
DOI:10.1145/3377811
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • KIISE: Korean Institute of Information Scientists and Engineers
  • IEEE CS

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. bug fix
  2. bug fix patterns
  3. bugs
  4. deep neural networks


Acceptance Rates

Overall acceptance rate: 276 of 1,856 submissions (15%)


Cited By

  • (2024) Mutation-Based Deep Learning Framework Testing Method in JavaScript Environment. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 970-981. DOI: 10.1145/3691620.3695478. Online publication date: 27-Oct-2024.
  • (2024) A Conceptual Framework for Quality Assurance of LLM-based Socio-critical Systems. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 2314-2318. DOI: 10.1145/3691620.3695306. Online publication date: 27-Oct-2024.
  • (2024) Keeper: Automated Testing and Fixing of Machine Learning Software. ACM Transactions on Software Engineering and Methodology 33(7), 1-33. DOI: 10.1145/3672451. Online publication date: 13-Jun-2024.
  • (2024) Mining Fix Patterns for System Interaction Bugs. Proceedings of the 15th Asia-Pacific Symposium on Internetware, 367-376. DOI: 10.1145/3671016.3671398. Online publication date: 24-Jul-2024.
  • (2024) A Miss Is as Good as A Mile: Metamorphic Testing for Deep Learning Operators. Proceedings of the ACM on Software Engineering 1(FSE), 2005-2027. DOI: 10.1145/3660796. Online publication date: 12-Jul-2024.
  • (2024) Characterizing Deep Learning Package Supply Chains in PyPI: Domains, Clusters, and Disengagement. ACM Transactions on Software Engineering and Methodology 33(4), 1-27. DOI: 10.1145/3640336. Online publication date: 10-Jan-2024.
  • (2024) Beyond Accuracy: An Empirical Study on Unit Testing in Open-source Deep Learning Projects. ACM Transactions on Software Engineering and Methodology 33(4), 1-22. DOI: 10.1145/3638245. Online publication date: 18-Apr-2024.
  • (2024) A Post-training Framework for Improving the Performance of Deep Learning Models via Model Transformation. ACM Transactions on Software Engineering and Methodology 33(3), 1-41. DOI: 10.1145/3630011. Online publication date: 15-Mar-2024.
  • (2024) Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 1-13. DOI: 10.1145/3597503.3639226. Online publication date: 20-May-2024.
  • (2024) Inferring Data Preconditions from Deep Learning Models for Trustworthy Prediction in Deployment. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 1-13. DOI: 10.1145/3597503.3623333. Online publication date: 20-May-2024.
