DOI: 10.1145/3468264.3468625
research-article

Probabilistic Delta Debugging

Published: 18 August 2021

Abstract

The delta debugging problem concerns how to reduce an object while preserving a certain property. It arises in many applications, such as compiler development, regression fault localization, and software debloating. Given its importance, multiple algorithms have been proposed to solve the problem efficiently and effectively. However, the efficiency and effectiveness of state-of-the-art algorithms are still unsatisfactory. For example, the state-of-the-art delta debugging tool CHISEL may take up to 3 hours to reduce a single program with 14,092 lines of code, and the reduced program may still be up to twice as large as necessary.
In this paper, we propose a probabilistic delta debugging algorithm, named ProbDD, to improve the efficiency and effectiveness of delta debugging. Our key insight is that ddmin, the basic algorithm upon which many existing approaches are built, follows a predefined sequence of attempts to remove elements from a sequence and fails to utilize the information from earlier test results. To address this problem, ProbDD builds a probabilistic model that estimates the probability of each element being kept in the produced result, selects a set of elements that maximizes the gain of the next test based on the model, and improves the model based on the test results.
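The model-select-update loop described above can be illustrated with a small Python sketch. It is not the authors' implementation: the names `probdd_sketch`, `keeps_property`, and `p0`, the initial probability of 0.1, the expected-gain formula (number of removed elements times the probability that all of them are removable), and the Bayesian update under an independence assumption are all assumptions made for illustration.

```python
from math import prod
from typing import Callable, List, Sequence


def probdd_sketch(items: Sequence, keeps_property: Callable[[List], bool],
                  p0: float = 0.1) -> List:
    """Reduce `items` while `keeps_property(subset)` stays True (illustrative only)."""
    # p[i]: estimated probability that items[i] must stay in the result.
    # The initial value p0 is an assumed prior, not a value taken from the paper.
    p = {i: p0 for i in range(len(items))}
    kept = set(p)  # indices of elements still present

    while any(p[i] < 1.0 for i in kept):
        # Removal candidates, lowest keep-probability first.
        candidates = sorted((i for i in kept if p[i] < 1.0), key=lambda i: p[i])

        # Pick the prefix that maximizes the assumed expected gain:
        # (number of removed elements) * P(all of them are removable).
        best, best_gain, success = [], 0.0, 1.0
        for k, i in enumerate(candidates, start=1):
            success *= 1.0 - p[i]
            if k * success > best_gain:
                best, best_gain = candidates[:k], k * success

        removed = set(best)
        trial = [items[i] for i in sorted(kept - removed)]
        if keeps_property(trial):
            # The property survives, so the removed elements are gone for good.
            kept -= removed
        elif len(removed) == 1:
            # Removing this single element broke the property: it must stay.
            p[best[0]] = 1.0
        else:
            # At least one removed element is needed; raise each probability
            # to its posterior given that fact (independence assumed).
            fail = 1.0 - prod(1.0 - p[i] for i in removed)
            for i in removed:
                p[i] = min(1.0, p[i] / fail)

    return [items[i] for i in sorted(kept)]
```

For example, `probdd_sketch(list(range(20)), lambda s: {3, 11} <= set(s))` reduces the list to `[3, 11]`. Unlike a fixed removal schedule, each failed test here changes which elements are grouped together in the next attempt.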
We prove the correctness of ProbDD, and analyze the minimality of its result and the asymptotic number of tests in the worst case. In the worst case, ProbDD requires O(n) tests, which is asymptotically fewer than the O(n^2) tests required by ddmin. Furthermore, we experimentally compared ProbDD with ddmin on 40 subjects in HDD and CHISEL, two approaches that wrap ddmin for reducing trees and C programs, respectively. The results show that, after replacing ddmin with ProbDD, HDD and CHISEL produce 59.48% and 11.51% smaller results and use 63.22% and 45.27% less time, respectively.
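For comparison, the ddmin baseline can be sketched in a few lines of Python. This is a simplified, complement-only variant written for illustration, not the implementation wrapped by HDD or CHISEL; the names `ddmin_sketch` and `keeps_property` and the test counter are assumptions added to make the number of tests observable.

```python
from typing import Callable, List, Sequence, Tuple


def ddmin_sketch(items: Sequence,
                 keeps_property: Callable[[List], bool]) -> Tuple[List, int]:
    """Return (reduced list, number of tests run); simplified ddmin variant."""
    current = list(items)
    n = 2      # granularity: number of chunks the list is split into
    tests = 0  # how many times the property was checked

    while len(current) >= 2:
        chunk = len(current) // n  # n never exceeds len(current), so chunk >= 1
        reduced = False
        # Try removing each chunk in a fixed left-to-right order.
        for start in range(0, len(current), chunk):
            complement = current[:start] + current[start + chunk:]
            tests += 1
            if keeps_property(complement):
                current = complement
                n = max(n - 1, 2)  # coarsen slightly after a successful removal
                reduced = True
                break
        if not reduced:
            if n == len(current):
                break  # single-element removals all failed: nothing more to do
            n = min(2 * n, len(current))  # refine the granularity

    return current, tests
```

Calling `ddmin_sketch(list(range(20)), lambda s: {3, 11} <= set(s))` returns `[3, 11]` together with the number of tests it ran. The order in which chunks are tried depends only on the current granularity, which is the fixed schedule that the abstract contrasts with ProbDD's model-driven selection.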

Published In

ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
August 2021
1690 pages
ISBN: 9781450385626
DOI: 10.1145/3468264
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Delta Debugging
  2. Probabilistic Model

Qualifiers

  • Research-article

Conference

ESEC/FSE '21

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 94
  • Downloads (Last 6 weeks): 10
Reflects downloads up to 20 Feb 2025

Cited By

  • (2025) Structuring Semantic‐Aware Relations Between Bugs and Patches for Accurate Patch Evaluation. Journal of Software: Evolution and Process 37:2. DOI: 10.1002/smr.70001. Online publication date: 2-Feb-2025
  • (2024) Validity-Preserving Delta Debugging via Generator Trace Reduction. ACM Transactions on Software Engineering and Methodology 34:3 (1-33). DOI: 10.1145/3705305. Online publication date: 23-Dec-2024
  • (2024) T-Rec: Fine-Grained Language-Agnostic Program Reduction Guided by Lexical Syntax. ACM Transactions on Software Engineering and Methodology 34:2 (1-31). DOI: 10.1145/3690631. Online publication date: 30-Aug-2024
  • (2024) LLMEffiChecker: Understanding and Testing Efficiency Degradation of Large Language Models. ACM Transactions on Software Engineering and Methodology 33:7 (1-38). DOI: 10.1145/3664812. Online publication date: 26-Aug-2024
  • (2024) Calico: Automated Knowledge Calibration and Diagnosis for Elevating AI Mastery in Code Tasks. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (1785-1797). DOI: 10.1145/3650212.3680399. Online publication date: 11-Sep-2024
  • (2024) Towards Understanding the Bugs in Solidity Compiler. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (1312-1324). DOI: 10.1145/3650212.3680362. Online publication date: 11-Sep-2024
  • (2024) Isolation-Based Debugging for Neural Networks. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (338-349). DOI: 10.1145/3650212.3652132. Online publication date: 11-Sep-2024
  • (2024) C2D2: Extracting Critical Changes for Real-World Bugs with Dependency-Sensitive Delta Debugging. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (300-312). DOI: 10.1145/3650212.3652129. Online publication date: 11-Sep-2024
  • (2024) LPR: Large Language Models-Aided Program Reduction. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (261-273). DOI: 10.1145/3650212.3652126. Online publication date: 11-Sep-2024
  • (2024) Automatic Debugging of Design Faults in MapReduce Applications. IEEE Transactions on Software Engineering 50:4 (956-978). DOI: 10.1109/TSE.2024.3369766. Online publication date: 26-Feb-2024
