research-article

Perses: syntax-guided program reduction

Authors:

Zhendong Su

ICSE '18: Proceedings of the 40th International Conference on Software Engineering

Pages 361 - 371

https://doi.org/10.1145/3180155.3180236

Published: 27 May 2018

Abstract

Given a program P that exhibits a certain property Ψ (e.g., a C program that crashes GCC when it is compiled), the goal of program reduction is to minimize P to a smaller variant P′ that still exhibits the same property, i.e., Ψ(P′) holds. Program reduction is important and in wide demand for testing and debugging. For example, all compiler/interpreter development projects need effective program reduction to minimize failure-inducing test programs and thereby ease debugging. However, state-of-the-art program reduction techniques, notably Delta Debugging (DD), Hierarchical Delta Debugging (HDD), and C-Reduce, either do not perform well in terms of speed (reduction time) and quality (size of reduced programs), or are highly customized for certain languages and thus lack generality.
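To make the problem statement concrete, the following is a minimal sketch of a line-based reducer driven by a property oracle Ψ. It is a simplified, greedy relative of Delta Debugging, not the DD algorithm as published and not Perses. The function name `reduce_by_chunks`, the toy program, and the toy oracle `psi` are all hypothetical and chosen only for illustration; a real oracle would, for example, invoke a compiler and check for a crash.

```python
from typing import Callable, List, Sequence


def reduce_by_chunks(items: Sequence[str],
                     psi: Callable[[List[str]], bool]) -> List[str]:
    """Greedily delete chunks of lines while the property psi still holds.

    Chunk size starts at half the input and is halved down to 1. At chunk
    size 1 the sweep repeats until no single line can be removed.
    """
    current = list(items)
    if not psi(current):
        raise ValueError("the input must exhibit the property")
    chunk = max(len(current) // 2, 1)
    while True:
        changed = False
        i = 0
        while i < len(current):
            # Candidate variant: current program without lines [i, i + chunk).
            candidate = current[:i] + current[i + chunk:]
            if psi(candidate):
                current = candidate      # keep the smaller variant
                changed = True
            else:
                i += chunk               # this chunk is needed, move on
        if chunk > 1:
            chunk //= 2
        elif not changed:
            return current


# Toy stand-in for "the compiler crashes": both lines must be present.
program = ["a = 0", "x = 1", "b = 2", "c = 3", "boom(x)", "d = 4"]
psi = lambda lines: "x = 1" in lines and "boom(x)" in lines

print(reduce_by_chunks(program, psi))  # ['x = 1', 'boom(x)']
```

Because this sketch treats the program as a flat list of lines, nothing prevents it from proposing variants that no longer parse; each such variant still costs one oracle call.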

This paper presents Perses, a novel framework for effective, efficient, and general program reduction. The key insight is to exploit, in a general manner, the formal syntax of the programs under reduction and to ensure that each reduction step considers only smaller, syntactically valid variants, thereby avoiding futile effort on syntactically invalid variants. Our framework supports not only deletion (as in DD and HDD), but also general, effective program transformations.
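The idea of considering only syntactically valid variants can be illustrated with a toy sketch. It assumes a hypothetical parse-tree encoding in which a node of kind `"star"` corresponds to a repetition in the grammar (any subset of its children is valid), while a `"seq"` node has a fixed shape and must stay intact. The `Node` class, the node kinds, and `reduce_tree` are illustrative assumptions, not Perses's data structures or algorithm, and the sketch covers deletion only.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    kind: str                      # "token", "seq" (fixed shape), or "star" (repetition)
    text: str = ""                 # used only by tokens
    children: List["Node"] = field(default_factory=list)


def unparse(node: Node) -> str:
    """Turn a tree back into source text, skipping empty subtrees."""
    if node.kind == "token":
        return node.text
    parts = (unparse(child) for child in node.children)
    return " ".join(p for p in parts if p)


def reduce_tree(root: Node, psi: Callable[[str], bool]) -> Node:
    """Delete children only under 'star' nodes, so every variant still parses."""
    worklist = [root]
    while worklist:
        node = worklist.pop()
        if node.kind == "star":
            i = 0
            while i < len(node.children):
                removed = node.children.pop(i)
                if not psi(unparse(root)):
                    node.children.insert(i, removed)   # needed: put it back
                    i += 1
        worklist.extend(node.children)
    return root


def stmt(*tokens: str) -> Node:
    return Node("seq", children=[Node("token", t) for t in tokens])


block = Node("seq", children=[
    Node("token", "{"),
    Node("star", children=[stmt("b", "=", "2", ";"),
                           stmt("boom", "(", "x", ")", ";")]),
    Node("token", "}"),
])
program = Node("star", children=[stmt("a", "=", "0", ";"),
                                 stmt("x", "=", "1", ";"),
                                 block])
psi = lambda src: "x = 1 ;" in src and "boom ( x ) ;" in src

reduced = reduce_tree(program, psi)
print(unparse(reduced))  # x = 1 ; { boom ( x ) ; }
```

In this toy, the braces of the block are never tried in isolation, because they sit under a fixed-shape `"seq"` node; a token-level reducer would spend oracle calls on such unbalanced variants.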

We have designed and implemented Perses and evaluated it in two language settings: C and Java. Our evaluation results on 20 C programs triggering bugs in GCC and Clang demonstrate Perses's strong practicality compared to the state of the art: (1) smaller size: Perses's results are, respectively, 2% and 45% of the size of those from DD and HDD; and (2) shorter reduction time: Perses takes 23% and 47% of the time taken by DD and HDD, respectively. Even when compared to the highly customized and optimized C-Reduce for C/C++, Perses takes only 38-60% of the reduction time.

Index Terms

Perses: syntax-guided program reduction
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging

Published In

ICSE '18: Proceedings of the 40th International Conference on Software Engineering

May 2018

1307 pages

ISBN:9781450356381

DOI:10.1145/3180155

Conference Chair: Michel Chaudron (Chalmers University of Technology, University of Gothenburg, Sweden)

General Chair: Ivica Crnkovic (Chalmers University of Technology, University of Gothenburg, Sweden)

Program Chairs: Marsha Chechik (University of Toronto, Canada), Mark Harman (Facebook and University College London, United Kingdom)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering
IEEE-CS: Computer Society

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

NSF

Conference

ICSE '18

Sponsor:

SIGSOFT
IEEE-CS

ICSE '18: 40th International Conference on Software Engineering

May 27 - June 3, 2018

Gothenburg, Sweden

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
