research-article

Spectrum-based fault localization for context-free grammars

Authors:

Moeketsi Raselimo,

Bernd Fischer

SLE 2019: Proceedings of the 12th ACM SIGPLAN International Conference on Software Language Engineering

Pages 15 - 28

https://doi.org/10.1145/3357766.3359538

Published: 20 October 2019

Abstract

We describe and evaluate the first spectrum-based fault localization method aimed at finding faulty rules in a context-free grammar. It takes as input a test suite and a modified parser for the grammar that can collect grammar spectra, i.e., the sets of rules used in attempts to parse the individual test cases, and returns as output a ranked list of suspicious rules. We show how grammar spectra can be collected for both LL and LR parsers, and how the ANTLR and CUP parser generators can be modified and used to automate the collection of the grammar spectra. We evaluate our method on grammars with seeded faults, as well as on real-world grammars and on student grammars submitted in compiler engineering courses, both of which contain real faults. The results show that our method ranks the seeded faults within the top five rules in more than half of the cases and can pinpoint them in 10%–40% of the cases. On average, it ranks the faults at around 25% of all rules, and better than 15% for a very large test suite. It also allowed us to identify deviations and faults in the real-world and student grammars.
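The ranking step described in the abstract can be illustrated with a minimal sketch. It assumes that each test case yields a grammar spectrum (the set of rules used in the parse attempt) together with a pass/fail verdict, and it scores rules with the Ochiai coefficient, a common choice in spectrum-based fault localization. The abstract does not say which ranking metric the method uses, so the metric, the function name `ochiai_ranking`, and the rule names below are illustrative assumptions rather than the authors' implementation.

```python
import math
from typing import List, Set, Tuple

# One spectrum per test case: (rules used while parsing, did the test pass?)
Spectrum = Tuple[Set[str], bool]


def ochiai_ranking(spectra: List[Spectrum]) -> List[Tuple[str, float]]:
    """Rank grammar rules by Ochiai suspiciousness (most suspicious first).

    For a rule r: score(r) = ef / sqrt(total_failed * (ef + ep)), where
    ef / ep count the failing / passing tests whose spectrum contains r.
    """
    total_failed = sum(1 for _, passed in spectra if not passed)
    all_rules: Set[str] = set().union(*(used for used, _ in spectra))

    scores = {}
    for rule in all_rules:
        ef = sum(1 for used, passed in spectra if rule in used and not passed)
        ep = sum(1 for used, passed in spectra if rule in used and passed)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[rule] = ef / denom if denom else 0.0

    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical spectra for a toy grammar with rules expr, term, factor, stmt.
spectra = [
    ({"expr", "term", "factor"}, True),
    ({"expr", "term"}, True),
    ({"expr", "stmt", "factor"}, False),
    ({"stmt"}, False),
]

ranking = ochiai_ranking(spectra)
for rule, score in ranking:
    print(f"{rule:7s} {score:.3f}")
# stmt    1.000
# factor  0.500
# expr    0.408
# term    0.000
```

In this toy run the rule `stmt` appears in every failing test and in no passing test, so it is ranked first. Other coefficients could be substituted by changing only the score expression.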

Index Terms

Spectrum-based fault localization for context-free grammars
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        1. Software testing and debugging
  2. Software notations and tools
    1. Compilers
      1. Parsers
    2. Formal language definitions
      1. Syntax
2. Theory of computation
  1. Formal languages and automata theory
    1. Grammars and context-free languages

Published In

SLE 2019: Proceedings of the 12th ACM SIGPLAN International Conference on Software Language Engineering

October 2019

215 pages

ISBN:9781450369817

DOI:10.1145/3357766

General Chair: Oscar Nierstrasz (University of Bern, Switzerland)

Program Chairs: Jeff Gray (University of Alabama, USA), Bruno C. d. S. Oliveira (University of Hong Kong, China)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tag

Spectrum-based fault localization

Qualifiers

Research-article

Funding Sources

National Research Foundation

Conference

SLE '19

Sponsor:

SIGPLAN

SLE '19: 12th ACM SIGPLAN International Conference on Software Language Engineering

October 20 - 22, 2019

Athens, Greece
