DOI: 10.1145/2814270.2814304
research-article

Automating grammar comparison

Published: 23 October 2015

Abstract

We consider, from a practical perspective, the problem of checking equivalence of context-free grammars. We present techniques for proving equivalence, as well as techniques for finding counter-examples that establish non-equivalence. Among the key building blocks of our approach is a novel algorithm for efficiently enumerating and sampling words and parse trees from arbitrary context-free grammars; the algorithm supports polynomial-time random access to words belonging to the grammar. Furthermore, we propose an algorithm for proving equivalence of context-free grammars that is complete for LL grammars, yet can be invoked on any context-free grammar, including ambiguous grammars. Our techniques successfully find discrepancies between different syntax specifications of several real-world languages, and are capable of detecting fine-grained incremental modifications performed on grammars. Our evaluation shows that our tool improves significantly on the existing state-of-the-art tools. In addition, we used these algorithms to develop an online tutoring system for grammars, which we then used in an undergraduate course on computer language processing. On questions involving grammar constructions, our system was able to automatically evaluate the correctness of 95% of the solutions submitted by students: it disproved 74% of the cases and proved 21% of them.
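
To make the notion of a counter-example to equivalence concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: it does not use the random-access enumeration the abstract describes, and instead brute-forces all words up to a length bound with a fixpoint computation. The grammar encoding (a dict from nonterminal to a list of right-hand-side tuples) and the names `words_up_to` and `find_counterexample` are assumptions made for illustration only.

```python
def words_up_to(grammar, start, max_len):
    """Return every word of length <= max_len derivable from `start`.

    `grammar` maps each nonterminal to a list of right-hand sides; a
    right-hand side is a tuple of symbols, and any symbol that is not a
    key of `grammar` is treated as a terminal. Words are tuples of terminals.
    """
    lang = {nt: set() for nt in grammar}
    changed = True
    while changed:  # terminates: the sets only grow and are bounded in size
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                partial = {()}  # words derivable from the prefix of rhs seen so far
                for sym in rhs:
                    choices = lang[sym] if sym in grammar else {(sym,)}
                    partial = {p + c for p in partial for c in choices
                               if len(p) + len(c) <= max_len}
                if not partial <= lang[nt]:
                    lang[nt] |= partial
                    changed = True
    return lang[start]


def find_counterexample(g1, s1, g2, s2, max_len):
    """Return a shortest word accepted by exactly one grammar, or None
    if the two grammars agree on all words up to max_len."""
    diff = words_up_to(g1, s1, max_len) ^ words_up_to(g2, s2, max_len)
    if not diff:
        return None
    return min(diff, key=lambda w: (len(w), w))


# a^n b^n for n >= 0, versus the same language without the empty word
anbn = {"S": [("a", "S", "b"), ()]}
anbn_nonempty = {"S": [("a", "S", "b"), ("a", "b")]}

# a* written right-recursively and left-recursively
right_rec = {"S": [("a", "S"), ()]}
left_rec = {"S": [("S", "a"), ()]}

print(find_counterexample(anbn, "S", anbn_nonempty, "S", 4))  # the empty word ()
print(find_counterexample(right_rec, "S", left_rec, "S", 5))  # None
```

A `None` result only means that no difference exists up to the chosen bound; it is not a proof of equivalence, which is why the abstract treats proving equivalence as a separate problem from finding counter-examples.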

Supplementary Material

Auxiliary Archive (p183-madhavan-s.zip)
A VM containing the executable implementation of the system described in the paper Automating Grammar Comparison, and the benchmarks used in the experimental study.


Information & Contributors

Information

Published In

OOPSLA 2015: Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications
October 2015
953 pages
ISBN:9781450336895
DOI:10.1145/2814270
  • ACM SIGPLAN Notices, Volume 50, Issue 10
    OOPSLA '15
    October 2015
    953 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/2858965
    Editor: Andy Gill
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Context-free grammars
  2. counter-examples
  3. equivalence
  4. proof system
  5. tutoring system

Qualifiers

  • Research-article

Conference

SPLASH '15
Acceptance Rates

Overall Acceptance Rate 268 of 1,244 submissions, 22%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 16
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 06 Feb 2025

Cited By

  • (2023) Automated Ambiguity Detection in Layout-Sensitive Grammars. Proceedings of the ACM on Programming Languages 7:OOPSLA2 (1150-1175). DOI: 10.1145/3622838. Online publication date: 16-Oct-2023
  • (2023) Context-Bounded Verification of Context-Free Specifications. Proceedings of the ACM on Programming Languages 7:POPL (2141-2170). DOI: 10.1145/3571266. Online publication date: 11-Jan-2023
  • (2023) Symbolic encoding of LL(1) parsing and its applications. Formal Methods in System Design 61:2-3 (338-379). DOI: 10.1007/s10703-023-00420-3. Online publication date: 22-Jun-2023
  • (2022) Grammar Inference for Ad Hoc Parsers. Companion Proceedings of the 2022 ACM SIGPLAN International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (38-42). DOI: 10.1145/3563768.3565550. Online publication date: 29-Nov-2022
  • (2022) Grammars for free. Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results (41-45). DOI: 10.1145/3510455.3512787. Online publication date: 21-May-2022
  • (2022) Grammars for Free: Toward Grammar Inference for Ad Hoc Parsers. 2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER) (41-45). DOI: 10.1109/ICSE-NIER55298.2022.9793523. Online publication date: May-2022
  • (2021) Efficient Equivalence Checking Technique for Some Classes of Finite-State Machines. Automatic Control and Computer Sciences 55:7 (670-701). DOI: 10.3103/S014641162107018X. Online publication date: 1-Dec-2021
  • (2021) Automatic grammar repair. Proceedings of the 14th ACM SIGPLAN International Conference on Software Language Engineering (126-142). DOI: 10.1145/3486608.3486910. Online publication date: 17-Oct-2021
  • (2019) Spectrum-based fault localization for context-free grammars. Proceedings of the 12th ACM SIGPLAN International Conference on Software Language Engineering (15-28). DOI: 10.1145/3357766.3359538. Online publication date: 20-Oct-2019
  • (2017) Flatten and conquer: a framework for efficient analysis of string constraints. ACM SIGPLAN Notices 52:6 (602-617). DOI: 10.1145/3140587.3062384. Online publication date: 14-Jun-2017
