PAVT: a tool to visualize and teach parsing algorithms

Sangal, Somya; Kataria, Shreya; Tyagi, Twishi; Gupta, Nidhi; Kirtani, Yukti; Agrawal, Shivli; Chakraborty, Pinaki

doi:10.1007/s10639-018-9739-x


Published: 24 May 2018

Volume 23, pages 2737–2764, (2018)

Education and Information Technologies

Somya Sangal¹,
Shreya Kataria¹,
Twishi Tyagi¹,
Nidhi Gupta²,
Yukti Kirtani¹,
Shivli Agrawal¹ &
…
Pinaki Chakraborty ORCID: orcid.org/0000-0002-2010-8022¹


Abstract

A parsing algorithm visualizer is a tool that visualizes the construction of a parser for a given context-free grammar and then illustrates the use of that parser to parse a given string. Parsing algorithm visualizers are used to teach the course on compiler construction, which is invariably included in all undergraduate computer science curricula. This paper presents a new parsing algorithm visualizer that can visualize six parsing algorithms, viz. predictive parsing, simple LR parsing, canonical LR parsing, look-ahead LR parsing, Earley parsing and CYK parsing. The tool logically explains the process of parsing, showing the calculations involved in each step. The output of the tool has been structured to maximize the learning outcomes and contains important constructs like FIRST and FOLLOW sets, item sets, parsing table, parse tree and leftmost or rightmost derivation, depending on the algorithm being visualized. The tool has been used to teach the course on compiler construction at both undergraduate and graduate levels. Overall positive feedback was received from the students, with 89% of them saying that the tool helped them in understanding the parsing algorithms. The tool is capable of visualizing multiple parsing algorithms, and 88% of the students used it to compare the algorithms.



Author information

Authors and Affiliations

Division of Computer Engineering, Netaji Subhas Institute of Technology, New Delhi, 110078, India
Somya Sangal, Shreya Kataria, Twishi Tyagi, Yukti Kirtani, Shivli Agrawal & Pinaki Chakraborty
Department of Computer Science, Hansraj College, Delhi, 110007, India
Nidhi Gupta


Corresponding author

Correspondence to Pinaki Chakraborty.

APPENDIX

1.1 Predictive parsing

Given below is a sample output of the module to visualize the predictive parsing algorithm. We provide the grammar given in Section 2 and the string “(i + i)*i” as input. The grammar is left-recursive but does not have any scope for left-factoring. The module first displays the LL(1) grammar obtained after eliminating left-recursion. The module then displays FIRST sets for the 5 terminals and the 5 nonterminals, and FOLLOW sets for the 5 nonterminals. The module then displays the parsing table. The parsing table has 5 rows, one for each nonterminal, and 6 columns, one for each terminal and the ‘$’ character. A cell in the parsing table either contains a production rule or is empty. Then the module illustrates the table-driven parsing of the string. In each step, the part of the string that has already been matched, the content of the parsing stack, the part of the string yet to be matched and the production rule used in that step are displayed. The process ends with the acceptance of the string when the entire string has been matched and the parsing stack contains only the ‘$’ character. Since the parsing process has resulted in the acceptance of the string, the module displays the parse tree. Since predictive parsing follows leftmost derivation, the module finally displays the leftmost derivation of the string from the grammar start symbol.
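The steps described above (FIRST and FOLLOW sets, the parsing table, then table-driven parsing) can be sketched in a few lines of Python. This is only an illustrative sketch, not PAVT's implementation. It assumes that the Section 2 grammar is the usual expression grammar E → E + T | T, T → T * F | F, F → (E) | i, which is consistent with the counts quoted above but is not reproduced in this appendix. The left-recursion-free form is written out by hand, with the hypothetical names E1 and T1 standing for the two new nonterminals.

```python
EPS, END = "ε", "$"

# Assumed LL(1) grammar after eliminating left recursion
# (E1 and T1 stand for the new nonterminals E' and T').
GRAMMAR = {
    "E": [["T", "E1"]],
    "E1": [["+", "T", "E1"], [EPS]],
    "T": [["F", "T1"]],
    "T1": [["*", "F", "T1"], [EPS]],
    "F": [["(", "E", ")"], ["i"]],
}
START = "E"


def first_of(seq, first):
    """FIRST set of a sequence of grammar symbols."""
    out = set()
    for sym in seq:
        syms = first[sym] if sym in GRAMMAR else {sym}
        out |= syms - {EPS}
        if EPS not in syms:
            return out
    out.add(EPS)  # every symbol in seq can vanish
    return out


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for nt, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of(body, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(END)
    changed = True
    while changed:
        changed = False
        for nt, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue
                    rest = first_of(body[i + 1:], first)
                    new = (rest - {EPS}) | (follow[nt] if EPS in rest else set())
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def build_table(first, follow):
    table = {}
    for nt, bodies in GRAMMAR.items():
        for body in bodies:
            f = first_of(body, first)
            for t in (f - {EPS}) | (follow[nt] if EPS in f else set()):
                assert (nt, t) not in table, "grammar is not LL(1)"
                table[nt, t] = body
    return table


def parse(tokens, table):
    """Return the production rules used (leftmost derivation), or None on error."""
    stack, tokens, pos, rules = [END, START], list(tokens) + [END], 0, []
    while stack:
        top, look = stack.pop(), tokens[pos]
        if top in GRAMMAR:
            body = table.get((top, look))
            if body is None:
                return None
            rules.append((top, body))
            stack.extend(s for s in reversed(body) if s != EPS)
        elif top == look:
            pos += 1  # matched a terminal (or the final '$')
        else:
            return None
    return rules


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
TABLE = build_table(FIRST, FOLLOW)
result = parse("(i+i)*i", TABLE)
print(len(TABLE), "non-empty cells;", "accepted" if result else "rejected")
```

The list returned by parse is the sequence of production rules in the order a leftmost derivation applies them, which corresponds to the last column of the step-by-step trace described above.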

1.2 Simple LR parsing

Given below is the output of the module to visualize the simple LR parsing algorithm for the grammar given in Section 2 and the string “(i + i)*i”. The module first displays the augmented grammar with Z being the new start symbol. The module then displays FIRST sets for the 5 terminals and the 4 nonterminals, and FOLLOW sets for the 4 nonterminals. The module then displays the item sets. The canonical collection contains 12 item sets, viz. I0 to I11. The module then displays the parsing table. The parsing table has 12 rows corresponding to the states constructed from the 12 item sets. The ACTION part of the parsing table has 6 columns, one for each terminal and the ‘$’ character, while the GOTO part of the parsing table has 3 columns, one for each nonterminal except Z. A cell in the ACTION part either specifies the action to be performed or is empty. A cell containing ‘sj’ specifies that state j has to be pushed onto the parsing stack. A cell containing ‘rj’ specifies that the jth production rule has to be used to reduce. A cell in the GOTO part either gives the next state or is empty. The module then illustrates the table-driven parsing of the string. In each step, the content of the parsing stack, the part of the string yet to be matched, the action performed in that step and the production rule used, if the action was to reduce, are displayed. Since the parsing process results in the acceptance of the string, the module displays the parse tree. Note that the parse trees displayed by the modules visualizing predictive parsing and simple LR parsing are different. This is because of the different preprocessing techniques required by the two algorithms. Since simple LR parsing follows rightmost derivation in reverse, the module finally displays the rightmost derivation in reverse order.
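As a rough illustration of the construction just described, the following Python sketch builds the LR(0) item sets, fills the ACTION entries from FOLLOW sets and runs the shift-reduce loop. It is a sketch under stated assumptions rather than PAVT's code. The Section 2 grammar is assumed to be E → E + T | T, T → T * F | F, F → (E) | i. The rule numbering and the state numbering are those of this sketch and need not match the tool's I0 to I11. The FOLLOW sets are written out by hand instead of being computed.

```python
# Assumed augmented grammar; rule 0 is Z -> E, and rules 1..6 are numbered
# in the order used by the 'r' entries of this sketch.
PRODS = [
    ("Z", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("i",)),
]
NONTERMS = {head for head, _ in PRODS}
# FOLLOW sets for this grammar, written out by hand to keep the sketch short.
FOLLOW = {"E": {"+", ")", "$"}, "T": {"+", "*", ")", "$"}, "F": {"+", "*", ")", "$"}}


def closure(items):
    """LR(0) closure; an item is (rule index, dot position)."""
    result, work = set(items), list(items)
    while work:
        p, dot = work.pop()
        body = PRODS[p][1]
        if dot < len(body) and body[dot] in NONTERMS:
            for q, (head, _) in enumerate(PRODS):
                if head == body[dot] and (q, 0) not in result:
                    result.add((q, 0))
                    work.append((q, 0))
    return frozenset(result)


def goto(items, sym):
    return closure({(p, d + 1) for p, d in items
                    if d < len(PRODS[p][1]) and PRODS[p][1][d] == sym})


def canonical_collection():
    states = [closure({(0, 0)})]
    index, edges = {states[0]: 0}, {}
    for state in states:  # the list grows while we iterate, like a queue
        symbols = {PRODS[p][1][d] for p, d in state if d < len(PRODS[p][1])}
        for sym in sorted(symbols):
            target = goto(state, sym)
            if target not in index:
                index[target] = len(states)
                states.append(target)
            edges[index[state], sym] = index[target]
    return states, edges


def slr_action(states, edges):
    action = {}

    def put(key, value):
        assert action.setdefault(key, value) == value, "not SLR(1)"

    for i, state in enumerate(states):
        for p, d in state:
            head, body = PRODS[p]
            if d < len(body):
                if body[d] not in NONTERMS:
                    put((i, body[d]), ("s", edges[i, body[d]]))  # shift
            elif p == 0:
                put((i, "$"), ("acc", 0))
            else:
                for t in FOLLOW[head]:
                    put((i, t), ("r", p))  # reduce by rule p
    return action


def parse(tokens, action, edges):
    """Return the rule numbers used to reduce, in order, or None on error."""
    stack, tokens, pos, reductions = [0], list(tokens) + ["$"], 0, []
    while True:
        kind, arg = action.get((stack[-1], tokens[pos]), ("err", 0))
        if kind == "s":
            stack.append(arg)
            pos += 1
        elif kind == "r":
            head, body = PRODS[arg]
            del stack[len(stack) - len(body):]  # pop one state per body symbol
            stack.append(edges[stack[-1], head])  # GOTO on the rule head
            reductions.append(arg)
        else:
            return reductions if kind == "acc" else None


STATES, EDGES = canonical_collection()
ACTION = slr_action(STATES, EDGES)
print(len(STATES), "item sets; reductions:", parse("(i+i)*i", ACTION, EDGES))
```

Read from last to first, the list of reductions gives a rightmost derivation of the string; read in order, it is that derivation in reverse.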

1.3 Canonical LR parsing

Given below is the output of the module to visualize the canonical LR parsing algorithm for the grammar given in Section 2 and the string “(i + i)*i”. The module first displays the augmented grammar, with Z being the new start symbol, and then FIRST sets for the 5 terminals and the 4 nonterminals. The module then displays the item sets. The canonical collection contains 22 item sets, viz. I0 to I21. The module then displays the parsing table. The parsing table is like the one used for simple LR parsing except that it contains 22 rows. For a sufficiently large grammar, the parsing table for canonical LR parsing can have more rows than the one for simple LR parsing, even by an order of magnitude. The module then illustrates the table-driven parsing of the string. The number of steps in the parsing process is exactly the same as that in the case of simple LR parsing. The module displays the parse tree and the rightmost derivation in reverse order, which are again the same as those in the case of simple LR parsing.

1.4 Look-ahead LR parsing

The output of the module to visualize the look-ahead LR parsing algorithm is similar to that of the module to visualize canonical LR parsing for the same grammar and the same string. However, due to the merging of item sets there are fewer rows in the parsing table. In fact, the number of rows in the parsing table is the same for simple LR parsing and look-ahead LR parsing for a given grammar.
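The effect of merging can be checked with a short Python sketch that builds the canonical LR(1) item sets and then groups them by their LR(0) cores, which is the merging step that look-ahead LR parsing performs. This is a sketch under assumptions, not PAVT's code. The grammar is assumed to be E → E + T | T, T → T * F | F, F → (E) | i, augmented with Z → E. Because that grammar has no ε-rules, the lookahead computation below only inspects the first symbol after the dot, which would not be correct for grammars with ε-rules.

```python
# Assumed augmented grammar; rule 0 is Z -> E. It has no ε-rules, which
# lets FIRST of a symbol string be read off its first symbol alone.
PRODS = [
    ("Z", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("i",)),
]
NONTERMS = {head for head, _ in PRODS}


def first_sets():
    first = {nt: set() for nt in NONTERMS}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for head, body in PRODS:
            new = first[body[0]] if body[0] in NONTERMS else {body[0]}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


FIRST = first_sets()


def closure(items):
    """LR(1) closure; an item is (rule index, dot position, lookahead)."""
    result, work = set(items), list(items)
    while work:
        p, dot, la = work.pop()
        body = PRODS[p][1]
        if dot == len(body) or body[dot] not in NONTERMS:
            continue
        rest = body[dot + 1:]
        if not rest:
            lookaheads = {la}  # nothing follows, so inherit the lookahead
        elif rest[0] in NONTERMS:
            lookaheads = FIRST[rest[0]]
        else:
            lookaheads = {rest[0]}
        for q, (head, _) in enumerate(PRODS):
            if head == body[dot]:
                for b in lookaheads:
                    if (q, 0, b) not in result:
                        result.add((q, 0, b))
                        work.append((q, 0, b))
    return frozenset(result)


def goto(items, sym):
    return closure({(p, d + 1, la) for p, d, la in items
                    if d < len(PRODS[p][1]) and PRODS[p][1][d] == sym})


def lr1_collection():
    states = [closure({(0, 0, "$")})]
    seen = {states[0]}
    for state in states:  # the list grows while we iterate, like a queue
        symbols = {PRODS[p][1][d] for p, d, _ in state if d < len(PRODS[p][1])}
        for sym in sorted(symbols):
            target = goto(state, sym)
            if target not in seen:
                seen.add(target)
                states.append(target)
    return states


def core(state):
    """The LR(0) core of an LR(1) item set: its items minus the lookaheads."""
    return frozenset((p, d) for p, d, _ in state)


LR1_STATES = lr1_collection()
LALR_CORES = {core(state) for state in LR1_STATES}
print(len(LR1_STATES), "canonical LR item sets,", len(LALR_CORES), "after merging")
```

For the assumed grammar the two counts agree with the 22 rows reported for canonical LR parsing and the 12 rows reported for simple LR parsing.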

1.5 Earley parsing

Given below is the output of the module to visualize the Earley parsing algorithm for the grammar given in Section 2 and the string “(i + i)*i”. The module displays the augmented grammar, with Z being the new start symbol. The module then displays the 8 Earley item sets, viz. S[0] to S[7]. Since S[7] contains the item [Z → E., 0], the string is accepted.
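A minimal Earley recognizer makes the acceptance test concrete. The Python sketch below is illustrative only and is not taken from PAVT. It assumes the grammar E → E + T | T, T → T * F | F, F → (E) | i, augmented with Z → E, and it represents an item as a triple (rule index, dot position, origin), so the item [Z → E., 0] is the triple (0, 1, 0). The sketch relies on the grammar having no ε-rules.

```python
# Assumed augmented grammar; rule 0 is Z -> E.
PRODS = [
    ("Z", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("i",)),
]
NONTERMS = {head for head, _ in PRODS}


def earley(tokens):
    """Build Earley item sets S[0..n]; an item is (rule index, dot, origin)."""
    n = len(tokens)
    sets = [[] for _ in range(n + 1)]  # lists keep insertion order

    def add(k, item):
        if item not in sets[k]:
            sets[k].append(item)

    add(0, (0, 0, 0))  # [Z -> .E, 0]
    for k in range(n + 1):
        for p, dot, origin in sets[k]:  # sets[k] may grow during the loop
            head, body = PRODS[p]
            if dot == len(body):  # completer
                for q, d, o in sets[origin]:
                    other = PRODS[q][1]
                    if d < len(other) and other[d] == head:
                        add(k, (q, d + 1, o))
            elif body[dot] in NONTERMS:  # predictor
                for q, (h, _) in enumerate(PRODS):
                    if h == body[dot]:
                        add(k, (q, 0, k))
            elif k < n and tokens[k] == body[dot]:  # scanner
                add(k + 1, (p, dot + 1, origin))
    return sets


def accepts(tokens):
    return (0, 1, 0) in earley(tokens)[-1]  # [Z -> E., 0] in the last set


print(len(earley("(i+i)*i")), "item sets; accepted:", accepts("(i+i)*i"))
```

A string of 7 terminals yields 8 item sets because there is one set before the first terminal and one after each terminal.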

1.6 CYK parsing

Given below is the output of the module to visualize the CYK parsing algorithm for the same grammar and the same string. The module displays the grammar after it has been converted to CNF. As many as 8 new nonterminals have been introduced in the grammar, viz. R, S, U, V, W, X, Y and Z, with Z being the new start symbol. The module then displays the binary matrices. Since there are 7 terminals in the string, 7 binary matrices are printed. A value of 1 in the jth row and kth column of the ith binary matrix means that a string of i terminals can be derived from the jth nonterminal starting at the kth terminal. The string is accepted because the cell corresponding to Z and ‘(’ in the seventh binary matrix has a value of 1. The module also displays the same information using a lower triangular matrix. A nonterminal in the ith row from the bottom and the kth column in this lower triangular matrix can derive a string of i terminals starting with the kth. The presence of Z, the grammar start symbol, in the seventh row from the bottom and first column denotes the acceptance of the string.
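The table-filling step can be sketched in Python as follows. This is an illustration under assumptions, not PAVT's code. The CNF grammar below is a hand conversion of the assumed expression grammar E → E + T | T, T → T * F | F, F → (E) | i. Its helper nonterminals (LP, RP, PLUS, STAR, X1, X2, X3) are hypothetical names, so the rules need not match the R to Z nonterminals that the tool prints. Z is kept as the new start symbol so that the acceptance test reads the same way as in the text.

```python
# A hand-made CNF version of the assumed expression grammar. The names
# (LP, RP, PLUS, STAR, X1..X3) are illustrative, not the ones PAVT prints.
BINARY_RULES = [
    ("Z", ("E", "X1")), ("Z", ("T", "X2")), ("Z", ("LP", "X3")),
    ("E", ("E", "X1")), ("E", ("T", "X2")), ("E", ("LP", "X3")),
    ("T", ("T", "X2")), ("T", ("LP", "X3")),
    ("F", ("LP", "X3")),
    ("X1", ("PLUS", "T")),  # X1 derives "+ T"
    ("X2", ("STAR", "F")),  # X2 derives "* F"
    ("X3", ("E", "RP")),    # X3 derives "E )"
]
TERMINAL_RULES = [
    ("Z", "i"), ("E", "i"), ("T", "i"), ("F", "i"),
    ("LP", "("), ("RP", ")"), ("PLUS", "+"), ("STAR", "*"),
]


def cyk(tokens):
    """table[length][start] = nonterminals deriving tokens[start:start+length]."""
    n = len(tokens)
    table = [[set() for _ in range(n)] for _ in range(n + 1)]
    for k, tok in enumerate(tokens):
        table[1][k] = {head for head, t in TERMINAL_RULES if t == tok}
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            for split in range(1, length):  # try every way to cut the substring
                left = table[split][start]
                right = table[length - split][start + split]
                for head, (b, c) in BINARY_RULES:
                    if b in left and c in right:
                        table[length][start].add(head)
    return table


def show(table):
    """Print the lower triangular matrix, longest substrings in the top row."""
    n = len(table) - 1
    for length in range(n, 0, -1):
        cells = [",".join(sorted(table[length][k])) or "-"
                 for k in range(n - length + 1)]
        print(length, cells)


TABLE = cyk("(i+i)*i")
show(TABLE)
print("accepted:", "Z" in TABLE[7][0])
```

Here TABLE[i][k] plays the role of the ith row from the bottom and the (k + 1)th column of the lower triangular matrix, so the acceptance test looks for Z in TABLE[7][0].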


About this article

Cite this article

Sangal, S., Kataria, S., Tyagi, T. et al. PAVT: a tool to visualize and teach parsing algorithms. Educ Inf Technol 23, 2737–2764 (2018). https://doi.org/10.1007/s10639-018-9739-x


Received: 01 January 2018
Accepted: 08 May 2018
Published: 24 May 2018
Issue Date: November 2018
DOI: https://doi.org/10.1007/s10639-018-9739-x
