0% found this document useful (0 votes)

29 views

Compiler Design Unit 2

Uploaded by

Priya Rana

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

29 views

Compiler Design Unit 2

Uploaded by

Priya Rana

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 24

lOMoARcPSD|20951282

Compiler Design Unit-2

Computer science engineering (I. K. Gujral Punjab Technical University)

Scan to open on Studocu

Studocu is not sponsored or endorsed by any college or university

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

UNIT –II
Syntax Analysis-:The Role of a parser, Context free Grammars, Writing A grammar, top down parsing
bottom up parsing, Introduction to Lr Parser.

UNIT -2

SYNTAX ANALYSIS

2.1 ROLE OF THE PARSER

Parser obtains a string of tokens from the lexical analyzer and verifies that it can be generated
by the language for the source program. The parser should report any syntax errors in an
intelligible fashion. The two types of parsers employed are:
1.Top down parser: which build parse trees from top(root) to bottom(leaves)
2.Bottom up parser: which build parse trees from leaves and work up the root.
Therefore there are two types of parsing methods– top-down parsing and bottom-up parsing

2.2 Context free Grammars(CFG)

CFG is used to specify the syntax of a language. A grammar naturally describes the
hierarchical structure of most program-ming language constructs.

Formal Definition of Grammars

A context-free grammar has four components:
1. A set of terminal symbols, sometimes referred to as "tokens." The terminals are the elementary
symbols of the language defined by the grammar.
2. A set of nonterminals, sometimes called "syntactic variables." Each non-terminal represents a set of
strings of terminals, in a manner we shall describe.
3. A set of productions, where each production consists of a nonterminal, called the head or left side
of the production, an arrow, and a sequence of terminals and1or nonterminals, called
the body or right side of the production. The intuitive intent of a production is to specify one of the
written forms of a construct; if the head nonterminal represents a construct, then the body
represents a written form of the construct.

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

4. A designation of one of the nonterminals as the start symbol.
Production is for a nonterminal if the nonterminal is the head of the production. A string of
terminals is a sequence of zero or more terminals. The string of zero terminals, written as E , is called
the empty string.
Derivations
A grammar derives strings by beginning with the start symbol and repeatedly replacing a nonterminal
by the body of a production for that nonterminal. The terminal strings that can be derived from the start
symbol form the language defined by the grammar.

Leftmost and Rightmost Derivation of a String

• Leftmost derivation − A leftmost derivation is obtained by applying production to the leftmost

variable in each step.
• Rightmost derivation − A rightmost derivation is obtained by applying production to the
rightmost variable in each step.
• Example
Let any set of production rules in a CFG be
X → X+X | X*X |X| a
over an alphabet {a}.
The leftmost derivation for the string "a+a*a" is
X → X+X → a+X → a + X*X → a+a*X → a+a*a
The rightmost derivation for the above string "a+a*a" is
X → X*X → X*a → X+X*a → X+a*a → a+a*a

Derivation or Yield of a Tree

The derivation or the yield of a parse tree is the final string obtained by concatenating the labels of the
leaves of the tree from left to right, ignoring the Nulls. However, if all the leaves are Null, derivation is
Null.
parse tree pictorially shows how the start symbol of a grammar derives a string in the language. If
nonterminal A has a production A → XYZ , then a parse tree may have an interior node labeled A with
three children labeled X, Y, and Z, from left to right:

Given a context-free grammar, a parse tree according to the grammar is a tree with the following
properties:
1. The root is labeled by the start symbol.
2. Each leaf is labeled by a terminal or by e.
3. Each interior node is labeled by a nonterminal
JSVG Krishna, Associate Professor.
Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

If A is the nonterminal labeling some interior node and X I , Xz, . . . ,Xn are the labels of the children
of that node from left to right, then there must be a production A → X1X2 . . Xn . Here, X1,
X2,. . . , Xn, each stand for a symbol that is either a terminal or a nonterminal. As a special case,
if A → c is a production, then a node labeled A may have a single child labeled E

Ambiguity
A grammar can have more than one parse tree generating a given string of terminals. Such a grammar is
said to be ambiguous. To show that a grammar is ambiguous, all we need to do is find a terminal string
that is the yield of more than one parse tree. Since a string with more than one parse tree usually has
more than one meaning, we need to design unambiguous grammars for compiling applications, or to use
ambiguous grammars with additional rules to resolve the ambiguities.
Example :: Suppose we used a single nonterminal string and did not distinguish between digits and
lists,

Fig. shows that an expression like 9-5+2 has more than one parse tree with this grammar. The two trees
for 9-5+2 correspond to the two ways of parenthesizing the expression: (9-5) +2 and 9- (5+2) . This
second parenthesization gives the expression the unexpected value 2 rather than the customary value 6.

Two parse trees for 9-5+2

Verifying the language generated by a grammar

The set of all strings that can be derived from a grammar is said to be the language generated from that
grammar. A language generated by a grammar G is a subset formally defined by
L(G)={W|W ∈ ∑*, S ⇒G W}
If L(G1) = L(G2), the Grammar G1 is equivalent to the Grammar G2.

Example
If there is a grammar
G: N = {S, A, B} T = {a, b} P = {S → AB, A → a, B → b}
Here S produces AB, and we can replace A by a, and B by b. Here, the only accepted string is ab, i.e.,
L(G) = {ab}

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

2.3 Writing a grammar

A grammar consists of a number of productions. Each production has an abstract symbol
called a nonterminal as its left-hand side, and a sequence of one or more nonterminal
and terminal symbols as its right-hand side. For each grammar, the terminal symbols are drawn from a
specified alphabet.
Starting from a sentence consisting of a single distinguished nonterminal, called
the goal symbol, a given context-free grammar specifies a language, namely, the set of
possible sequences of terminal symbols that can result from repeatedly replacing any nonterminal in
the sequence with a right-hand side of a production for which the nonterminal is the left-hand side.
There are four categories in writing a grammar :
1. Lexical Vs Syntax Analysis
2. Eliminating ambiguous grammar.
3. Eliminating left-recursion
4. Left-factoring.
Each parsing method can handle grammars only of a certain form hence, the initial grammar may have
to be rewritten to make it parsable

1. Lexical Vs Syntax Analysis

Reasons for using the regular expression to define the lexical syntax of a language

a) Regular expressions provide a more concise and easier to understand notation for tokens than
grammars.
b) The lexical rules of a language are simple and to describe them, we donot need notation as
powerful as grammars.
c) Efficient lexical analyzers can be constructed automatically from RE than from grammars.
d) Separating the syntactic structure of a language into lexical and nonlexical parts provides a
convenient way of modularizing the front end into two manageable-sized components.

2. Eliminating ambiguous grammar.

Ambiguity of the grammar that produces more than one parse tree for leftmost or rightmost
derivation can be eliminated by re-writing the grammar.
Consider this example,
G: stmt→if expr then stmt
|if expr then stmt else stmt
|other
This grammar is ambiguous since the string if E1 then if E2 then S1 else S2 has the following two
parse trees for leftmost derivation

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

Two parse trees for an ambiguous sentence

The general rule is “Match each else with the closest unmatched then.This disambiguating rule can be
used directly in the grammar,

To eliminate ambiguity, the following grammar may be used:

stmt→matched | unmatchedstmt
matched→if expr stmt then matched else matchedstmt | other
unmatched→ if expr then stmt | if expr then matched else unmatchedstmt

3. Eliminating left-recursion

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

Because we try to generate a leftmost derivation by scanning the input from left to right, grammars of the
form A → A x may cause endless recursion.Such grammars are called left-recursive and they must be
transformed if we want to use a top-down parser.
◼ A grammar is left recursive if for a non-terminal A, there is a derivation A+ A

◼ To eliminate direct left recursion replace

1) A → A | with A’ →  A’|

2) A → A1 | A2 | ... | Am | 1 | 2 | ... | n

with
A → 1B | 2B | ... | nB
B → 1B | 2B | ... | mB | 
4. Left-factoring

Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive
parsing. When it is not clear which of two alternative productions to use to expand a non-terminal A, we
can rewrite the A-productions to defer the decision until we have seen enough of the input to make the
right choice.
◼ Consider S → if E then S else S | if E then S
◼ Which of the two productions should we use to expand non-terminal S when the next
token is if?
We can solve this problem by factoring out the common part in these rules.

A → 1 | 2 |...| n | 

becomes
A → B| 
B → 1 | 2 |...| n

Consider the grammar , G : S → iEtS | iEtSeS | a

E→b
Left factored, this grammar becomes
S → iEtSS’ | a
S’ → eS |ε
E→b

2.4 PARSING
It is the process of analyzing a continuous stream of input in order to determine its grammatical
structure with respect to a given formal grammar.
Types of parsing:

1. Top down parsing

2. Bottom up parsing

Top-down parsing : A parser can start with the start symbol and try to transform it to the
input string. Example : LL Parsers.

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

Bottom-up parsing : A parser can start with input and attempt to rewrite it into the start symbol.
Example : LR Parsers.

2.5 TOP-DOWN PARSING

It can be viewed as an attempt to find a left-most derivation for an input string or an

attempt to construct a parse tree for the input starting from the root to the leaves.

Types of TOP-DOWN PARSING

1. Recursive descent parsing
2. Predictive parsing

2.5.1 RECURSIVE DESCENT PARSING

Recursive descent is a top-down parsing technique that constructs the parse tree from the top and the
input is read from left to right. It uses procedures for every non-terminal entity. This parsing technique
recursively parses the input to make a parse tree, which may or may not require back-tracking. But the
grammar associated with it (if not left factored) cannot avoid back-tracking. A form of recursive-
descent parsing that does not require any back-tracking is known as predictive parsing.
This parsing technique is regarded recursive as it uses context-free grammar which is recursive in
nature.
This parsing method may involve backtracking.
Example for :backtracking
Consider the grammar G : S → cAd
A→ab|a
and the input string w=cad.
The parse tree can be constructed using the following top-down approach :
Step1:
Initially create a tree with single node labeled S. An input pointer points to ‘c’, the first symbol of w.
Expand the tree with the production of S.

Step2:
The leftmost leaf ‘c’ matches the first symbol of w, so advance the input pointer to the second symbol
of w ‘a’ and consider the next leaf ‘A’. Expand A using the first alternative.
JSVG Krishna, Associate Professor.
Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

Step3:
The second symbol ‘a’ of w also matches with second leaf of tree. So advance the input pointer to third
symbol of w ‘d’. But the third leaf of tree is b which does not match with the input symbol d. Hence
discard the chosen production and reset the pointer to second backtracking.
Step4: Now try the second alternative for A.

Now we can halt and announce the successful completion of parsing.

2.5.2 Predictive parsing

It is possible to build a nonrecursive predictive parser by maintaining a stack explicitly, rather than
implicitly via recursive calls. The key problem during predictive parsing is that of determining the
production to be applied for a nonterminal . The nonrecursive parser in figure looks up the production to
be applied in parsing table. In what follows, we shall see how the table can be constructed directly from
certain grammars.

A table-driven predictive parser has an input buffer, a stack, a parsing table, and an output
stream. The input buffer contains the string to be parsed, followed by $, a symbol used as a right
endmarker to indicate the end of the input string. The stack contains a sequence of grammar symbols
JSVG Krishna, Associate Professor.
Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

with $ on the bottom, indicating the bottom of the stack. Initially, the stack contains the start symbol of
the grammar on top of $. The parsing table is a two dimensional array M[A,a] where A is a nonterminal,
and a is a terminal or the symbol $. The parser is controlled by a program that behaves as follows. The
program considers X, the symbol on the top of the stack, and a, the current input symbol. These two
symbols determine the action of the parser. There are three possibilities.

1 If X= a=$, the parser halts and announces successful completion of parsing.

2 If X=a!=$, the parser pops X off the stack and advances the input pointer to the next input symbol.
3 If X is a nonterminal, the program consults entry M[X,a] of the parsing table M. This entry will be
either an X-production of the grammar or an error entry. If, for example, M[X,a]={X- >UVW}, the
parser replaces X on top of the stack by WVU( with U on top). As output, we shall assume that the
parser just prints the production used; any other code could be executed here. If M[X,a]=error, the
parser calls an error recovery routine

Implementation of predictive parser:

1. Elimination of left recursion, left factoring and ambiguous grammar.
2. Construct FIRST() and FOLLOW() for all non-terminals.
3. Construct predictive parsing table.
4. Parse the given input string using stack and parsing table

Algorithm for Nonrecursive predictive parsing.

Input. A string w and a parsing table M for grammar G.

Output. If w is in L(G), a leftmost derivation of w; otherwise, an error indication.

Method. Initially, the parser is in a configuration in which it has $S on the stack with S, the start symbol
of G on top, and w$ in the input buffer. The program that utilizes the predictive parsing table M to
produce a parse for the input is shown in Fig.

set ip to point to the first symbol of w$. repeat

let X be the top stack symbol and a the symbol pointed to by ip. if X is a terminal of $ then
if X=a then
pop X from the stack and advance ip else error()
else
if M[X,a]=X->Y1Y2...Yk then begin pop X from the stack;
push Yk,Yk-1...Y1 onto the stack, with Y1 on top; output the production X-> Y1Y2...Yk
end
else error()
until X=$ /* stack is empty */

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

2.5.3 FIRST AND FOLLOW

The construction of a predictive parsing table is aided by two functions associated with a grammar:
1. FIRST
2. FOLLOW

To compute FIRST(X) for all grammar symbols X, apply the following rules until no more
terminals or e can be added to any FIRST set.

Rules for FIRST ( ):

1. If X is terminal, then FIRST(X) is {X}.
2. If X → ε is a production, then add ε to FIRST(X).
3. If X is non-terminal and X → aα is a production then add a to FIRST(X).
4. If X is non-terminal and X → Y1 Y2…Yk is a production, then place a in FIRST(X) if for
some i, a is in FIRST(Yi), and ε is in all of FIRST(Y1),…,FIRST(Yi-1);that is,
Y1,….Yi-1=> ε. If ε is in FIRST(Yj) for all j=1,2,..,k, then add ε to FIRST(X).

Rules for FOLLOW ( ):

1. If S is a start symbol, then FOLLOW(S) contains $.
2. If there is a production A → αBβ, then everything in FIRST(β) except ε is placed in
follow(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains
ε, then everything in FOLLOW(A) is in FOLLOW(B).

2.5.4 Algorithm for construction of predictive parsing table:

Input : Grammar G
Output : Parsing table M
Method :
1. For each production A → α of the grammar, do steps 2 and 3.
2. For each terminal a in FIRST(α), add A → α to M[A, a].
3. If ε is in FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A). If ε is in
FIRST(α) and $ is in FOLLOW(A) , add A → α to M[A, $].
4. Make each undefined entry of M be error.

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

Example:
Consider the following grammar :
E→E+T|T
T→T*F|F
F→(E)|id
After eliminating left-recursion the grammar is
E →TE’
E’ → +TE’ | ε
T →FT’
T’ → *FT’ | ε
F → (E)|id

First( ) :
FIRST(E) = { ( , id}
FIRST(E’) ={+ , ε }
FIRST(T) = { ( , id}
FIRST(T’) = {*, ε }
FIRST(F) = { ( , id }

Follow( ):
FOLLOW(E) = { $, ) }
FOLLOW(E’) = { $, ) }
FOLLOW(T) = { +, $, ) }
FOLLOW(T’) = { +, $, ) }
FOLLOW(F) = {+, * , $ , ) }

Predictive parsing Table

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

Stack Implementation

- 24 -

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

2.5.5.LL(1) GRAMMAR
The parsing table algorithm can be applied to any grammar G to produce a parsing table M.
For some Grammars, for example if G is left recursive or ambiguous, then M will have at
least one multiply-defined entry. A grammar whose parsing table has no multiply defined
entries is said to be LL(1). It can be shown that the above algorithm can be used to produce
for every LL(1) grammar G, a parsing table M that parses all and only the sentences of G.
LL(1) grammars have several distinctive properties. No ambiguous or left recursive grammar
can be LL(1). There remains a question of what should be done in case of multiply defined
entries. One easy solution is to eliminate all left recursion and left factoring, hoping to
produce a grammar which will produce no multiply defined entries in the parse tables.
Unfortunately there are some grammars which will give an LL(1) grammar after any kind of
alteration. In general, there are no universal rules to convert multiply defined entries into
single valued entries without affecting the language recognized by the parser.
The main difficulty in using predictive parsing is in writing a grammar for the source
language such that a predictive parser can be constructed from the grammar. Although left
recursion elimination and left factoring are easy to do, they make the resulting grammar hard
to read and difficult to use the translation purposes. To alleviate some of this difficulty, a
common organization for a parser in a compiler is to use a predictive parser for control
constructs and to use operator precedence for expressions.however, if an lr parser generator
is available, one can get all the benefits of predictive parsing and operator precedence
automatically.

2.5.6.ERROR RECOVERY IN PREDICTIVE PARSING

The stack of a nonrecursive predictive parser makes explicit the terminals and nonterminals
that the parser hopes to match with the remainder of the input. We shall therefore refer to
symbols on the parser stack in the following discussion. An error is detected during
predictive parsing when the terminal on top of the stack does not match the next input
symbol or when nonterminal A is on top of the stack, a is the next input symbol, and the
parsing table entry M[A,a] is empty.

Consider error recovery predictive parsing using the following two methods
Panic-Mode recovery
Phrase Level recovery

Panic-mode error recovery is based on the idea of skipping symbols on the input until a
token in a selected set of synchronizing tokens appears. Its effectiveness depends on the
choice of synchronizing set. The sets should be chosen so that the parser recovers quickly
from errors that are likely to occur in practice.
As a starting point, we can place all symbols in FOLLOW(A) into the synchronizing set for
nonterminal A. If we skip tokens until an element of FOLLOW(A) is seen and pop A from
the stack, it is likely that parsing can continue.
It is not enough to use FOLLOW(A) as the synchronizingset for A. Fo example , if
semicolons terminate statements, as in C, then keywords that begin statements may not
appear in the FOLLOW set of the nonterminal generating expressions. A missing semicolon
after an assignment may therefore result in the keyword beginning the next statement being
skipped. Often, there is a hierarchica structure on constructs in a language; e.g., expressions
appear within statement, which appear within bblocks,and so on. We can add to the

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

synchronizing set of a lower construct the symbols that begin higher constructs. For
example, we might add keywords that begin statements to the synchronizing sets for the
nonterminals generaitn expressions.

If we add symbols in FIRST(A) to the synchronizing set for nonterminal A, then it

may be possible to resume parsing according to A if a symbol in FIRST(A) appears in the
input.

If a nonterminal can generate the empty string, then the production deriving e can be
used as a default. Doing so may postpone some error detection, but cannot cause an error
to be missed. This approach reduces the number of nonterminals that have to be considered
during error recovery.

If a terminal on top of the stack cannot be matched, a simple idea is to pop the
terminal, issue a message saying that the terminal was inserted, and continue parsing. In
effect, this approach takes the synchronizing set of a token to consist of all other tokens.

Phrase Level recovery

This involves, defining the blank entries in the table with pointers to some error routines
which may
Change, delete or insert symbols in the input or
May also pop symbols from the stack

2.6 BOTTOM-UP PARSING

Constructing a parse tree for an input string beginning at the leaves and going towards the root is
called bottom-up parsing. A general type of bottom-up parser is a shift-reduce parser.

2.6.1 SHIFT-REDUCE PARSING

Shift-reduce parsing is a type of bottom-up parsing that attempts to construct a parse tree for
an input string beginning at the leaves (the bottom) and working up towards the root (the
top).
Example:
Consider the grammar: S → aABe
A → Abc | b
B→d
The sentence to be recognized is abbcde.

REDUCTION (LEFTMOST) RIGHTMOST DERIVATION

abbcde (A → b) S → aABe
aAbcde(A → Abc) → aAde
aAde (B → d) → aAbcde
aABe (S → aABe) → abbcde
S
The reductions trace out the right-most derivation in reverse.

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

Handles: A handle of a string is a substring that matches the right side of a production, and
whose reduction to the non-terminal on the left side of the production represents one step
along the reverse of a rightmost derivation.
Example:
Consider the grammar:
E→E+E
E→E*E
E→(E)
E→id

And the input string id1+id2*id3

The rightmost derivation is :

E→E+E
→ E+E*E
→ E+E*id3
→ E+id2*id3
→ id1+id2*
In the above derivation the underlined substrings are called handles.

Handle pruning:
A rightmost derivation in reverse can be obtained by “handle pruning”. (i.e.) if w is a sentence
or string of the grammar at hand, then w = γn, where γn is the nth right sentential form of
some rightmost derivation.
Actions in shift-reduce parser:
• shift - The next input symbol is shifted onto the top of the stack.
• reduce - The parser replaces the handle within a stack with a non-terminal.
• accept - The parser announces successful completion of parsing.
• error - The parser discovers that a syntax error has occurred and calls an error recovery routine.

Conflicts in shift-reduce parsing:

There are two conflicts that occur in shift-reduce parsing:
1. Shift-reduce conflict: The parser cannot decide whether to shift or to reduce.
2. Reduce-reduce conflict: The parser cannot decide which of several reductions to make.

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

Stack implementation of shift-reduce parsing :

1. Shift-reduce conflict:
Example:
Consider the grammar:
E→E+E | E*E | id and input id+id*id

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

2. Reduce-reduce conflict:
Consider the grammar: M→R+R|R+c|R
R→c
input c+c

INTRODUCTION TO LR PARSERS
An efficient bottom-up syntax analysis technique that can be used CFG is called LR(k) parsing.
The ‘L’ is for left-to-right scanning of the input, the ‘R’ for constructing a rightmost derivation in
reverse, and the ‘k’ for the number of input symbols. When ‘k’ is omitted, it is assumed to be 1.

Advantages of LR parsing:
1. It recognizes virtually all programming language constructs for which CFG can be written.
2. It is an efficient non-backtracking shift-reduce parsing method.
3.A grammar that can be parsed using LR method is a proper superset of a grammar that
can be parsed with predictive parser
4.It detects a syntactic error as soon as possible.

Drawbacks of LR method:
It is too much of work to construct a LR parser by hand for a programming language grammar.
A specialized tool, called a LR parser generator, is needed. Example: YACC.
Types of LR parsing method:
1. SLR- Simple LR
Easiest to implement, least powerful.
2. CLR- Canonical LR
Most powerful, most expensive.
3. LALR- Look-Ahead LR
Intermediate in size and cost between the other two methods.

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

The LR parsing algorithm:

The schematic form of an LR parser is as follows:

It consists of an input, an output, a stack, a driver program, and a pa parts (action and goto).

a. The driver program is the same for all LR parser.

b. The parsing program reads characters from an input buffer one at a time.
c. The program uses a stack to store a string of the form s0X1s1X2s2…Xmsm, where sm is
on top. Each Xi is a grammar symbol and each si is a state.
d. The parsing table consists of two parts : action and goto functions.

Action : The parsing program determines sm, the state currently on top of stack, and ai, the current
input symbol. It then consults action[sm,ai] in the action table which can have one of four values:
1. shift s, where s is a state,
2. reduce by a grammar production A → β,
3. accept,
4. Error.
Goto : The function goto takes a state and grammar symbol as arguments and produces a state.

LR Parsing algorithm:
Input: An input string w and an LR parsing table with functions action and goto for grammar G.
Output: If w is in L(G), a bottom-up-parse for w; otherwise, an error indication.
Method: Initially, the parser has s0 on its stack, where s0 is the initial state, and w$ in the input buffer.

The parser then executes the following program:

JSVG Krishna, Associate Professor.
Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

set ip to point to the first input symbol of w$; repeat forever begin
let s be the state on top of the stack and a the symbol pointed to by ip;
if action[s, a] = shift s’ then begin
push a then s’ on top of the stack; advance ip to the next input symbol end
else if action[s, a] = reduce A→β then begin pop 2* | β | symbols off the stack;
let s’ be the state now on top of the stack; push A then goto[s’, A] on top of the stack; output the
production A→ β
end
else if action[s, a] = accept then
return
else error( )
end

CONSTRUCTING SLR(1) PARSING TABLE

To perform SLR parsing, take grammar as input and do the following:
1. Find LR(0) items.
2. Completing the closure.
3. Compute goto(I,X), where, I is set of items and X is grammar symbol.
LR(0) items:

An LR(0) item of a grammar G is a production of G with a dot at some position of the right side. For
example, production A → XYZ yields the four items :
A→.XYZ
A → X . YZ
A → XY . Z
A → XYZ .

Closure operation:
If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I by
the two rules:
1. Initially, every item in I is added to closure(I).
2. If A → α . Bβ is in closure(I) and B → γ is a production, then add the item B → . γ to I , if it
is not already there. We apply this rule until no more new items can be added to closure(I).

Goto operation:

Goto(I, X) is defined to be the closure of the set of all items [A→ αX . β] such that [A→ α . Xβ] is in I.
Steps to construct SLR parsing table for grammar G are:
1. Augment G and produce G’
2. Construct the canonical collection of set of items C for G’
3. Construct the parsing action function action and goto using the following algorithm that requires
FOLLOW(A) for each non-terminal of grammar.

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

Algorithm for construction of SLR parsing table:
Input : An augmented grammar G’
Output : The SLR parsing table functions action and goto for G’
Method :
1. Construct C = {I0, I1, …. In}, the collection of sets of LR(0) items for G’.
2. State i is constructed from Ii.. The parsing functions for state i are determined as
follows:
(a) If [A→α∙aβ] is in Ii and goto(Ii,a) = Ij, then set action[i,a] to “shift j”. Here a must be
terminal.
(b) If [A→α∙] is in Ii , then set action[i,a] to “reduce A→α” for all a in FOLLOW(A).
(c) If [S’→S.] is in Ii, then set action[i,$] to “accept”.
If any conflicting actions are generated by the above rules, we say grammar is not SLR(1).
3. The goto transitions for state i are constructed for all non-term
If goto(Ii,A) = Ij, then goto[i,A] = j.
4. All entries not defined by rules (2) and (3) are made “error”
5. The initial state of the parser is the one constructed from the [S’→.S].

Example on SLR ( 1 ) Grammar

S→E
E→E+T|T
T→T*F|F
F → id

Add Augment Production and insert '•' symbol at the first position for every production in G

S` → •E
E → •E + T
E → •T
T → •T * F
T → •F
F → •id

I0 State:

Add Augment production to the I0 State and Compute the Closure

I0 = Closure (S` → •E)

Add all productions starting with E in to I0 State because "." is followed by the non-terminal. So, the I0
State becomes

I0 = S` → •E
E → •E + T
E → •T

Add all productions starting with T and F in modified I0 State because "." is followed by the non-
terminal. So, the I0 State becomes.
JSVG Krishna, Associate Professor.
Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

I0= S` → •E
E → •E + T
E → •T
T → •T * F
T → •F
F → •id

I1= Go to (I0, E) = closure (S` → E•, E → E• + T)

I2= Go to (I0, T) = closure (E → T•T, T• → * F)
I3= Go to (I0, F) = Closure ( T → F• ) = T → F•
I4= Go to (I0, id) = closure ( F → id•) = F → id•
I5= Go to (I1, +) = Closure (E → E +•T)

Add all productions starting with T and F in I5 State because "." is followed by the non-terminal. So, the
I5 State becomes

I5 = E → E +•T
T → •T * F
T → •F
F → •id

Go to (I5, F) = Closure (T → F•) = (same as I3)

Go to (I5, id) = Closure (F → id•) = (same as I4)

I6= Go to (I2, ) = Closure (T → T •F)

Add all productions starting with F in I6 State because "." is followed by the non-terminal. So, the I6
State becomes

I6 = T → T * •F
F → •id

Go to (I6, id) = Closure (F → id•) = (same as I4)

I7= Go to (I5, T) = Closure (E → E + T•) = E → E + T•

I8= Go to (I6, F) = Closure (T → T * F•) = T → T * F•

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

Drawing DFA

SLR (1) Table

Explanation:

First (E) = First (E + T) ∪ First (T)

First (T) = First (T * F) ∪ First (F)
First (F) = {id}
First (T) = {id}
First (E) = {id}
Follow (E) = First (+T) ∪ {$} = {+, $}
Follow (T) = First (*F) ∪ First (F)
= {*, +, $}
Follow (F) = {*, +, $}

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)
lOMoARcPSD|20951282

CSE Dept.,Sir CRR COE.

1) I1 contains the final item which drives S → E• and follow (S) = {$}, so action {I1, $} = Accept
2) I2 contains the final item which drives E → T• and follow (E) = {+, $}, so action {I2, +} = R2,
action {I2, $} = R2
3) I3 contains the final item which drives T → F• and follow (T) = {+, *, $}, so action {I3, +} =
R4, action {I3, *} = R4, action {I3, $} = R4
4) I4 contains the final item which drives F → id• and follow (F) = {+, *, $}, so action {I4, +} =
R5, action {I4, *} = R5, action {I4, $} = R5
5) I7 contains the final item which drives E → E + T• and follow (E) = {+, $}, so action {I7, +} =
R1, action {I7, $} = R1
6) I8 contains the final item which drives T → T * F• and follow (T) = {+, *, $}, so action {I8, +}
= R3, action {I8, *} = R3, action {I8, $} = R3.

- 27 -

JSVG Krishna, Associate Professor.

Downloaded by Priya Rana (prinkit2002@gmail.com)

CS3304 9 LanguageSyntax 2 PDF
No ratings yet
CS3304 9 LanguageSyntax 2 PDF
39 pages
Compiler 2
No ratings yet
Compiler 2
32 pages
Lecture05-Syntax Analysis-CFG
No ratings yet
Lecture05-Syntax Analysis-CFG
19 pages
Unit 2
No ratings yet
Unit 2
45 pages
Unit 2
No ratings yet
Unit 2
29 pages
Unit 2 Compiler
No ratings yet
Unit 2 Compiler
42 pages
CSC412 Compiler Construction I March 24 2022 NOUN-pages-3
No ratings yet
CSC412 Compiler Construction I March 24 2022 NOUN-pages-3
150 pages
Unit4 Notes
No ratings yet
Unit4 Notes
32 pages
2 SimpleOnePassCompiler
No ratings yet
2 SimpleOnePassCompiler
66 pages
3-Module 2 - Role of Parser - Parse Tree-02-08-2024
No ratings yet
3-Module 2 - Role of Parser - Parse Tree-02-08-2024
76 pages
Unit-2 PCD
No ratings yet
Unit-2 PCD
36 pages
Syntax Analysis
No ratings yet
Syntax Analysis
47 pages
compiler_design- Module3
No ratings yet
compiler_design- Module3
19 pages
1 Syntax Analyzer
No ratings yet
1 Syntax Analyzer
33 pages
Compiler Construction CS-4207: Instructor Name: Atif Ishaq
No ratings yet
Compiler Construction CS-4207: Instructor Name: Atif Ishaq
19 pages
Morphological Parsing
No ratings yet
Morphological Parsing
19 pages
Chapter 2 (Part 1)
No ratings yet
Chapter 2 (Part 1)
32 pages
1 Syntax Analyzer
No ratings yet
1 Syntax Analyzer
33 pages
Compiler Design - Syntax Analysis
No ratings yet
Compiler Design - Syntax Analysis
6 pages
Unit - II CD
No ratings yet
Unit - II CD
38 pages
COMPILER DESIGN UNIT 2
No ratings yet
COMPILER DESIGN UNIT 2
44 pages
Syntax Analysis
No ratings yet
Syntax Analysis
27 pages
Syntax Analysis
No ratings yet
Syntax Analysis
58 pages
Module 3
No ratings yet
Module 3
56 pages
04+syntax+analysis+(1)
No ratings yet
04+syntax+analysis+(1)
43 pages
II. Parser: Syntax Analysis
No ratings yet
II. Parser: Syntax Analysis
18 pages
Module 3 Ss and CD Lecture Notes 18cs61
No ratings yet
Module 3 Ss and CD Lecture Notes 18cs61
15 pages
Compiler Design Lecture Notes
No ratings yet
Compiler Design Lecture Notes
37 pages
Ch3_Syntax Analysis
No ratings yet
Ch3_Syntax Analysis
96 pages
CD Unit 2
No ratings yet
CD Unit 2
19 pages
Chapter – 3
No ratings yet
Chapter – 3
46 pages
Class Three
No ratings yet
Class Three
74 pages
Compiler Design - Syntax Analysis
No ratings yet
Compiler Design - Syntax Analysis
14 pages
Compiler Design Module 2 Notes 2022-23 02-04-2023 Modified
No ratings yet
Compiler Design Module 2 Notes 2022-23 02-04-2023 Modified
46 pages
Parser
No ratings yet
Parser
4 pages
CSC 305: Programming Paradigm: Introduction To Language, Syntax and Semantics
No ratings yet
CSC 305: Programming Paradigm: Introduction To Language, Syntax and Semantics
38 pages
CD Unit-2 (R20)
No ratings yet
CD Unit-2 (R20)
38 pages
Chapter 4
No ratings yet
Chapter 4
23 pages
CC 3
No ratings yet
CC 3
29 pages
Lecture 4 PDF
No ratings yet
Lecture 4 PDF
28 pages
Unit 2 Notes
No ratings yet
Unit 2 Notes
43 pages
Ch3_Syntax Analysis
No ratings yet
Ch3_Syntax Analysis
96 pages
CD Unit 2
No ratings yet
CD Unit 2
19 pages
ATCD - 21CS51 - M3 - Savitha T
No ratings yet
ATCD - 21CS51 - M3 - Savitha T
56 pages
Figure 1two Parse Trees For 9-5+2
No ratings yet
Figure 1two Parse Trees For 9-5+2
3 pages
2 Syntax Analysis - Introduction
No ratings yet
2 Syntax Analysis - Introduction
8 pages
2014-CD Ch-03 SAn
No ratings yet
2014-CD Ch-03 SAn
21 pages
2. Simple Syntax Directed Translation
No ratings yet
2. Simple Syntax Directed Translation
51 pages
Unit Iii
No ratings yet
Unit Iii
28 pages
CD Unit 2
100% (1)
CD Unit 2
20 pages
Lecture 03
No ratings yet
Lecture 03
7 pages
Compiler Design Unit II-1
No ratings yet
Compiler Design Unit II-1
46 pages
Compiler CFG Slides of PowerPoint
No ratings yet
Compiler CFG Slides of PowerPoint
66 pages
Unit - 2
No ratings yet
Unit - 2
38 pages
Compiler Design Chapter-3
0% (1)
Compiler Design Chapter-3
177 pages
CS602PC - Compiler Design Lecture Notes Unit 2
No ratings yet
CS602PC - Compiler Design Lecture Notes Unit 2
42 pages
Simple and Operator Grammar
No ratings yet
Simple and Operator Grammar
19 pages
SPCC Oral Questions
No ratings yet
SPCC Oral Questions
10 pages
Wa0005.
No ratings yet
Wa0005.
42 pages
CH-3 Syntax Analyzer
No ratings yet
CH-3 Syntax Analyzer
41 pages
Introduction to Algorithms
From Everand
Introduction to Algorithms
S VASIST
No ratings yet
Compiler Design Unit 5
100% (1)
Compiler Design Unit 5
28 pages
Flat Module 3
No ratings yet
Flat Module 3
18 pages
CN Module4
No ratings yet
CN Module4
13 pages
FLAT Lords Paper
No ratings yet
FLAT Lords Paper
60 pages
CSE HTML and DHTML
No ratings yet
CSE HTML and DHTML
5 pages
CSE HTML Document
No ratings yet
CSE HTML Document
4 pages
CD
No ratings yet
CD
148 pages
Theory of Computation and Compiler Design: Module - 4
No ratings yet
Theory of Computation and Compiler Design: Module - 4
31 pages
Theory of Automata, Languages & Computation: June 2010
No ratings yet
Theory of Automata, Languages & Computation: June 2010
20 pages
Chapter 3 - Syntax Analysis Part One
No ratings yet
Chapter 3 - Syntax Analysis Part One
17 pages
Recursive Descent Parser
No ratings yet
Recursive Descent Parser
16 pages
Reference
No ratings yet
Reference
55 pages
Module 2
No ratings yet
Module 2
19 pages
Compiler
No ratings yet
Compiler
52 pages
Lec 05
No ratings yet
Lec 05
22 pages
Unit-2 2.1. Review of CFG Ambiguity of Grammars 2.1.1. Limitations of Regular Language
No ratings yet
Unit-2 2.1. Review of CFG Ambiguity of Grammars 2.1.1. Limitations of Regular Language
44 pages
Ss&Os Lab Master Manual2020-21 - Ajiet
No ratings yet
Ss&Os Lab Master Manual2020-21 - Ajiet
69 pages
CD All Units
No ratings yet
CD All Units
117 pages
CSE18R274 - Compiler Design
No ratings yet
CSE18R274 - Compiler Design
117 pages
Compiler Design Material
No ratings yet
Compiler Design Material
107 pages
Course Files
100% (1)
Course Files
33 pages
Subject: Compiler Design Code: T141 Unit-I
No ratings yet
Subject: Compiler Design Code: T141 Unit-I
6 pages
Recursive Descent Parsing
No ratings yet
Recursive Descent Parsing
3 pages
Compiler Design Lab
No ratings yet
Compiler Design Lab
53 pages
10-11-12-13-Top Down Parser
No ratings yet
10-11-12-13-Top Down Parser
76 pages
Predictive Parsing and LL (1) - Compiler Design - Dr. D. P. Sharma - NITK Surathkal by Wahid311
100% (2)
Predictive Parsing and LL (1) - Compiler Design - Dr. D. P. Sharma - NITK Surathkal by Wahid311
56 pages
One Line Ques Ans1
100% (1)
One Line Ques Ans1
11 pages
Principles of Compiler Design PDF
0% (1)
Principles of Compiler Design PDF
177 pages
Session 3
No ratings yet
Session 3
18 pages
Compilef Design Unit 2 AKTU As Per 2023-24 Syllabus
No ratings yet
Compilef Design Unit 2 AKTU As Per 2023-24 Syllabus
46 pages
Compiler Design 1
100% (1)
Compiler Design 1
30 pages
Recursive Descent Parsers
No ratings yet
Recursive Descent Parsers
3 pages
CD Unit-Ii
No ratings yet
CD Unit-Ii
37 pages
CS606 MIDTERM SOLVED MCQS by JUNAID
No ratings yet
CS606 MIDTERM SOLVED MCQS by JUNAID
51 pages