Chapter-3-Syntax Analysis
Outline
Introduction
Context free grammar (CFG)
Derivation
Parse tree
Ambiguity
Left recursion
Left factoring
Top-down parsing
• Recursive Descent Parsing (RDP)
• Non-recursive predictive parsing
– First and follow sets
– Construction of a predictive parsing table
LR(1) grammars
Syntax error handling
Error recovery in predictive parsing
Panic mode error recovery strategy
Yacc
Introduction
Syntax: the way in which tokens are put together to
form expressions, statements, or blocks of statements.
The rules governing the formation of statements in a
programming language.
Syntax analysis: the task concerned with fitting a
sequence of tokens into a specified syntax.
Parsing: To break a sentence down into its component
parts with an explanation of the form, function, and
syntactical relationship of each part.
The syntax of a programming language is usually given
by the grammar rules of a context free grammar (CFG).
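Parser
[Figure: position of the parser in the compiler. The source program is read character by character ("get next char") by the lexical analyzer, which hands the next token to the syntax analyzer on each "get next token" request. The syntax analyzer produces the parse tree. Both analyzers use the symbol table (which contains a record for each identifier) and report lexical and syntax errors respectively.]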
Introduction…
The syntax analyzer (parser) checks whether a given
source program satisfies the rules implied by a CFG
or not.
If it satisfies, the parser creates the parse tree of that
program.
Otherwise, the parser gives the error messages.
A CFG:
gives a precise syntactic specification of a
programming language.
A grammar can be directly converted into a parser by
some tools (e.g. yacc).
Introduction…
The parser can be categorized into two groups:
Top-down parser
The parse tree is created top to bottom, starting from
the root to leaves.
Bottom-up parser
The parse tree is created bottom to top, starting from
the leaves to root.
Both top-down and bottom-up parsers scan the input
from left to right (one symbol at a time).
Efficient top-down and bottom-up parsers can be
implemented only for subclasses of context-free
grammars:
LL for top-down parsing
LR for bottom-up parsing
Context free grammar (CFG)
A context-free grammar is a specification for the
syntactic structure of a programming language.
A context-free grammar is a 4-tuple:
G = (T, N, P, S) where
T is a finite set of terminals (a set of tokens)
N is a finite set of non-terminals (syntactic variables)
P is a finite set of productions of the form
A → α, where A ∈ N and α ∈ (T ∪ N)*
S ∈ N is the start symbol
Example: grammar for simple arithmetic expressions
Notational Conventions Used
Terminals:
Lowercase letters early in the alphabet, such as a, b, c.
Operator symbols such as +, *, and so on.
Punctuation symbols such as parentheses, comma, and so
on.
The digits 0,1,. . . ,9.
Strings such as id or if, each of which represents a
single terminal symbol.
Non-terminals:
Uppercase letters early in the alphabet, such as A, B, C.
The letter S is usually the start symbol.
Uppercase letters may be used to represent non-terminals
for the constructs.
• expr, term, and factor are represented by E, T, F
Derivation
A derivation is a sequence of replacements of structure names
by choices on the right hand sides of grammar rules.
Example: E → E + E | E – E | E * E | E / E | -E
E→(E)
E → id
Parse tree
A parse tree is a graphical representation of a
derivation
It filters out the order in which productions are applied
to replace non-terminals.
Parse tree and Derivation
Grammar: E → E + E | E * E | ( E ) | - E | id
Let's examine this derivation:
E ⇒ -E ⇒ -(E) ⇒ -(E + E) ⇒ -(id + id)
[Figure: the parse tree after each derivation step, growing from the single node E to the complete parse tree for -(id + id).]
This is a top-down derivation because we start building the
parse tree at the top.
Exercise
a) Using the grammar below, draw a parse tree for the
following string:
( ( id . id ) id ( id ) ( ( ) ) )
S→E
E → id
|(E.E)
|(L)
|()
L→LE
|E
b) Give a rightmost derivation for the string given in (a).
Ambiguity
A grammar that produces more than one parse tree for a
sentence is called an ambiguous grammar.
Equivalently, an ambiguous grammar
• produces more than one leftmost derivation or
• more than one rightmost derivation for the same
sentence.
Ambiguity: Example
Example: The arithmetic expression grammar
E → E + E | E * E | ( E ) | id
permits two distinct leftmost derivations for the
sentence id + id * id:
(a) (b)
E => E + E E => E * E
=> id + E => E + E * E
=> id + E * E => id + E * E
=> id + id * E => id + id * E
=> id + id * id => id + id * id
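The two derivations can be cross-checked mechanically. The following is a minimal Python sketch that counts the parse trees of a token sequence under the ambiguous grammar E → E + E | E * E | id. The parenthesis production is left out for brevity, and the function name count_parse_trees is illustrative, not part of the course material.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of `tokens` under E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways tokens[lo:hi] can be derived from E.
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        # Try every operator position k as the root of the tree.
        for k in range(lo + 1, hi - 1):
            if tokens[k] in ("+", "*"):
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "*", "id"]))
```

A count greater than one for any sentence is enough to show that the grammar is ambiguous.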
Ambiguity: example
E → E + E | E * E | ( E ) | - E | id
Construct the parse tree for the expression: id + id * id
[Figure: two parse trees for id + id * id. In the first, the root expands to E + E and the right operand expands to E * E. In the second, the root expands to E * E and the left operand expands to E + E.]
Which parse tree is correct?
A grammar that produces more than one parse tree for some
input sentence is said to be an ambiguous grammar.
Elimination of ambiguity
To disambiguate the grammar:
E → E + E | E * E | ( E ) | id
for sentences such as id + id * id, rewrite it with one
non-terminal per precedence level:
E → E + T | T
T → T * F | F
F → ( E ) | id
Left Recursion
Consider the grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
The productions E → E + T and T → T * F are left recursive.
Elimination of Left recursion
A grammar is left recursive if it has a non-terminal A
such that there is a derivation
A ⇒+ Aα for some string α.
Top-down parsing methods cannot handle left-recursive
grammars, so a transformation that eliminates left recursion is
needed.
To eliminate immediate left recursion, the productions
A → Aα | β can be replaced by the non-left-recursive
productions
A → βA’
A’ → αA’ | ε
Elimination of Left recursion…
Applied to the expression grammar, this gives:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id
Elimination of Left recursion…
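Generally, we can eliminate immediate left
recursion among the A-productions by the following technique.
First we group the A-productions as:
A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn
where no βi begins with A. Then we replace them by:
A → β1A’ | β2A’ | … | βnA’
A’ → α1A’ | α2A’ | … | αmA’ | ε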
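The transformation A → Aα | β into A → βA’, A’ → αA’ | ε can be sketched in a few lines of Python. This is only an illustration under stated assumptions: productions are lists of symbols, the empty list stands for ε, the new non-terminal is named by appending a prime, and only immediate left recursion is handled. The function name is hypothetical.

```python
def eliminate_immediate_left_recursion(nt, alternatives):
    """Rewrite the productions of `nt` without immediate left recursion.

    `alternatives` is a list of right-hand sides, each a list of symbols.
    The empty list [] stands for epsilon.
    """
    new_nt = nt + "'"
    # Left-recursive alternatives A -> A alpha: keep only alpha.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nt]
    # Remaining alternatives A -> beta.
    betas = [alt for alt in alternatives if not alt or alt[0] != nt]
    if not alphas:
        return {nt: alternatives}  # nothing to do
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


result = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
print(result)
```

For E → E + T | T the sketch yields E → T E' and E' → + T E' | ε, which agrees with the transformed expression grammar used for predictive parsing.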
Left factoring
When a non-terminal has two or more productions
whose right-hand sides start with the same grammar
symbols (a common prefix), the grammar is not LL(1) and
cannot be used for predictive parsing.
A predictive parser (a top-down parser without
backtracking) requires the grammar to be left-factored.
Left factoring…
Given A → αβ1 | αβ2, when processing α we do not know whether to expand A
to αβ1 or to αβ2. But if we rewrite the grammar as
follows:
A → αA’
A’ → β1 | β2
we can immediately expand A to αA’.
Example:
S → iEtS | iEtSeS | a
E → b
Left factored, this grammar becomes:
S → iEtSS’ | a
S’ → eS | ε
E → b
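One left-factoring step can be sketched in Python as follows. The sketch assumes productions are lists of symbols with [] standing for ε, groups alternatives by their first symbol, and factors out the longest common prefix of each group. The names common_prefix and left_factor_once are illustrative, and a full algorithm would repeat the step until no common prefixes remain.

```python
def common_prefix(seqs):
    """Longest common prefix of a list of symbol lists."""
    prefix = []
    for symbols in zip(*seqs):
        if all(s == symbols[0] for s in symbols):
            prefix.append(symbols[0])
        else:
            break
    return prefix


def left_factor_once(nt, alternatives):
    """Apply one round of left factoring to the productions of `nt`."""
    groups = {}
    for alt in alternatives:
        key = alt[0] if alt else None  # group by first symbol
        groups.setdefault(key, []).append(alt)
    result = {nt: []}
    new_nt = nt + "'"
    for key, alts in groups.items():
        if key is None or len(alts) == 1:
            result[nt].extend(alts)  # no common prefix to factor
            continue
        prefix = common_prefix(alts)
        result[nt].append(prefix + [new_nt])
        result[new_nt] = [alt[len(prefix):] for alt in alts]
        new_nt += "'"  # fresh name if another group needs factoring
    return result


result = left_factor_once(
    "S", [["i", "E", "t", "S"], ["i", "E", "t", "S", "e", "S"], ["a"]]
)
print(result)
```

For the dangling-else grammar this produces S → i E t S S' | a and S' → ε | e S.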
Syntax analysis
Every language has rules that prescribe the syntactic
structure of well formed programs.
The syntax can be described using Context Free
Grammars (CFG) notation.
RDP…
Example: G: S → cAd
A → ab | a
Draw the parse tree for the input string cad using
the above method.
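A recursive descent parser with backtracking for this grammar can be sketched in Python. The sketch is a recognizer only (it does not build the parse tree), uses one function per non-terminal, and models backtracking with generators that yield every input position at which a non-terminal can end. The function names are illustrative.

```python
def parse_A(s, pos):
    """Yield each position where A -> ab | a can end, starting at pos."""
    if s.startswith("ab", pos):  # first alternative: A -> a b
        yield pos + 2
    if s.startswith("a", pos):   # fallback alternative: A -> a
        yield pos + 1


def parse_S(s, pos):
    """Yield each position where S -> c A d can end, starting at pos."""
    if s.startswith("c", pos):
        for after_a in parse_A(s, pos + 1):
            # If 'd' does not follow, the loop backtracks to the
            # next alternative of A.
            if s.startswith("d", after_a):
                yield after_a + 1


def accepts(s):
    """True if the whole string s is derivable from S."""
    return any(end == len(s) for end in parse_S(s, 0))


print(accepts("cad"))
```

On input cad the alternative A → ab does not match, so the parser falls back to A → a and then matches d.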
Exercise
Using the grammar below, draw a parse tree for the
following string using RDP algorithm:
( ( id . id ) id ( id ) ( ( ) ) )
S→E
E → id
|(E.E)
|(L)
|()
L→LE
|E
Non-recursive predictive parsing
It is possible to build a non-recursive parser by explicitly
maintaining a stack.
This method uses a parsing table that determines the
next production to be applied.
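[Figure: model of a table-driven predictive parser. The predictive parsing program reads the INPUT buffer (id + id * id $), maintains a STACK that initially holds E on top of $, consults the PARSING TABLE indexed by non-terminal and input symbol, and writes the productions it applies to the OUTPUT. The cases x = a = $, x = a ≠ $ and "X is a non-terminal" label the moves of the program.]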
Non-recursive predictive parsing…
The input buffer contains the string to be parsed
followed by $ (the right end marker)
The stack contains a sequence of grammar symbols
with $ at the bottom.
Initially, the stack contains the start symbol of the
grammar followed by $.
The parsing table is a two dimensional array M[A, a]
where A is a non-terminal of the grammar and a is a
terminal or $.
The parser program behaves as follows.
The program always considers
X, the symbol on top of the stack and
a, the current input symbol.
Predictive Parsing…
There are three possibilities:
1. X = a = $ : the parser halts and announces a successful
completion of parsing
2. X = a ≠ $ : the parser pops X off the stack and advances
the input pointer to the next symbol
3. X is a non-terminal : the program consults entry M[X, a]
which can be an X-production or an error entry.
If M[X, a] = {X → uvw}, X on top of the stack will be
replaced by uvw (u at the top of the stack).
As an output, any code associated with the X-production
can be executed.
If M[X, a] = error, the parser calls the error recovery
method.
Predictive Parsing algorithm
set ip to point to the first symbol of w;
set X to the top stack symbol;
while ( X ≠ $ ) { /* stack is not empty */
let a be the symbol pointed to by ip;
if ( X is a ) pop the stack and advance ip;
else if ( X is a terminal ) error();
else if ( M[X, a] is an error entry ) error();
else if ( M[X, a] = X → Y1Y2 … Yk ) {
output the production X → Y1Y2 … Yk;
pop the stack;
push Yk, Yk-1, . . . , Y1 onto the stack, with Y1 on top;
}
set X to the top stack symbol;
}
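The algorithm can be sketched as a small Python program. The table below is an assumption in the sense that it is typed in by hand for the expression grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id, and the name predictive_parse is illustrative. Error handling simply raises SyntaxError instead of calling a recovery routine.

```python
EPS = ()  # empty right-hand side (epsilon)

# Hand-written LL(1) table for the expression grammar: M[X, a] -> rhs.
TABLE = {
    ("E", "id"): ("T", "E'"), ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", ")"): EPS, ("E'", "$"): EPS,
    ("T", "id"): ("F", "T'"), ("T", "("): ("F", "T'"),
    ("T'", "+"): EPS, ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): EPS, ("T'", "$"): EPS,
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def predictive_parse(tokens, start="E"):
    """Return the list of productions applied, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = ["$", start]  # top of the stack is the end of the list
    ip = 0
    output = []
    while stack[-1] != "$":
        top, a = stack[-1], tokens[ip]
        if top == a:                      # terminal matches the input
            stack.pop()
            ip += 1
        elif top not in NONTERMINALS:     # terminal mismatch
            raise SyntaxError("expected %s, found %s" % (top, a))
        elif (top, a) not in TABLE:       # error entry
            raise SyntaxError("no entry M[%s, %s]" % (top, a))
        else:
            rhs = TABLE[(top, a)]
            output.append((top, rhs))
            stack.pop()
            stack.extend(reversed(rhs))   # leftmost symbol ends on top
    if tokens[ip] != "$":
        raise SyntaxError("unexpected input after end of sentence")
    return output


for head, rhs in predictive_parse(["id", "+", "id", "*", "id"]):
    print(head, "->", " ".join(rhs) or "ε")
```

The printed productions form a leftmost derivation of the input.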
A Predictive Parser table
Grammar:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id
Predictive Parsing Simulation
INPUT: id + id * id $
PARSING TABLE (blank entries are errors):
NON-TERMINAL  id          +           *           (           )           $
E             E → TE’                             E → TE’
E’                        E’ → +TE’                           E’ → ε      E’ → ε
T             T → FT’                             T → FT’
T’                        T’ → ε      T’ → *FT’               T’ → ε      T’ → ε
F             F → id                              F → (E)
[Figure: animation frames of the predictive parsing program on this input. Starting with E on top of $ on the STACK, each frame shows the stack after one move and the part of the parse tree produced so far as OUTPUT.]
When Top(Stack) = input = $ the parser halts and accepts the
input string.
Non-recursive predictive parsing…
Example: G:
E → TR
R → +TR
R → -TR
R → ε
T → 0 | 1 | … | 9
Input: 1+2
[Table: parsing table M[X, a] with one row per non-terminal X and columns a = 0, 1, …, 9, +, -, $]
FIRST and FOLLOW
Construction of a predictive parsing table
FIRST
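First(α) = set of terminals that begin the strings
derived from α.
If α ⇒* ε (in zero or more steps), ε is in First(α).
First(X), where X is a grammar symbol, can be found
using the following rules:
1- If X is a terminal, then First(X) = {X}
2- If X is a non-terminal: two cases
a) If X → ε is a production, add ε to First(X)
b) If X → Y1Y2 … Yk is a production, add First(Y1) except ε
to First(X); if ε is in First(Y1), continue with Y2, and so on;
if ε is in First(Yi) for every i, add ε to First(X)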
Construction of a predictive parsing table…
FOLLOW
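Follow(A) = set of terminals that can appear
immediately to the right of A in some sentential
form.
1- Place $ in Follow(S), where S is the start symbol.
2- If there is a production B → αAβ, everything in
First(β) except ε is in Follow(A).
3- If there is a production B → αA, or B → αAβ where
First(β) contains ε, everything in Follow(B) is in Follow(A).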
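FIRST and FOLLOW are usually computed by repeating the rules until no set changes. The following Python sketch does this for the left-factored expression grammar. The representation is an assumption (a dict from non-terminal to a list of right-hand sides, with [] for ε), and the function names are illustrative.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}  # terminal
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can derive epsilon
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                new = first_of_string(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, alts in grammar.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue  # FOLLOW is defined for non-terminals
                    rest = first_of_string(alt[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:  # what follows can vanish
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, FIRST, "E")
for nt in GRAMMAR:
    print(nt, sorted(FIRST[nt]), sorted(FOLLOW[nt]))
```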
Exercise
Consider the following grammar over the alphabet
{ g,h,i,b}
A → BCD
B → bB | ε
C → Cg | g | Ch | i
D → AB | ε
Fill in the table below with the FIRST and FOLLOW sets for
the non-terminals in this grammar:
FIRST FOLLOW
A
B
C
D
Construction of predictive parsing table
Input: Grammar G
Output: Parsing table M
For each production of the form A → α of the
grammar do:
• For each terminal a in First(α), add A → α to
M[A, a]
• If ε ∈ First(α), add A → α to M[A, b] for each
terminal b in Follow(A)
• If ε ∈ First(α) and $ ∈ Follow(A), add A → α to
M[A, $]
Finally, make each undefined entry of M an error.
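A minimal Python sketch of this construction is given below. To keep it self-contained, the FIRST and FOLLOW sets are hard-coded from the values given for the expression grammar in this chapter rather than computed. The table is a dict keyed by (non-terminal, terminal), missing keys play the role of error entries, and the function names are illustrative.

```python
EPS = "ε"

PRODUCTIONS = [
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]), ("E'", []),
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]
# FIRST and FOLLOW sets as listed for this grammar (hard-coded).
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", "$", ")"}, "T'": {"+", "$", ")"},
    "F": {"*", "+", "$", ")"},
}


def first_of_string(symbols):
    """FIRST of a right-hand side, using the hard-coded FIRST sets."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # a terminal is its own FIRST
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(productions):
    table = {}
    for head, body in productions:
        first = first_of_string(body)
        lookahead = first - {EPS}
        if EPS in first:
            # FOLLOW(head) already contains $ when it applies.
            lookahead |= FOLLOW[head]
        for a in lookahead:
            if (head, a) in table:
                raise ValueError("not LL(1): conflict at M[%s, %s]" % (head, a))
            table[(head, a)] = body
    return table


M = build_table(PRODUCTIONS)
for (head, a), body in sorted(M.items()):
    print("M[%s, %s] = %s -> %s" % (head, a, head, " ".join(body) or EPS))
```

A ValueError from build_table means two productions compete for the same entry, that is, the grammar is not LL(1).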
Non-recursive predictive parsing…
Exercise 1:
Consider the following grammar G. Construct the
predictive parsing table and parse the input string:
id + id * id
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id
FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
FIRST(E’) = {+, ε}
FIRST(T’) = {*, ε}
FOLLOW(E) = FOLLOW(E’) = {$, )}
FOLLOW(T) = FOLLOW(T’) = {+, $, )}
FOLLOW(F) = {*, +, $, )}
Exercises
3. Given the following grammar:
program → procedure STMT-LIST
STMT-LIST → STMT STMT-LIST | STMT
STMT → do VAR = CONST to CONST begin STMT-LIST end
| ASSN-STMT
Show the parse tree for the following code fragment:
procedure
do i=1 to 100 begin
ASSN-STMT
ASSN-STMT
end
ASSN-STMT
Bottom-Up and Top-Down
Parsers
Top-down parsers:
• Start constructing the parse tree at the top (root) of the
tree and move down towards the leaves.
• Easy to implement by hand, but work with restricted
grammars.
example: predictive parsers
Bottom-up parsers:
• Build the nodes at the bottom of the parse tree first.
• Suitable for automatic parser generation; handle a larger
class of grammars.
examples: shift-reduce parsers (or LR(k) parsers)
Bottom-Up Parser
A bottom-up parser, or a shift-reduce parser, begins
at the leaves and works up to the top of the tree.
Consider the grammar:
S → aABe
A → Abc | b
B → d
Bottom-Up Parser: Simulation
Productions: S → aABe, A → Abc, A → b, B → d
INPUT: a b b c d e $
Step  Sentential form   Handle    Reduction
1     a b b c d e $     b         A → b
2     a A b c d e $     A b c     A → Abc
3     a A d e $         d         B → d
4     a A B e $         a A B e   S → aABe
5     S $                         accept
[Figure: animation frames of the bottom-up parsing program. After each reduction the OUTPUT shows the parse tree built so far, from the leaves up to the root S.]
The sequence of reductions traces a rightmost derivation in reverse.
Stack implementation of shift/reduce
parsing
In LR parsing the two major problems are:
locate the substring that is to be reduced
locate the production to use
Example: An example of the operations of a
shift/reduce parser
G: E → E + E | E * E | (E) | id
Conflict during shift/reduce parsing
Grammars for which we can construct an LR(k)
parsing table are called LR(k) grammars.
Most of the grammars that are used in practice are
LR(1).
There are two types of conflicts in shift/reduce
parsing:
shift/reduce conflict: when we have a situation
where the parser knows the entire stack content and
the next k symbols but cannot decide whether it
should shift or reduce (typical of ambiguous grammars).
reduce/reduce conflict: when the parser cannot
decide which of several productions it should use
for a reduction.
Example: with the productions
E → T
E → id
T → id
and an id on the top of the stack, the parser cannot decide
whether to reduce by E → id or by T → id.
LR parser
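[Figure: model of an LR parser. The LR parsing program reads the input a1 … ai … an $, keeps a stack S0 … Xm-1 Sm-1 Xm Sm with the state Sm on top, consults the parsing table (ACTION and GOTO parts), and produces the output.]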
LR parser…
The LR(k) stack stores strings of the form: S0 X1 S1 X2 S2 …
Xm Sm where
• Si is a symbol called a state that summarizes the
information contained in the stack below it
• Sm is the state on top of the stack
• Xi is a grammar symbol
The parser program decides the next step by using:
• the top of the stack (Sm),
• the input symbol (ai), and
• the parsing table, which has two parts: action and goto.
It does so by consulting the entry ACTION[Sm , ai] in the parsing
action table.
Structure of the LR Parsing Table
The parsing table consists of two parts:
• a parsing-action function ACTION and
• a goto function GOTO.
The ACTION function takes as arguments a state i and a
terminal a (or $, the input endmarker).
The value of ACTION[i, a] can have one of four forms:
Shift j, where j is a state. The parser shifts
input a onto the stack, but uses state j to represent a.
Reduce A → β. The parser reduces β on the top of the
stack to head A.
Accept. The parser accepts the input and finishes parsing.
Error. The parser discovers an error.
The GOTO function, defined on sets of items, also maps states to states:
if GOTO[Ii, A] = Ij, then GOTO maps a state i and a non-terminal A to
state j.
LR parser configuration
A configuration of an LR parser describes the complete state
of the parser.
It is a pair (stack contents, remaining input):
(S0 X1 S1 X2 S2… Xm Sm , ai ai+1 … an $)
This configuration represents the right-sentential form
X1 X2 … Xm ai ai+1 … an
Behavior of LR parser…
2. ACTION[Sm, ai] = reduce A → β: the parser pops the top 2r
symbols off the stack (r states and r grammar symbols), where
r = |β|. At this point Sm-r is the state on top of the stack.
The parser then pushes A and the state S = GOTO[Sm-r, A],
entering the configuration
(S0 X1 S1 X2 S2 … Xm-r Sm-r A S, ai ai+1 … an $)
LR-parsing algorithm.
let a be the first symbol of w$;
while(1) { /* repeat forever */
let s be the state on top of the stack;
if ( ACTION[s, a] = shift t ) {
push t onto the stack;
let a be the next input symbol;
} else if ( ACTION[s, a] = reduce A → β ) {
pop |β| symbols off the stack;
let state t now be on top of the stack;
push GOTO[t, A] onto the stack;
output the production A → β;
} else if ( ACTION[s, a] = accept ) break; /* parsing is done */
else call error-recovery routine;
}
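The driver loop can be sketched in Python. As in the pseudocode, the stack holds states only. The ACTION and GOTO dictionaries are typed in by hand from the SLR table for the grammar (1) E → E + T, (2) E → T, (3) T → T * F, (4) T → F, (5) F → (E), (6) F → id; their encoding as strings such as "s5" and "r2" and the name lr_parse are choices made for this sketch.

```python
# Production number -> (head, length of right-hand side).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

REDUCE = lambda n: {"+": n, "*": n, ")": n, "$": n}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: REDUCE("r4"),
    4: {"id": "s5", "(": "s4"},
    5: REDUCE("r6"),
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: REDUCE("r3"),
    11: REDUCE("r5"),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
        (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
        (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}


def lr_parse(tokens):
    """Return the production numbers used, in order of reduction."""
    tokens = list(tokens) + ["$"]
    stack = [0]  # stack of states, initial state 0
    pos = 0
    output = []
    while True:
        state, a = stack[-1], tokens[pos]
        act = ACTION[state].get(a)
        if act is None:                      # blank entry: error
            raise SyntaxError("unexpected %s in state %d" % (a, state))
        if act == "acc":
            return output
        if act[0] == "s":                    # shift t
            stack.append(int(act[1:]))
            pos += 1
        else:                                # reduce by production n
            number = int(act[1:])
            head, length = PRODUCTIONS[number]
            del stack[len(stack) - length:]  # pop |beta| states
            stack.append(GOTO[(stack[-1], head)])
            output.append(number)


print(lr_parse(["id", "*", "id", "+", "id"]))
```

Read backwards, the returned production numbers give a rightmost derivation of the input.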
Parsing table (action part: columns id to $; goto part: columns E, T, F)
State   id    +    *    (    )    $        E    T    F
0       S5              S4                 1    2    3
1             S6                  accept
2             R2   S7        R2   R2
3             R4   R4        R4   R4
4       S5              S4                 8    2    3
5             R6   R6        R6   R6
6       S5              S4                      9    3
7       S5              S4                           10
8             S6             S11
9             R1   S7        R1   R1
10            R3   R3        R3   R3
11            R5   R5        R5   R5
Legend: Si means shift to state i,
Rj means reduce by production j
LR parser…
Example: The following example shows how a shift/reduce parser
parses an input string w = id * id + id using the parsing table shown
above.
LR Parser: Simulation
GRAMMAR:
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → ( E )
(6) F → id
INPUT: id * id + id $
Step  STACK             INPUT             ACTION
1     0                 id * id + id $    shift 5
2     0 id 5            * id + id $       reduce by (6) F → id, goto[0, F] = 3
3     0 F 3             * id + id $       reduce by (4) T → F, goto[0, T] = 2
4     0 T 2             * id + id $       shift 7
5     0 T 2 * 7         id + id $         shift 5
6     0 T 2 * 7 id 5    + id $            reduce by (6) F → id, goto[7, F] = 10
7     0 T 2 * 7 F 10    + id $            reduce by (3) T → T * F, goto[0, T] = 2
8     0 T 2             + id $            reduce by (2) E → T, goto[0, E] = 1
9     0 E 1             + id $            shift 6
10    0 E 1 + 6         id $              shift 5
11    0 E 1 + 6 id 5    $                 reduce by (6) F → id, goto[6, F] = 3
12    0 E 1 + 6 F 3     $                 reduce by (4) T → F, goto[6, T] = 9
13    0 E 1 + 6 T 9     $                 reduce by (1) E → E + T, goto[0, E] = 1
14    0 E 1             $                 accept
OUTPUT: the parse tree is built bottom-up, one node per reduction, in the order 6, 4, 6, 3, 2, 6, 4, 1.
Constructing SLR parsing tables
This method is the simplest of the three methods
used to construct an LR parsing table.
It is called SLR (simple LR) because it is the
easiest to implement.
However, it is also the weakest in terms of the
number of grammars for which it succeeds.
A parsing table constructed by this method is
called SLR table.
A grammar for which an SLR table can be
constructed is said to be an SLR grammar.
Constructing SLR parsing tables…
LR (0) item
An LR (0) item (item for short) is a production of a
grammar G with a dot at some position of the right
side.
For example, for the production A → X Y Z we have
four items:
A → .XYZ
A → X.YZ
A → XY.Z
A → XYZ.
For the production A → ε we only have one item:
A → .
Constructing SLR parsing tables…
An item indicates what is the part of a production that
we have seen and what we hope to see.
The central idea in the SLR method is to construct,
from the grammar, a deterministic finite automaton to
recognize viable prefixes.
A viable prefix is a prefix of a right sentential form
that can appear on the stack of a shift/reduce parser.
• If you have a viable prefix in the stack it is possible
to have inputs that will reduce to the start symbol.
• If you don’t have a viable prefix on top of the stack
you can never reach the start symbol; therefore you
have to call the error recovery procedure.
Constructing SLR parsing tables…
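The closure operation
If I is a set of items for a grammar G, then Closure (I) is
the set of items constructed from I by two rules:
1. Initially, every item in I is added to Closure (I).
2. If [A → α.Bβ] is in Closure (I) and B → γ is a production,
then add the item [B → .γ] to Closure (I) if it is not already
there. Apply this rule until no more items can be added.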
Constructing SLR parsing tables…
Example G1’:
E’ → E
E → E + T
E → T
T → T * F
T → F
F → (E)
F → id
I = {[E’ → .E]}
Closure (I) = {[E’ → .E], [E → .E + T], [E → .T],
[T → .T * F], [T → .F], [F → .(E)], [F → .id]}
Constructing SLR parsing tables…
The Goto operation
The second useful function is Goto (I, X) where I is a
set of items and X is a grammar symbol.
Goto (I, X) is defined as the closure of the set of all items
[A → αX.β] such that [A → α.Xβ] is in I.
Example:
I = {[E’ → E.], [E → E . + T]}
Then
Goto (I, +) = {[E → E + . T], [T → .T * F], [T → .F],
[F → .(E)], [F → .id]}
Constructing SLR parsing tables…
The set of Items construction
Below is given an algorithm to construct C, the
canonical collection of sets of LR (0) items for
augmented grammar G’.
Procedure Items (G’);
Begin
C := {Closure ({[S’ → . S]})}
Repeat
For each set of items I in C and each grammar symbol X such
that Goto (I, X) is not empty and not in C do
Add Goto (I, X) to C;
Until no more sets of items can be added to C
End
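The closure, goto and items operations can be sketched in Python for the augmented expression grammar G1'. In this sketch an item is a tuple (head, body, dot position) and a set of items is a frozenset; these representations and the function names are assumptions made for illustration. Symbols are tried in a fixed order so that the sets come out numbered I0 to I11.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}
SYMBOLS = ["E", "T", "F", "+", "*", "(", ")", "id"]


def closure(items):
    """Add [B -> .gamma] for every non-terminal B right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for h, b in GRAMMAR:
                if h == body[dot] and (h, b, 0) not in result:
                    result.add((h, b, 0))
                    work.append((h, b, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` and take the closure."""
    moved = {(h, b, d + 1) for h, b, d in items
             if d < len(b) and b[d] == symbol}
    return closure(moved)


def canonical_collection():
    collection = [closure({("E'", ("E",), 0)})]
    for item_set in collection:  # the list grows while we iterate
        for symbol in SYMBOLS:
            target = goto(item_set, symbol)
            if target and target not in collection:
                collection.append(target)
    return collection


C = canonical_collection()
print(len(C), "sets of items")
```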
Constructing SLR parsing tables…
Example: Construction of the set of Items for the
augmented grammar above G1’.
I0 = {[E’ → .E], [E → .E + T], [E → .T], [T → .T * F],
[T → .F], [F → .(E)], [F → .id]}
I1 = Goto (I0, E) = {[E’ → E.], [E → E. + T]}
I2 = Goto (I0, T) = {[E → T.], [T → T. * F]}
I3 = Goto (I0, F) = {[T → F.]}
I4 = Goto (I0, () = {[F → (.E)], [E → .E + T], [E → .T],
[T → .T * F], [T → .F], [F → .(E)], [F → .id]}
I5 = Goto (I0, id) = {[F → id.]}
I6 = Goto (I1, +) = {[E → E + . T], [T → .T * F], [T → .F],
[F → .(E)], [F → .id]}
I7 = Goto (I2, *) = {[T → T * . F], [F → .(E)],
[F → .id]}
I8 = Goto (I4, E) = {[F → (E.)], [E → E . + T]}
Goto (I4, T) = {[E → T.], [T → T. * F]} = I2;
Goto (I4, F) = {[T → F.]} = I3;
Goto (I4, () = I4;
Goto (I4, id) = I5;
I9 = Goto (I6, T) = {[E → E + T.], [T → T . * F]}
Goto (I6, F) = I3;
Goto (I6, () = I4;
Goto (I6, id) = I5;
I10 = Goto (I7, F) = {[T → T * F.]}
Goto (I7, () = I4;
Goto (I7, id) = I5;
I11 = Goto (I8, )) = {[F → (E).]}
Goto (I8, +) = I6;
Goto (I9, *) = I7;
LR(0) automaton
[Figure: the LR(0) automaton for G1’, together with the actions of a shift/reduce parser on input id * id.]
SLR table construction algorithm
1. Construct C = {I0, I1, ......, IN}, the collection of
sets of LR (0) items for G’.
2. State i is constructed from Ii and
a) If [A → α.aβ] is in Ii and Goto (Ii, a) = Ij (a is a
terminal) then action [i, a] = shift j
b) If [A → α.] is in Ii then action [i, a] = reduce A → α for
every a in Follow (A), for A ≠ S’
c) If [S’ → S.] is in Ii then action [i, $] = accept.
SLR table construction method…
Example: Construct the SLR parsing table for the
grammar G1’
Follow (E) = {+, ), $}
Follow (T) = {+, ), $, *}
Follow (F) = {+, ), $, *}
E’ → E
1 E → E + T
2 E → T
3 T → T * F
4 T → F
5 F → (E)
6 F → id
By following the method we find the parsing table
used earlier.
Parsing table (action part: columns id to $; goto part: columns E, T, F)
State   id    +    *    (    )    $        E    T    F
0       S5              S4                 1    2    3
1             S6                  accept
2             R2   S7        R2   R2
3             R4   R4        R4   R4
4       S5              S4                 8    2    3
5             R6   R6        R6   R6
6       S5              S4                      9    3
7       S5              S4                           10
8             S6             S11
9             R1   S7        R1   R1
10            R3   R3        R3   R3
11            R5   R5        R5   R5
Legend: Si means shift to state i,
Rj means reduce by production j
SLR parsing table
Exercise: Construct the SLR parsing table for
the following grammar: /* Grammar G2’ */
S’ → S
S → L = R
S → R
L → *R
L → id
R → L
Answer
C = {I0, I1, I2, I3, I4, I5, I6, I7, I8, I9}
I0 = {[S’ → .S], [S → .L = R], [S → .R], [L → .*R],
[L → .id], [R → .L]}
I1 = goto (I0, S) = {[S’ → S.]}
I2 = goto (I0, L) = {[S → L . = R], [R → L . ]}
I3 = goto (I0, R) = {[S → R . ]}
I4 = goto (I0, *) = {[L → * . R], [L → .*R], [L → .id],
[R → .L]}
I5 = goto (I0, id) = {[L → id . ]}
I6 = goto (I2, =) = {[S → L = . R], [R → . L ], [L → .*R],
[L → .id]}
I7 = goto (I4, R) = {[L → * R . ]}
I8 = goto (I4, L) = {[R → L . ]}
goto (I4, *) = I4
goto (I4, id) = I5
I9 = goto (I6, R) = {[S → L = R .]}
goto (I6, L) = I8
goto (I6, *) = I4
goto (I6, id) = I5
Follow (S) = {$} Follow (R) = {$, =} Follow (L) = {$, =}
We have a shift/reduce conflict in state 2: = is in Follow (R)
and [R → L.] is in I2, which calls for a reduce on =, while
[S → L . = R] is in I2 and Goto (I2, =) = I6, which calls for a shift.
Every SLR(1) grammar is unambiguous, but there are many
unambiguous grammars that are not SLR(1).
G2’ is not an ambiguous grammar. However, it is not SLR. This
is because the SLR parser is not powerful enough to remember
enough left context to decide whether to shift or reduce when
it sees an =.
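The conflict can be found mechanically by filling the action table with the SLR rules and reporting every entry that receives more than one action. The Python sketch below does this for G2'. The item representation (head, body, dot position), the hard-coded Follow sets taken from the answer above, and the function names are assumptions of this sketch. States are numbered in order of discovery, which gives the same numbers as above for I0 to I6 (only I7 and I8 come out swapped).

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")), ("S", ("R",)),
    ("L", ("*", "R")), ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {h for h, _ in GRAMMAR}
SYMBOLS = ["S", "L", "R", "=", "*", "id"]
FOLLOW = {"S": {"$"}, "L": {"$", "="}, "R": {"$", "="}}  # hard-coded


def closure(items):
    result, work = set(items), list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for h, b in GRAMMAR:
                if h == body[dot] and (h, b, 0) not in result:
                    result.add((h, b, 0))
                    work.append((h, b, 0))
    return frozenset(result)


def goto(items, symbol):
    return closure({(h, b, d + 1) for h, b, d in items
                    if d < len(b) and b[d] == symbol})


def slr_conflicts():
    states = [closure({("S'", ("S",), 0)})]
    for item_set in states:  # canonical collection of LR(0) items
        for symbol in SYMBOLS:
            target = goto(item_set, symbol)
            if target and target not in states:
                states.append(target)
    conflicts = []
    for i, item_set in enumerate(states):
        actions = {}  # terminal -> set of actions proposed for it
        for head, body, dot in item_set:
            if dot < len(body):
                a = body[dot]
                if a not in NONTERMINALS:      # rule (a): shift
                    j = states.index(goto(item_set, a))
                    actions.setdefault(a, set()).add(("shift", j))
            elif head == "S'":                 # rule (c): accept
                actions.setdefault("$", set()).add(("accept",))
            else:                              # rule (b): reduce
                for a in FOLLOW[head]:
                    actions.setdefault(a, set()).add(("reduce", head, body))
        for a, acts in actions.items():
            if len(acts) > 1:
                conflicts.append((i, a, acts))
    return conflicts


for state, symbol, acts in slr_conflicts():
    print("conflict in state", state, "on", symbol, ":", sorted(acts))
```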
LR parsing: Exercise
Given the following Grammar:
(1) S → A
(2) S → B
(3) A → a A b
(4) A → 0
(5) B → a B b b
(6) B → 1
Construct the SLR parsing table.
Write the actions of an LR parser for the following string:
aa1bbbb