Syntax Analysis

By:
Trusha R. Patel
Asst. Prof.
CE Dept., CSPIT, CHARUSAT
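Role of Parser

[Figure: Source program → Lexical Analyzer (Scanner) → token → Syntax Analyzer (Parser) → parse tree → rest of front end. The parser requests each token from the scanner with getNextToken; both the scanner and the parser access the symbol table.]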
CFG (Context Free Grammar)
• A CFG consists of terminals, nonterminals, a start symbol, and productions

• Terminals
  - Basic symbols from which strings are formed
  - “Token name” is a synonym for “terminal”

• Nonterminals
  - Syntactic variables that denote sets of strings

• Start symbol
  - One nonterminal that is distinguished from the others
  - The set of strings it denotes is the language generated by the grammar
  - Its productions are listed first

• Productions
  - Specify the manner in which the terminals and nonterminals can be combined to form strings
  - A production consists of
      a nonterminal called the “head” or “left side”
      the symbol →
      a “body” or “right side” consisting of zero or more terminals and nonterminals
CFG (Context Free Grammar)
• Grammar for arithmetic expressions

expression → expression + term
expression → expression - term
expression → term
term → term * factor
term → term / factor
term → factor
factor → ( expression )
factor → id
Notational convention for grammar
• Symbols for terminals
  - Lowercase letters early in the alphabet: a, b, c, …
  - Operator symbols: + * / etc.
  - Punctuation symbols: , ; etc.
  - Digits: 0, 1, 2, …, 9
  - Boldface strings: id, if etc.

• Symbols for nonterminals
  - Uppercase letters early in the alphabet: A, B, C, …
  - S usually indicates the start symbol
  - Lowercase, italic names: expr, stmt etc.
Notational convention for grammar
• X, Y, Z represent grammar symbols (either nonterminal or terminal)

• u, v, …, z represent strings of terminals

• α, β, γ represent strings of grammar symbols (terminals and/or nonterminals)

• A → α1, A → α2, …, A → αk may be written as
  A → α1 | α2 | … | αk

• Unless stated otherwise, the head of the first production is the start symbol
Language generated by grammar
• G : Grammar
  L(G) : Language generated by grammar G

• A language generated by a CFG is called a CFL (Context Free Language)

• If two grammars generate the same language, the grammars are said to be equivalent
Derivation
• Beginning with the start symbol, each rewriting step replaces a nonterminal by the body of one of its productions

Grammar: E → E + E | id
String: id + id + id
Derivation: E
            ⇒ E + E
            ⇒ E + E + E
            ⇒ id + E + E
            ⇒ id + id + E
            ⇒ id + id + id

[Figure: parse tree built step by step alongside the derivation]
Derivation
• ⇒ : derives in one step

• ⇒* : derives in zero or more steps

Grammar ( G ) : E → E + E | E * E | - E | ( E ) | id
Derive string : - ( id )

E ⇒ - E ⇒ - ( E ) ⇒ - ( id )
Can also be written as
E ⇒* - ( id )
Derivation
• Leftmost derivation
  - The leftmost nonterminal is always the one replaced by one of its productions
• Rightmost derivation (canonical derivation)
  - The rightmost nonterminal is always the one replaced by one of its productions

Grammar : E → E + E | E * E | - E | ( E ) | id
String : - ( id + id )

Leftmost derivation
E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id)
Rightmost derivation
E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(E+id) ⇒ -(id+id)
Reduction
• A substring that matches the body of a production is replaced by the nonterminal at the head of that production

Grammar: E → E + E | id
String: id + id + id
Reduction: id + id + id
           E + id + id
           E + E + id
           E + id
           E + E
           E

[Figure: parse tree built bottom-up alongside the reduction]
Parse tree
• Graphical representation of a derivation
• Parse tree for the string - ( id + id ) is

E
├── -
└── E
    ├── (
    ├── E
    │   ├── E
    │   │   └── id
    │   ├── +
    │   └── E
    │       └── id
    └── )
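Ambiguity
• A grammar that produces more than one parse tree for some string is said to be an ambiguous grammar
• Equivalently, some string has more than one leftmost derivation or more than one rightmost derivation

Grammar: E → E + E | E * E | id
String : id + id * id

[Figure: two parse trees for id + id * id. One has E * E at the root and groups the string as ( id + id ) * id; the other has E + E at the root and groups it as id + ( id * id ).]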
CFG vs. RE
• Grammars are more powerful than REs

• Everything that can be described by an RE can be described by a grammar, but not vice versa

• Every regular language is a context free language, but not vice versa
CFG vs. RE
• RE : (a|b)*abb

• Grammar :
  S → aX | aS | bS
  X → bY
  Y → bZ
  Z → ϵ
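The correspondence between the RE and the grammar above can be spot-checked with a short Python sketch. The names GRAMMAR, derives and agree_up_to are illustrative and not part of the slides, and agreement on all strings up to a bounded length is only evidence, not a proof of equivalence.

```python
import re
from itertools import product

# Right-linear grammar from the slide: head -> list of (terminal, next nonterminal).
# A next nonterminal of None marks the epsilon production Z -> ϵ.
GRAMMAR = {
    "S": [("a", "X"), ("a", "S"), ("b", "S")],
    "X": [("b", "Y")],
    "Y": [("b", "Z")],
    "Z": [("", None)],
}

def derives(symbol, text):
    """Return True if `symbol` can derive exactly `text`."""
    for terminal, nxt in GRAMMAR[symbol]:
        if nxt is None:                       # epsilon production
            if text == "":
                return True
        elif text.startswith(terminal) and derives(nxt, text[len(terminal):]):
            return True
    return False

def agree_up_to(max_len):
    """Compare the grammar with the RE on every string over {a, b} up to max_len."""
    pattern = re.compile(r"(a|b)*abb")
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            if derives("S", s) != bool(pattern.fullmatch(s)):
                return False
    return True
```

For example, derives("S", "babb") succeeds by using S → bS, S → aX, X → bY, Y → bZ and Z → ϵ in turn.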
Left recursion
• A grammar is left recursive if it has a nonterminal A such that there is a derivation A ⇒+ Aα for some string α.

• Top-down parsing methods cannot handle left-recursive grammars, so a transformation that eliminates left recursion is needed.

• Eliminate left recursion
  i/p : Grammar with left recursion : A → A a | b
  o/p : Grammar without left recursion : A → b A’
                                         A’ → a A’ | ϵ
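The transformation above can be sketched in Python for the immediate case only (productions of the form A → Aα). The function name and the tuple representation of production bodies are assumptions made for illustration; indirect left recursion is not handled.

```python
def eliminate_immediate_left_recursion(head, bodies):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon.

    `bodies` is a list of tuples of grammar symbols; the empty tuple () stands
    for epsilon. Returns a dict mapping each head to its list of bodies.
    """
    new_head = head + "'"
    # alpha parts: bodies that start with the head itself, with the head removed
    recursive = [body[1:] for body in bodies if body and body[0] == head]
    # beta parts: all remaining bodies
    others = [body for body in bodies if not body or body[0] != head]
    if not recursive:
        return {head: list(bodies)}           # nothing to eliminate
    return {
        head: [beta + (new_head,) for beta in others],
        new_head: [alpha + (new_head,) for alpha in recursive] + [()],
    }
```

Applied to A → A a | b this yields A → b A' and A' → a A' | ϵ, matching the output grammar on the slide.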
Left factoring
• Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive parsing.

• If A → αβ1 | αβ2 are two productions of A and the input string begins with a nonempty string derived from α, we do not know whether to expand A to αβ1 or αβ2.

• Left factoring
  i/p : Non left factored grammar : A → α β1 | α β2
  o/p : Left factored grammar : A → α A’
                                A’ → β1 | β2
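A minimal Python sketch of one left-factoring step follows. It makes a simplifying assumption that is not in the slides: the prefix is factored only when it is shared by all bodies passed in, so alternatives without the common prefix would have to be kept aside by the caller. The function names are illustrative.

```python
def common_prefix(bodies):
    """Longest sequence of symbols with which every body begins."""
    prefix = []
    for symbols in zip(*bodies):              # walk the bodies column by column
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return tuple(prefix)

def left_factor(head, bodies):
    """Rewrite A -> alpha beta1 | alpha beta2 as A -> alpha A', A' -> beta1 | beta2.

    `bodies` is a list of tuples of grammar symbols; () stands for epsilon.
    """
    alpha = common_prefix(bodies)
    if len(bodies) < 2 or not alpha:
        return {head: list(bodies)}           # nothing to factor
    new_head = head + "'"
    return {
        head: [alpha + (new_head,)],
        new_head: [body[len(alpha):] for body in bodies],
    }
```

With the two if-statement alternatives i E t S and i E t S e S, the shared prefix is i E t S, and the new nonterminal gets the bodies ϵ and e S.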
General types of parser

• Universal parser
  - General method
  - Can parse any grammar
  - Methods such as
      Cocke-Younger-Kasami algorithm
      Earley’s algorithm

• Top-down parser
  - Scans the string from left to right
  - Builds the parse tree from the top (root) to the bottom (leaves)
  - Performs derivation

• Bottom-up parser
  - Scans the string from left to right
  - Starts from the leaves and works up to the root
  - Performs reduction
Top-Down Parsing
• Construct the parse tree for the input string starting from the root and creating the nodes of the parse tree in preorder (derivation)

Grammar ( G ) : E → E + E | E * E | - E | ( E ) | id
String : id + id * id

E
├── E
│   └── id
├── +
└── E
    ├── E
    │   └── id
    ├── *
    └── E
        └── id
Different Top-Down Parsing Techniques
1. Recursive-Descent Parsing ( RDP )
2. Predictive Parsing

1. Recursive-Descent Parsing ( RDP )
• May require backtracking to find the correct production to apply
• A left recursive grammar can cause RDP to go into an infinite loop
1. Recursive-Descent Parsing ( RDP )
• Algorithm
void A( )
{
    choose an A-production, A → X1 X2 … Xk ;
    for ( i = 1 to k )
    {
        if ( Xi is a nonterminal )
            call procedure Xi( );
        else if ( Xi equals the current input symbol a )
            advance the input to the next symbol;
        else
            /* error occurred */ ;
    }
}
1. Recursive-Descent Parsing ( RDP )
• Process:
  - Maintain 2 pointers
      Lookahead pointer (LP): points to the top element of the stack
      Input pointer (IP): points to the current symbol in the input string
  - If LP points to a nonterminal, replace it by one of its productions; LP then points to the leftmost symbol of that production
  - If LP points to a terminal, compare it with the input symbol pointed to by IP
      If they match, advance both pointers (LP and IP)
      If they do not match, backtrack
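1. Recursive-Descent Parsing ( RDP )
Grammar:
S → c A d
A → a b | a
String : c a d

[Figure: parse trees showing the LP and IP positions at each step. S is expanded to c A d and c is matched. A is first expanded to a b: a matches, but b does not match d, so the parser backtracks. A is then expanded to a: a matches, then d matches, and the string is matched.]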
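The backtracking behaviour on this grammar can be sketched in Python. This is one possible formulation, not the slide's pointer-based procedure: each nonterminal is handled by a generator that yields every input position it can reach, so a failed alternative is abandoned and the next one is tried. The names GRAMMAR, parse and accepts are illustrative.

```python
# Grammar from the slide: S -> c A d, A -> a b | a
GRAMMAR = {
    "S": [["c", "A", "d"]],
    "A": [["a", "b"], ["a"]],
}

def parse(symbol, text, pos):
    """Yield every input position reachable after matching `symbol` from `pos`."""
    if symbol not in GRAMMAR:                 # terminal: must match the input
        if text.startswith(symbol, pos):
            yield pos + len(symbol)
        return
    for body in GRAMMAR[symbol]:              # try the alternatives in order
        positions = [pos]
        for part in body:
            # keep only the positions from which the next symbol also matches
            positions = [q for p in positions for q in parse(part, text, p)]
        yield from positions                  # empty list means: backtrack

def accepts(text):
    """True if S derives the whole of `text`."""
    return any(end == len(text) for end in parse("S", text, 0))
```

On the input cad, the alternative A → a b matches a and then fails on b, so it contributes no positions; the alternative A → a then succeeds and d is matched.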
FIRST and FOLLOW
• Used to construct top-down and bottom-up parsers

• FIRST ( α ) :
  Set of terminals that begin the strings derived from α

• FOLLOW ( α ) :
  Set of terminals that can appear immediately to the right of α
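FIRST
FIRST ( α )

• α is a terminal:
  FIRST ( α ) = { α }

• α is a nonterminal: look at each production of α
  - α → ϵ :
    add ϵ to FIRST ( α )
  - α → β γ, where β is a terminal:
    add β to FIRST ( α )
  - α → β γ, where β is a nonterminal:
    add FIRST ( β ) to FIRST ( α )
  - α → β γ, where β is a nonterminal and FIRST ( β ) contains ϵ:
    add ( FIRST ( β ) - { ϵ } ) U FIRST ( γ ) to FIRST ( α )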
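FOLLOW
FOLLOW ( α )

• α is the start symbol:
  add $ to FOLLOW ( α )

• α is a nonterminal: find α on the RHS of the productions, β → … α γ
  - γ is ϵ (α is the last symbol of the body):
    add FOLLOW ( β ) to FOLLOW ( α )
  - γ begins with a terminal:
    add that terminal to FOLLOW ( α )
  - γ begins with a nonterminal:
    add FIRST ( γ ) - { ϵ } to FOLLOW ( α )
  - FIRST ( γ ) contains ϵ:
    add ( FIRST ( γ ) - { ϵ } ) U FOLLOW ( β ) to FOLLOW ( α )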
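The FIRST and FOLLOW rules can be applied repeatedly until no set changes. The Python sketch below does this for the non-left-recursive expression grammar (E → T E', and so on); the dictionary layout, the EPS marker and the function names are assumptions made for illustration.

```python
EPS = "ϵ"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}
START = "E"

def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols, given FIRST of the nonterminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                  # every symbol can vanish (or the sequence is empty)
    return result

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                   # iterate to a fixed point
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue     # FOLLOW is only defined for nonterminals
                    rest = first_of_string(body[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:  # everything after sym can vanish
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow
```

Calling compute_follow(compute_first()) gives, for instance, FOLLOW(F) = { *, +, ), $ }.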
2. Predictive Parsing
• Special case of RDP
• No backtracking is required
• Chooses the correct production by looking ahead at the input a fixed number of symbols
• The class of grammars for which a predictive parser can be constructed by looking k symbols ahead in the input is called the LL(k) class
  - First L : left to right scan of the input string
  - Second L : leftmost derivation
  - k : number of input symbols of lookahead

2. Predictive Parsing
• LL(1) grammar
  - Covers most programming constructs
  - Properties
      Unambiguous
      No left recursion
2. Predictive Parsing
• How to construct the predictive parsing table
  1. Find the FIRST and FOLLOW sets of all nonterminals
  2. If “a” is in FIRST(X), write the production of X that can derive a string beginning with “a” (X ⇒* aβ) in [ X , a ]
  3. If “ϵ” is in FIRST(X), look at the FOLLOW set of X: place the production that puts ϵ in FIRST(X) in [ X , b ] for every b ∊ FOLLOW(X)
2. Predictive Parsing
Grammar:
E → T E’
E’ → + T E’ | ϵ
T → F T’
T’ → * F T’ | ϵ
F → ( E ) | id

FIRST ( E ) = { id , ( }      FOLLOW ( E ) = { $ , ) }
FIRST ( E’ ) = { + , ϵ }      FOLLOW ( E’ ) = { $ , ) }
FIRST ( T ) = { id , ( }      FOLLOW ( T ) = { $ , ) , + }
FIRST ( T’ ) = { * , ϵ }      FOLLOW ( T’ ) = { $ , ) , + }
FIRST ( F ) = { id , ( }      FOLLOW ( F ) = { $ , ) , + , * }

Nonterminal | Terminal
            | id    | +     | *     | (     | )  | $
E           | TE’   |       |       | TE’   |    |
E’          |       | +TE’  |       |       | ϵ  | ϵ
T           | FT’   |       |       | FT’   |    |
T’          |       | ϵ     | *FT’  |       | ϵ  | ϵ
F           | id    |       |       | (E)   |    |

No cell contains more than one production, so the grammar is LL(1)
2. Predictive Parsing
(1) Parse the string id + id

STACK           INPUT         OUTPUT
$ E             id + id $
$ E’ T          id + id $     E → T E’
$ E’ T’ F       id + id $     T → F T’
$ E’ T’ id      id + id $     F → id
$ E’ T’         + id $
$ E’            + id $        T’ → ϵ
$ E’ T +        + id $        E’ → + T E’
$ E’ T          id $
$ E’ T’ F       id $          T → F T’
$ E’ T’ id      id $          F → id
$ E’ T’         $
$ E’            $             T’ → ϵ
$               $             E’ → ϵ
2. Predictive Parsing
(2) Parse the string ( id + id ) * id

STACK                   INPUT                   OUTPUT
$ E                     ( id + id ) * id $
$ E’ T                  ( id + id ) * id $      E → T E’
$ E’ T’ F               ( id + id ) * id $      T → F T’
$ E’ T’ ) E (           ( id + id ) * id $      F → ( E )
$ E’ T’ ) E             id + id ) * id $
$ E’ T’ ) E’ T          id + id ) * id $        E → T E’
$ E’ T’ ) E’ T’ F       id + id ) * id $        T → F T’
$ E’ T’ ) E’ T’ id      id + id ) * id $        F → id
$ E’ T’ ) E’ T’         + id ) * id $
$ E’ T’ ) E’            + id ) * id $           T’ → ϵ
$ E’ T’ ) E’ T +        + id ) * id $           E’ → + T E’
$ E’ T’ ) E’ T          id ) * id $
$ E’ T’ ) E’ T’ F       id ) * id $             T → F T’
$ E’ T’ ) E’ T’ id      id ) * id $             F → id
$ E’ T’ ) E’ T’         ) * id $
$ E’ T’ ) E’            ) * id $                T’ → ϵ
$ E’ T’ )               ) * id $                E’ → ϵ
$ E’ T’                 * id $
$ E’ T’ F *             * id $                  T’ → * F T’
$ E’ T’ F               id $
$ E’ T’ id              id $                    F → id
$ E’ T’                 $
$ E’                    $                       T’ → ϵ
$                       $                       E’ → ϵ
2. Predictive Parsing
Grammar:
be → be or bt | bt
bt → bt and bf | bf
bf → not bf | ( be ) | true | false

Remove left recursion

Grammar:
be → bt B’
B’ → or bt B’ | ϵ
bt → bf A’
A’ → and bf A’ | ϵ
bf → not bf | ( be ) | true | false

FIRST ( be ) = { not , ( , true , false }     FOLLOW ( be ) = { $ , ) }
FIRST ( B’ ) = { or , ϵ }                     FOLLOW ( B’ ) = { $ , ) }
FIRST ( bt ) = { not , ( , true , false }     FOLLOW ( bt ) = { $ , ) , or }
FIRST ( A’ ) = { and , ϵ }                    FOLLOW ( A’ ) = { $ , ) , or }
FIRST ( bf ) = { not , ( , true , false }     FOLLOW ( bf ) = { $ , ) , or , and }

Nonterminal | Terminal
            | or        | and        | not     | (       | )  | true   | false  | $
be          |           |            | bt B’   | bt B’   |    | bt B’  | bt B’  |
B’          | or bt B’  |            |         |         | ϵ  |        |        | ϵ
bt          |           |            | bf A’   | bf A’   |    | bf A’  | bf A’  |
A’          | ϵ         | and bf A’  |         |         | ϵ  |        |        | ϵ
bf          |           |            | not bf  | ( be )  |    | true   | false  |

No cell contains more than one production, so the grammar is LL(1)
2. Predictive Parsing
Grammar:
S → i E t S S’ | a
S’ → e S | ϵ
E → b

FIRST ( S ) = { i , a }      FOLLOW ( S ) = { e , $ }
FIRST ( S’ ) = { e , ϵ }     FOLLOW ( S’ ) = { e , $ }
FIRST ( E ) = { b }          FOLLOW ( E ) = { t }

Nonterminal | Terminal
            | i           | t  | a  | e        | b  | $
S           | i E t S S’  |    | a  |          |    |
S’          |             |    |    | e S , ϵ  |    | ϵ
E           |             |    |    |          | b  |

The cell [ S’ , e ] contains multiple productions, so the grammar is not LL(1)
2. Predictive Parsing
Grammar:
S → ( L ) | a
L → L , S | S

Remove left recursion

Grammar:
S → ( L ) | a
L → S L’
L’ → , S L’ | ϵ

FIRST ( S ) = { ( , a }      FOLLOW ( S ) = { $ , , , ) }
FIRST ( L ) = { ( , a }      FOLLOW ( L ) = { ) }
FIRST ( L’ ) = { , , ϵ }     FOLLOW ( L’ ) = { ) }

Nonterminal | Terminal
            | (      | )  | a     | ,       | $
S           | ( L )  |    | a     |         |
L           | S L’   |    | S L’  |         |
L’          |        | ϵ  |       | , S L’  |

No cell contains more than one production, so the grammar is LL(1)
2. Predictive Parsing
Grammar:
D → type list ;
list → list , id | id
type → int | float

Remove left recursion

Grammar:
D → type list ;
list → id L’
L’ → , id L’ | ϵ
type → int | float

FIRST ( D ) = { int , float }        FOLLOW ( D ) = { $ }
FIRST ( list ) = { id }              FOLLOW ( list ) = { ; }
FIRST ( L’ ) = { , , ϵ }             FOLLOW ( L’ ) = { ; }
FIRST ( type ) = { int , float }     FOLLOW ( type ) = { id }

Nonterminal | Terminal
            | ;  | id     | ,        | int          | float        | $
D           |    |        |          | type list ;  | type list ;  |
list        |    | id L’  |          |              |              |
L’          | ϵ  |        | , id L’  |              |              |
type        |    |        |          | int          | float        |

No cell contains more than one production, so the grammar is LL(1)
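Bottom-Up Parsing
• Construct the parse tree for the input string starting at the leaves (bottom) and working up towards the root (top) (reduction)

Grammar ( G ) : E → E + E | E * E | - E | ( E ) | id
String : id + id + id

[Figure: parse tree for id + id + id built from the leaves up. Each id is reduced to E, the first E + E is reduced to E, and that E is combined with the last E to form the root E.]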
Bottom-Up Parsing
• Handle
  - A handle of a string is a substring that matches the RHS of a production and whose reduction to the LHS of that production represents one step along the reverse of a rightmost derivation
  - If S ⇒* αAw ⇒ αβw by rightmost derivation steps, then the production A → β in the position following α is a handle of αβw

Grammar : E → E + T | T
          T → T * F | F
          F → ( E ) | id
String: id * id

String    | Handle  | Reducing production
id * id   | id      | F → id
F * id    | F       | T → F
T * id    | id      | F → id
T * F     | T * F   | T → T * F
T         | T       | E → T
Bottom-Up Parsing
• Handle Pruning
  - A rightmost derivation in reverse can be obtained by handle pruning

Different Bottom-Up Parsing Techniques
1. Shift-Reduce Parsing
2. Operator Precedence Parsing
3. LR Parsing
   1) Simple LR ( SLR ), built from LR(0) items
   2) Canonical LR ( CLR or LR(1) )
   3) Lookahead LR ( LALR )
1. Shift-Reduce Parsing
• The stack holds grammar symbols
• The input buffer holds the string to be parsed
• The handle always appears at the top of the stack
• $ is used to mark the bottom of the stack and also the right end of the input
• Process:
  - During a left to right scan of the input string, shift zero or more input symbols onto the stack until it is ready to reduce a string β
  - Then reduce β to the head (LHS) of the appropriate production
  - Repeat this cycle until an error is detected or until the stack contains the start symbol and the input is empty
1. Shift-Reduce Parsing
 There are 4 possible actions

 Shift
 Shift the next input symbol onto the top of the stack

 Reduce
 Replace handle with LHS in the stack

 Accept
 Parsing complete successfully

 Error
 Discover a syntax error and call an error recovery routine

47
1. Shift-Reduce Parsing
Grammar :
E → E + E | E * E | id

String:
id + id * id

Stack           Input             Action
$               id + id * id $    Shift
$ id            + id * id $       Reduce E → id
$ E             + id * id $       Shift
$ E +           id * id $         Shift
$ E + id        * id $            Reduce E → id
$ E + E         * id $            Shift
$ E + E *       id $              Shift
$ E + E * id    $                 Reduce E → id
$ E + E * E     $                 Reduce E → E * E
$ E + E         $                 Reduce E → E + E
$ E             $                 Accept
1. Shift-Reduce Parsing
 Conflict during shift reduce parsing

 Shift / reduce conflict


 Cannot decide whether to shift or to reduce

 Reduce / reduce conflict


 Cannot decide which of several reduction to make

50
2. Operator Precedence Parsing
• Operator grammar
  - A grammar with the property (among other essential requirements) that no production right side is ϵ or has two adjacent nonterminals.

E.g. E → EAE | (E) | -E | id
     A → + | - | * | / | ^
Not an operator grammar, as EAE has consecutive nonterminals

Equivalent operator grammar:
     E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id
2. Operator Precedence Parsing
• Define precedence relations between pairs of terminals by the disjoint relation symbols ⋖, ≐ and ⋗
  - t1 ⋖ t2 : t1 has lower priority than t2
  - t1 ≐ t2 : t1 has the same priority as t2
  - t1 ⋗ t2 : t1 has higher priority than t2

• t1 ⋖ t2 and t2 ⋗ t1 are not always the same
2. Operator Precedence Parsing
• How to parse a string (using the operator precedence table)

1. Construct the operator precedence table
2. Place $ (imaginary terminal marking) at the start and end of the string (mark each end of the string)
3. Put the relation between each pair of symbols in the string
4. Scan the string from left to right until the first ⋗ is encountered
5. Then scan back over any ≐ until a ⋖ is encountered
6. The handle is everything between ⋖ and ⋗; reduce it to the LHS of the appropriate production
2. Operator Precedence Parsing
Grammar: E → E + E | E * E | id
String: id + id * id

Operator precedence table (rows: left side, columns: right side)

      | id  | +   | *   | $
id    |     | ⋗   | ⋗   | ⋗
+     | ⋖   | ⋗   | ⋖   | ⋗
*     | ⋖   | ⋗   | ⋗   | ⋗
$     | ⋖   | ⋖   | ⋖   |

+ ⋗ + because the left + has higher priority than the right +

Parsing:
$ ⋖ id ⋗ + ⋖ id ⋗ * ⋖ id ⋗ $     each id is a handle and is replaced with E
$ E + E * E $                      now compare the terminals $ + * $
$ ⋖ + ⋖ * ⋗ $                      the handle E * E is reduced to E
$ ⋖ + ⋗ $                          the handle E + E is reduced to E
$ E $
2. Operator Precedence Parsing
• Algorithm : operator precedence parsing

• Method :
Initially the stack contains $ and the input buffer contains the string w$. To parse, execute the program below.

set ip to point to the first symbol of w$;
repeat forever
    if $ is on top of the stack and ip points to $ then
        return
    else begin
        let a be the topmost terminal symbol on the stack
            and b be the symbol pointed to by ip;
        if a ⋖ b or a ≐ b then begin
            push b onto the stack;
            advance ip to the next input symbol;
        end
        else if a ⋗ b then
            repeat
                pop the stack
            until the top stack terminal is related by ⋖
                  to the terminal most recently popped
        else
            error()
    end
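The algorithm can be sketched in Python using the precedence table for E → E + E | E * E | id. As in the pseudocode, only terminals are kept on the stack. The ASCII markers "<", "=" and ">" stand in for ⋖, ≐ and ⋗; those markers and the names PREC and op_precedence_parse are choices made for this sketch.

```python
LT, EQ, GT = "<", "=", ">"

# PREC[a][b] is the relation between stack terminal a and input terminal b.
PREC = {
    "id": {"+": GT, "*": GT, "$": GT},
    "+":  {"id": LT, "+": GT, "*": LT, "$": GT},
    "*":  {"id": LT, "+": GT, "*": GT, "$": GT},
    "$":  {"id": LT, "+": LT, "*": LT},
}

def op_precedence_parse(tokens):
    """Return the terminals of each handle, in the order they are reduced."""
    tokens = list(tokens) + ["$"]
    stack = ["$"]                    # terminals only; nonterminals are not tracked
    handles = []
    ip = 0
    while True:
        a, b = stack[-1], tokens[ip]
        if a == "$" and b == "$":
            return handles
        relation = PREC[a].get(b)
        if relation in (LT, EQ):     # shift
            stack.append(b)
            ip += 1
        elif relation == GT:         # reduce: pop back to the matching ⋖
            handle = []
            while True:
                popped = stack.pop()
                handle.insert(0, popped)
                if PREC[stack[-1]].get(popped) == LT:
                    break
            handles.append(handle)
        else:
            raise SyntaxError(f"no precedence relation between {a} and {b}")
```

For id + id * id the three id handles are reduced first, then the handle containing *, and finally the handle containing +, which reflects the higher precedence of *.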
2. Operator Precedence Parsing
• Operator precedence function
  - The precedence between “a” and “b” can be determined by numerical comparison of functions f and g
      f(a) < g(b) if a ⋖ b
      f(a) = g(b) if a ≐ b
      f(a) > g(b) if a ⋗ b
2. Operator Precedence Parsing
 Algorithm : construct precedence functions

 Input :
An operator precedence matrix

 Output :
Precedence functions representing the input matrix, or an indication
that none exist

58
2. Operator Precedence Parsing
• Algorithm : construct precedence functions

• Method :
1. Create symbols “fa” and “ga” for each terminal “a” and for $
2. Partition the created symbols into as many groups as possible, in such a way that if a ≐ b then “fa” and “gb” are in the same group
3. Create a directed graph whose nodes are the groups found in step 2. For any “a” and “b”:
   if a ⋖ b then place an edge from the group of “gb” to the group of “fa”
   if a ⋗ b then place an edge from the group of “fa” to the group of “gb”
4. If the graph constructed in step 3 has a cycle, then no precedence functions exist. If there is no cycle, let f(a) be the length of the longest path beginning at the group of “fa” and g(a) be the length of the longest path beginning at the group of “ga”
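The four steps can be sketched in Python for the small table over id, +, * and $. One simplification is assumed here: that table has no ≐ relations, so every f and g symbol forms its own group and step 2 is skipped. The function name and the "<" and ">" markers (for ⋖ and ⋗) are illustrative.

```python
TERMINALS = ["id", "+", "*", "$"]

# PREC[a][b] is the relation between left terminal a and right terminal b.
PREC = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+":  {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*":  {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$":  {"id": "<", "+": "<", "*": "<"},
}

def precedence_functions(prec, terminals):
    """Return (f, g) as dicts, or raise ValueError if the graph has a cycle."""
    # Step 1 (and a trivial step 2): one node per f_a and per g_a.
    edges = {("f", t): set() for t in terminals}
    edges.update({("g", t): set() for t in terminals})
    # Step 3: a ⋗ b gives an edge f_a -> g_b; a ⋖ b gives an edge g_b -> f_a.
    for a, row in prec.items():
        for b, relation in row.items():
            if relation == ">":
                edges[("f", a)].add(("g", b))
            elif relation == "<":
                edges[("g", b)].add(("f", a))

    # Step 4: longest path from each node, with cycle detection along the path.
    def longest(node, on_path):
        if node in on_path:
            raise ValueError("cycle found: no precedence functions exist")
        return max((1 + longest(nxt, on_path | {node}) for nxt in edges[node]),
                   default=0)

    f = {t: longest(("f", t), frozenset()) for t in terminals}
    g = {t: longest(("g", t), frozenset()) for t in terminals}
    return f, g
```

The plain recursive search is adequate for a graph this small; a larger table would call for memoisation or a topological sort.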
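2. Operator Precedence Parsing
Operator precedence table (rows: left side, f; columns: right side, g)

      | id  | +   | *   | $
id    |     | ⋗   | ⋗   | ⋗
+     | ⋖   | ⋗   | ⋖   | ⋗
*     | ⋖   | ⋗   | ⋗   | ⋗
$     | ⋖   | ⋖   | ⋖   |

• Draw an edge from the greater to the lesser, e.g. f(+) > g(+), so there is an edge from f+ to g+
• For each node, find the longest path that reaches either f$ or g$

[Figure: precedence graph with the nodes fid, gid, f*, g*, f+, g+, f$, g$]

    | id | + | * | $
f   | 4  | 2 | 4 | 0
g   | 5  | 1 | 3 | 0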
2. Operator Precedence Parsing
Parse string id + id * id

    | id | + | * | $
f   | 4  | 2 | 4 | 0
g   | 5  | 1 | 3 | 0

$   id   +   id   *   id   $
0   5    2   5    4   5    0

$   E    +   E    *   E    $
0        2        4        0

$   E    +   E    $
0        2        0

$   E    $
0        0
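2. Operator Precedence Parsing
Grammar:
E → E + E | E - E | E * E | E / E | E ^ E | ( E ) | id

Operator precedence table (rows: left side, columns: right side)

      | id  | +   | -   | *   | /   | ^   | (   | )   | $
id    |     | ⋗   | ⋗   | ⋗   | ⋗   | ⋗   |     | ⋗   | ⋗
+     | ⋖   | ⋗   | ⋗   | ⋖   | ⋖   | ⋖   | ⋖   | ⋗   | ⋗
-     | ⋖   | ⋗   | ⋗   | ⋖   | ⋖   | ⋖   | ⋖   | ⋗   | ⋗
*     | ⋖   | ⋗   | ⋗   | ⋗   | ⋗   | ⋖   | ⋖   | ⋗   | ⋗
/     | ⋖   | ⋗   | ⋗   | ⋗   | ⋗   | ⋖   | ⋖   | ⋗   | ⋗
^     | ⋖   | ⋗   | ⋗   | ⋗   | ⋗   | ⋖   | ⋖   | ⋗   | ⋗
(     | ⋖   | ⋖   | ⋖   | ⋖   | ⋖   | ⋖   | ⋖   | ≐   |
)     |     | ⋗   | ⋗   | ⋗   | ⋗   | ⋗   |     | ⋗   | ⋗
$     | ⋖   | ⋖   | ⋖   | ⋖   | ⋖   | ⋖   | ⋖   |     |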
3. LR parsing
• LR : L = left to right scan of the input, R = rightmost derivation (constructed in reverse)

• Can handle left recursive grammars

• Cannot handle ambiguous grammars

3. LR parsing
• 3 types
  - Simple LR (SLR)
      Built from LR(0) item sets
  - Canonical LR (CLR)
      Handles LR(1) grammars
  - Lookahead LR (LALR)
3. LR parsing : (1) SLR
• Weakest of the 3 LR methods

• Process
  - Convert the grammar into an augmented grammar by adding a new start symbol S’ with the production S’ → S
  - Construct the LR(0) item sets
  - Construct the parsing table
3. LR parsing : (1) SLR
• How to construct item set

• Production A → XYZ yields 4 items
  A → . X Y Z
  A → X . Y Z
  A → X Y . Z
  A → X Y Z .

• Production A → ϵ generates only one item, A → .
3. LR parsing : (1) SLR
• How to construct item set

• Closure function
  - If I is a set of items for a grammar G, then closure(I) is constructed by two rules
    1. Initially, every item in I is in closure(I)
    2. If A → α . B β is in closure(I) and B → γ is a production, then B → . γ is added to closure(I); apply this rule until no new item can be added

• GOTO function
  - goto( I , x ), where “I” is an item set and “x” is a grammar symbol
  - goto( I , x ) = closure of the set of all items [ A → α x . β ] such that [ A → α . x β ] is in I
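Both functions are short to express in Python. The sketch below uses the augmented expression grammar (E' → E, E → E + T | T, T → T * F | F, F → ( E ) | id) and represents an item as a pair (production index, dot position); that representation and the function names are assumptions of the sketch. The item sets come out in a different order than a hand construction would number them.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}

def closure(items):
    """Close a set of (production index, dot position) items."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            # dot is before a nonterminal B: add B -> . gamma for each production of B
            for i, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` wherever possible, then take the closure."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def canonical_collection():
    """All LR(0) item sets reachable from closure({E' -> . E})."""
    states = [closure({(0, 0)})]
    symbols = sorted({s for _, body in GRAMMAR for s in body})
    for state in states:             # the list grows while it is being scanned
        for sym in symbols:
            nxt = goto(state, sym)
            if nxt and nxt not in states:
                states.append(nxt)
    return states
```

For this grammar canonical_collection() returns 12 item sets, the same number as the sets I0 to I11 of the SLR example.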
3. LR parsing : (1) SLR
• How to construct parsing table

1. If S’ → S . is in Ii, then set action[ i , $ ] to “accept”

2. If A → α . x β is in Ii, where “x” is a terminal, and goto( Ii , x ) = Ij, then set action[ i , x ] to “shift j”

3. If A → α . B β is in Ii, where B is a nonterminal, then set goto[ i , B ] to the j for which Ij = goto( Ii , B ), the item set that contains A → α B . β

4. If A → α . is in Ii, then set action[ i , x ] to “reduce A → α” for all “x” in FOLLOW(A)
3. LR parsing : (1) SLR
Grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id

Make the augmented grammar by adding the new start symbol E’

Augmented Grammar:
E’ → E
E → E + T | T
T → T * F | F
F → ( E ) | id

Give numbers to the productions (needed in table generation)
0) E’ → E
1) E → E + T
2) E → T
3) T → T * F
4) T → F
5) F → ( E )
6) F → id

Find the FOLLOW set of all nonterminals
FOLLOW( E ) = { $ , ) , + }
FOLLOW( T ) = { $ , ) , + , * }
FOLLOW( F ) = { $ , ) , + , * }
3. LR parsing : (1) SLR
Construct LR(0) item sets

• I0 starts with E’ → . E. The dot is before the nonterminal E, so add the productions of E. E → . T has the dot before T, so add the productions of T. T → . F has the dot before F, so add the productions of F.
• For every symbol that follows a dot, prepare the goto. A new item set is given a new name; an item set that matches a previous item set is given the same name.

I0 :                       E’ → . E
                           E → . E + T
                           E → . T
                           T → . T * F
                           T → . F
                           F → . ( E )
                           F → . id

I1 = goto( I0 , E ) :      E’ → E .
                           E → E . + T

I2 = goto( I0 , T ) :      E → T .
                           T → T . * F

I3 = goto( I0 , F ) :      T → F .

I4 = goto( I0 , ( ) :      F → ( . E )
                           E → . E + T
                           E → . T
                           T → . T * F
                           T → . F
                           F → . ( E )
                           F → . id

I5 = goto( I0 , id ) :     F → id .

I6 = goto( I1 , + ) :      E → E + . T
                           T → . T * F
                           T → . F
                           F → . ( E )
                           F → . id

I7 = goto( I2 , * ) :      T → T * . F
                           F → . ( E )
                           F → . id

I8 = goto( I4 , E ) :      F → ( E . )
                           E → E . + T

I2 = goto( I4 , T )
I3 = goto( I4 , F )
I4 = goto( I4 , ( )
I5 = goto( I4 , id )

I9 = goto( I6 , T ) :      E → E + T .
                           T → T . * F

I3 = goto( I6 , F )
I4 = goto( I6 , ( )
I5 = goto( I6 , id )

I10 = goto( I7 , F ) :     T → T * F .

I4 = goto( I7 , ( )
I5 = goto( I7 , id )

I11 = goto( I8 , ) ) :     F → ( E ) .

I6 = goto( I8 , + )
I7 = goto( I9 , * )
3. LR parsing : (1) SLR

[Figure: goto graph of the LR(0) item sets I0 to I11, with one edge per goto transition, each labelled with its grammar symbol]
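3. LR parsing : (1) SLR
• I5 = goto( I0 , id ), so in ( 0 , id ) do the entry S5 (shift 5)
• I1 = goto( I0 , E ), so in ( 0 , E ) do the entry 1
• I3 has the item T → F . (dot at the end). The production number of T → F is 4 and FOLLOW(T) = { + , * , ) , $ }, so in ( 3 , + ) ( 3 , * ) ( 3 , ) ) ( 3 , $ ) do the entry R4 (reduce 4)

Item |              action               |    goto
set  | id  | +   | *   | (   | )   | $   | E  | T  | F
0    | S5  |     |     | S4  |     |     | 1  | 2  | 3
1    |     | S6  |     |     |     | Acc |    |    |
2    |     | R2  | S7  |     | R2  | R2  |    |    |
3    |     | R4  | R4  |     | R4  | R4  |    |    |
4    | S5  |     |     | S4  |     |     | 8  | 2  | 3
5    |     | R6  | R6  |     | R6  | R6  |    |    |
6    | S5  |     |     | S4  |     |     |    | 9  | 3
7    | S5  |     |     | S4  |     |     |    |    | 10
8    |     | S6  |     |     | S11 |     |    |    |
9    |     | R1  | S7  |     | R1  | R1  |    |    |
10   |     | R3  | R3  |     | R3  | R3  |    |    |
11   |     | R5  | R5  |     | R5  | R5  |    |    |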
3. LR parsing : (1) SLR
Parse the string id * id + id

Stack                Input             Action
0                    id * id + id $    Shift
0 id 5               * id + id $       Reduce by F → id
0 F 3                * id + id $       Reduce by T → F
0 T 2                * id + id $       Shift
0 T 2 * 7            id + id $         Shift
0 T 2 * 7 id 5       + id $            Reduce by F → id
0 T 2 * 7 F 10       + id $            Reduce by T → T * F
0 T 2                + id $            Reduce by E → T
0 E 1                + id $            Shift
0 E 1 + 6            id $              Shift
0 E 1 + 6 id 5       $                 Reduce by F → id
0 E 1 + 6 F 3        $                 Reduce by T → F
0 E 1 + 6 T 9        $                 Reduce by E → E + T
0 E 1                $                 Accept

• The parsing table entry ( 0 , id ) is S5, so the action is Shift: shift one element from the input to the stack and place 5 after it on the stack
• The parsing table entry ( 5 , * ) is R6, so the action is Reduce with production 6 ( F → id ): find id in the stack and replace it with F, so the stack becomes 0 F
• The parsing table entry ( 0 , F ) is 3, so place 3 on the stack
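The shift and reduce steps of the trace can be reproduced with a small table-driven loop in Python. The ACTION and GOTO dictionaries are transcribed from the SLR table for the expression grammar, and only state numbers are kept on the stack, since the grammar symbols are implied by the states. The names ACTION, GOTO, PRODUCTIONS and lr_parse are illustrative.

```python
# Production number -> (head, length of body), as numbered in the SLR example.
PRODUCTIONS = {
    1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
    4: ("T", 1), 5: ("F", 3), 6: ("F", 1),
}
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}

def lr_parse(tokens):
    """Return the production numbers used, in reduction order."""
    tokens = list(tokens) + ["$"]
    states = [0]
    reductions = []
    pos = 0
    while True:
        act = ACTION[states[-1]].get(tokens[pos])
        if act is None:
            raise SyntaxError(f"unexpected {tokens[pos]} in state {states[-1]}")
        if act == "acc":
            return reductions
        if act[0] == "s":                     # shift: push the new state
            states.append(int(act[1:]))
            pos += 1
        else:                                 # reduce by production `number`
            number = int(act[1:])
            head, length = PRODUCTIONS[number]
            del states[-length:]              # bodies here are never empty
            states.append(GOTO[states[-1]][head])
            reductions.append(number)
```

For id * id + id the reductions are 6, 4, 6, 3, 2, 6, 4, 1, which is a rightmost derivation in reverse.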
3. LR parsing : (2) CLR
• Process
  - Convert the grammar into an augmented grammar by adding a new start symbol S’ with the production S’ → S
  - Construct the LR(1) item sets
  - Construct the parsing table
3. LR parsing : (2) CLR
• How to construct item set

• Closure function
  - If I is a set of items for a grammar G, then closure(I) is constructed by two rules
    1. Initially, every item in I is in closure(I)
    2. If [ A → α . B β , a ] is in closure(I) and B → γ is a production, then [ B → . γ , b ] is added to closure(I) for every b in FIRST(βa); apply this rule until no new item can be added

• GOTO function
  - goto( I , x ), where “I” is an item set and “x” is a grammar symbol
  - goto( I , x ) = closure of the set of all items [ A → α x . β , a ] such that [ A → α . x β , a ] is in I
3. LR parsing : (2) CLR
• How to construct parsing table

1. If [ S’ → S . , $ ] is in Ii, then set action[ i , $ ] to “accept”

2. If [ A → α . x β , a ] is in Ii, where “x” is a terminal, and goto( Ii , x ) = Ij, then set action[ i , x ] to “shift j”

3. If [ A → α . B β , a ] is in Ii, where B is a nonterminal, then set goto[ i , B ] to the j for which Ij = goto( Ii , B ), the item set that contains [ A → α B . β , a ]

4. If [ A → α . , a ] is in Ii, then set action[ i , a ] to “reduce A → α”
3. LR parsing : (2) CLR
Grammar:
S → CC
C → cC | d

Make the augmented grammar

Augmented Grammar:
S’ → S
S → CC
C → cC | d

Give numbers to the productions (needed in table generation)
0) S’ → S
1) S → CC
2) C → cC
3) C → d

Find the FIRST set of all nonterminals
FIRST ( S ) = { c , d }
FIRST ( C ) = { c , d }
3. LR parsing : (2) CLR
Construct LR(1) item sets

• S’ → . S , $ is compared with A → α . B β , a. Here β is ϵ and a is $, so FIRST(βa) = { $ }, and $ is added as the lookahead of S → . CC

I0 :                      S’ → . S , $
                          S → . CC , $
                          C → . cC , c/d
                          C → . d , c/d

I1 = goto( I0 , S ) :     S’ → S . , $

I2 = goto( I0 , C ) :     S → C . C , $
                          C → . cC , $
                          C → . d , $

I3 = goto( I0 , c ) :     C → c . C , c/d
                          C → . cC , c/d
                          C → . d , c/d

I4 = goto( I0 , d ) :     C → d . , c/d

I5 = goto( I2 , C ) :     S → CC . , $

I6 = goto( I2 , c ) :     C → c . C , $
                          C → . cC , $
                          C → . d , $

I7 = goto( I2 , d ) :     C → d . , $

I8 = goto( I3 , C ) :     C → cC . , c/d

I3 = goto( I3 , c )
I4 = goto( I3 , d )

I9 = goto( I6 , C ) :     C → cC . , $

I6 = goto( I6 , c )
I7 = goto( I6 , d )
3. LR parsing : (2) CLR

[Figure: goto graph of the LR(1) item sets I0 to I9, with one edge per goto transition, each labelled with its grammar symbol]
3. LR parsing : (2) CLR
• I3 = goto( I0 , c ), so do the entry S3 (shift 3) in ( 0 , c )
• I2 = goto( I0 , C ), so do the entry 2 in ( 0 , C )
• I4 contains the item C → d . , c/d. The dot is at the end, so a reduce entry is needed. C → d has production number 3 and the lookaheads are c and d, so do the entry R3 in ( 4 , c ) and ( 4 , d )

Item |     action      |  goto
set  | c   | d   | $   | S  | C
0    | S3  | S4  |     | 1  | 2
1    |     |     | Acc |    |
2    | S6  | S7  |     |    | 5
3    | S3  | S4  |     |    | 8
4    | R3  | R3  |     |    |
5    |     |     | R1  |    |
6    | S6  | S7  |     |    | 9
7    |     |     | R3  |    |
8    | R2  | R2  |     |    |
9    |     |     | R2  |    |
3. LR parsing : (3) LALR
• Process
  - Convert the grammar into an augmented grammar by adding a new start symbol S’ with the production S’ → S
  - Construct the LR(1) item sets
  - Combine the item sets that have the same core but different lookaheads
  - Construct the parsing table
3. LR parsing : (3) LALR
Grammar:
S → CC
C → cC | d

Make the augmented grammar

Augmented Grammar:
S’ → S
S → CC
C → cC | d

Give numbers to the productions (needed in table generation)
0) S’ → S
1) S → CC
2) C → cC
3) C → d

Find the FIRST set of all nonterminals
FIRST ( S ) = { c , d }
FIRST ( C ) = { c , d }
3. LR parsing : (3) LALR
Construct LR(1) item sets, then combine the sets that have the same core

I0 :     S’ → . S , $
         S → . CC , $
         C → . cC , c/d
         C → . d , c/d

I1 :     S’ → S . , $

I2 :     S → C . C , $
         C → . cC , $
         C → . d , $

I5 :     S → CC . , $

I3 :     C → c . C , c/d          I6 :     C → c . C , $
         C → . cC , c/d                    C → . cC , $
         C → . d , c/d                     C → . d , $
I3 and I6 combine into
I36 :    C → c . C , c/d/$
         C → . cC , c/d/$
         C → . d , c/d/$

I4 :     C → d . , c/d            I7 :     C → d . , $
I4 and I7 combine into
I47 :    C → d . , c/d/$

I8 :     C → cC . , c/d           I9 :     C → cC . , $
I8 and I9 combine into
I89 :    C → cC . , c/d/$
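The merging step can be sketched in Python. The LR(1) sets are written out by hand from the example (only the sets involved in merging, plus I5 for contrast), each as a mapping from core item to a string of lookahead characters. That encoding and the name merge_by_core are assumptions made for this sketch.

```python
from collections import defaultdict

# state number -> {core item: lookahead characters}
LR1_SETS = {
    3: {"C -> c.C": "cd", "C -> .cC": "cd", "C -> .d": "cd"},
    6: {"C -> c.C": "$", "C -> .cC": "$", "C -> .d": "$"},
    4: {"C -> d.": "cd"},
    7: {"C -> d.": "$"},
    8: {"C -> cC.": "cd"},
    9: {"C -> cC.": "$"},
    5: {"S -> CC.": "$"},
}

def merge_by_core(lr1_sets):
    """Combine the item sets whose cores (items without lookaheads) are equal."""
    groups = defaultdict(list)
    for state, items in lr1_sets.items():
        groups[frozenset(items)].append(state)    # the dict keys form the core
    merged = {}
    for core, states in groups.items():
        name = "".join(str(s) for s in sorted(states))
        merged[name] = {
            # union of the lookaheads of this item over all merged states
            item: "".join(sorted(set("".join(lr1_sets[s][item] for s in states))))
            for item in core
        }
    return merged
```

The merged names are built by concatenating the original state numbers, so the result contains the states 36, 47 and 89, while 5 stays unchanged.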
3. LR parsing : (3) LALR

Item |      action       |  goto
set  | c    | d    | $   | S  | C
0    | S36  | S47  |     | 1  | 2
1    |      |      | Acc |    |
2    | S36  | S47  |     |    | 5
36   | S36  | S47  |     |    | 89
47   | R3   | R3   | R3  |    |
5    |      |      | R1  |    |
89   | R2   | R2   | R2  |    |
Using Ambiguous Grammar
• Every ambiguous grammar fails to be LR

• Certain types of ambiguous grammars are useful in the specification and implementation of languages

• An ambiguous grammar can provide a shorter, more natural specification than any equivalent unambiguous grammar
Using Ambiguous Grammar
• Ambiguous grammar
  E → E + E | E * E | (E) | id
  - The grammar is ambiguous because it does not specify the associativity and precedence of the operators + and *

• Equivalent unambiguous grammar
  E → E + T | T
  T → T * F | F
  F → (E) | id
  - Generates the same language but gives + a lower precedence than * and makes both operators left associative
Using Ambiguous Grammar
• Reasons one might want to use an ambiguous grammar instead of an unambiguous one

1. The associativities and precedence levels of the operators can easily be changed without disturbing the productions of the ambiguous grammar or the number of states in the resulting parser.

2. The parser for the unambiguous grammar spends a large fraction of its time reducing by the productions E → T and T → F. The parser for the ambiguous grammar does not waste time reducing by these single productions.
Syntax Analyzer Generator YACC

Yacc source file (.y) → Yacc compiler → y.tab.c
y.tab.c → C compiler → a.out
Input stream → a.out → Sequence of tokens
Structure of Yacc Program
declarations
%%
translation rules
%%
supporting C routines