
AT&CD Unit 2


Sir C R Reddy College of Engineering Department of Information Technology

UNIT - II
Context free grammars and parsing: context free grammars, derivation, parse trees,
ambiguity, LL(k) grammars and LL(1) parsing.
Bottom up parsing: handle pruning, LR grammar parsing, LALR parsing, parsing ambiguous
grammars, YACC programming specification.

Context free grammars


A context free grammar (CFG) is a formal grammar that is used to generate all possible strings
in a given formal language.
A context free grammar G is defined as a 4-tuple:
G = (V, T, P, S)
Where,
V is a finite set of non-terminal symbols
T is a finite set of terminal symbols
P is a finite set of production rules
S is the start symbol (S is a member of V).
In a CFG, every derivation begins at the start symbol. A string is derived by repeatedly
replacing a non-terminal by the right hand side of one of its productions, until all non-terminals have
been replaced by terminal symbols.
Example:
L = {w c w^R | w ∈ {a, b}*}, where w^R is the reverse of w
Production rules:
S → aSa
S → bSb
S→c
Now check that the string abbcbba can be derived from the given CFG.
S ⇒ aSa
⇒ abSba
⇒ abbSbba
⇒ abbcbba
By applying the production S → aSa once, then S → bSb twice, and finally the production S
→ c, we get the string abbcbba.
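The derivation above can be reproduced mechanically. The following is a minimal sketch, assuming the three productions S → aSa, S → bSb and S → c; the function name `derive` is hypothetical and only serves this illustration.

```python
def derive(target: str) -> list[str]:
    """Return the sentential forms deriving target, or [] if target is not in L."""
    forms = ["S"]
    prefix, suffix, rest = "", "", target
    # Peel matching outer symbols: each pair corresponds to S -> aSa or S -> bSb.
    while len(rest) >= 3 and rest[0] == rest[-1] and rest[0] in "ab":
        prefix += rest[0]
        suffix = rest[-1] + suffix
        rest = rest[1:-1]
        forms.append(prefix + "S" + suffix)
    if rest != "c":  # the centre must come from S -> c
        return []
    forms.append(prefix + "c" + suffix)
    return forms


print(" => ".join(derive("abbcbba")))
```

An empty list means the string cannot be derived, for example for "abcab".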
Capabilities of CFG
The capabilities of a CFG include:
o Context free grammars are useful to describe most programming languages.
o If the grammar is properly designed then an efficient parser can be constructed
automatically.
o Using associativity and precedence information, suitable grammars for
expressions can be constructed.
o Context free grammars are capable of describing nested structures such as balanced
parentheses, matching begin-end, corresponding if-then-else's and so on.

Automata Theory & Compiler Design



Derivation
A derivation is a sequence of production rule applications that produces the input string
from the start symbol. During parsing we have to take two decisions. These are as follows:
o We have to decide the non-terminal which is to be replaced.
o We have to decide the production rule by which the non-terminal will be replaced.
There are two standard options for choosing which non-terminal to replace.
Left-most Derivation
In a leftmost derivation, the leftmost non-terminal of the current sentential form is replaced
at every step, so the string is produced from left to right.
Example:
Production rules:
S → S + S
S → S - S
S → a | b | c
Input:
a-b+c
The left-most derivation is:
S ⇒ S+S
⇒ S-S+S
⇒ a-S+S
⇒ a-b+S
⇒ a-b+c
Right-most Derivation
In a rightmost derivation, the rightmost non-terminal of the current sentential form is replaced
at every step, so the string is produced from right to left.
Example:
S → S + S
S → S - S
S → a | b | c
Input:
a-b+c
The right-most derivation is:
S ⇒ S-S
⇒ S-S+S
⇒ S-S+c
⇒ S-b+c
⇒ a-b+c

Parse tree
o A parse tree is the graphical representation of a derivation. Each node is labelled with a
grammar symbol, which can be a terminal or a non-terminal.
o In parsing, the string is derived using the start symbol. The root of the parse tree is that
start symbol.
o A parse tree follows the precedence of operators. The deepest sub-tree is evaluated first, so
the operator in a parent node has lower precedence than the operator in its sub-tree.
A parse tree has these properties:
o All leaf nodes have to be terminals.
o All interior nodes have to be non-terminals.
o Reading the leaves from left to right gives the original input string.
Example:

Production rules:
T → T + T | T * T
T → a | b | c
Input:
a*b+c
Steps 1 to 5: (parse tree figures not reproduced)

Ambiguity
A grammar is said to be ambiguous if there exists more than one leftmost derivation,
more than one rightmost derivation, or more than one parse tree for some input string. If the
grammar is not ambiguous then it is called unambiguous.
Example:
S → aSb | SS
S → ε
For the string aabb, the above grammar generates more than one parse tree:


An ambiguous grammar is not good for compiler construction. No general method
can automatically detect and remove ambiguity, but it can be removed by re-writing
the grammar without ambiguity.
A grammar G is said to be ambiguous if it has more than one parse tree (equivalently, more
than one leftmost or rightmost derivation) for at least one string.
Example
E → E + E
E → E - E
E → id
For the string id + id - id, the above grammar generates two parse trees:
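The claim of two parse trees can be checked by counting. The sketch below assumes the grammar E → E + E | E - E | id and counts the parse trees of a token sequence by trying every operator as the root; `count_parse_trees` is a hypothetical helper name, not part of any library.

```python
from functools import lru_cache


def count_parse_trees(tokens: tuple[str, ...]) -> int:
    """Count the parse trees of tokens under E -> E + E | E - E | id."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of ways E can derive tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "-"):  # try this operator as the root of the tree
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(("id", "+", "id", "-", "id")))
```

A result greater than 1 for any string shows that the grammar is ambiguous.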
A language is said to be inherently ambiguous if every grammar that generates it is ambiguous.
Ambiguity in a grammar can be removed either by re-writing the whole grammar
without ambiguity, or by setting and following associativity and precedence constraints.
Associativity
If an operand has operators on both sides, the side on which the operator takes this
operand is decided by the associativity of those operators. If the operation is left-associative, then
the operand will be taken by the left operator or if the operation is right-associative, the right
operator will take the operand.
Example
Operations such as Addition, Multiplication, Subtraction, and Division are left associative. If the
expression contains:
id op id op id
it will be evaluated as:
(id op id) op id
For example, (id + id) + id
Operations like Exponentiation are right associative, i.e., the order of evaluation in the same
expression will be:
id op (id op id)
For example, id ^ (id ^ id)
Precedence
If two different operators share a common operand, the precedence of operators decides
which will take the operand. That is, 2+3*4 can have two different parse trees, one
corresponding to (2+3)*4 and another corresponding to 2+(3*4). By setting precedence among
operators, this problem can be easily removed. As in the previous example, mathematically *
(multiplication) has precedence over + (addition), so the expression 2+3*4 will always be
interpreted as:
2 + (3 * 4)
These methods decrease the chances of ambiguity in a language or its grammar.
Left Recursion
A grammar is left-recursive if it has any non-terminal ‘A’ whose derivation
contains ‘A’ itself as the left-most symbol. Left recursion is a
problematic situation for top-down parsers. Top-down parsers start parsing from the start
symbol, which is itself a non-terminal. When the parser expands a non-terminal and meets the
same non-terminal again as the left-most symbol without consuming any input, it cannot judge
when to stop expanding it and goes into an infinite loop.
Example:
(1) A → Aα | β
(2) S → Aα | β
A → Sd
(1) is an example of immediate left recursion, where A is any non-terminal symbol and α
represents a string of grammar symbols (terminals and non-terminals).
(2) is an example of indirect left recursion. A top-down parser expanding S will first expand A, which in
turn yields a string beginning with S itself, so the parser may go into a loop forever.
Removal of Left Recursion
One way to remove left recursion is to use the following technique:
The production
A → Aα | β
is converted into the following productions
A → βA'
A' → αA' | ε
This does not change the strings derived from the grammar, but it removes immediate left
recursion.
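The transformation A → Aα | β into A → βA', A' → αA' | ε can be sketched as follows. This is a minimal illustration, assuming productions are stored as lists of symbols with the empty list standing for ε and that no production has the form A → A; the function name is hypothetical.

```python
def remove_immediate_left_recursion(head, bodies):
    """Rewrite head -> head alpha | beta as head -> beta head', head' -> alpha head' | eps."""
    new_head = head + "'"
    # The alphas: what follows the left-recursive occurrence of head.
    alphas = [body[1:] for body in bodies if body and body[0] == head]
    # The betas: alternatives that do not start with head.
    betas = [body for body in bodies if not body or body[0] != head]
    if not alphas:
        return {head: bodies}  # nothing to do
    return {
        head: [beta + [new_head] for beta in betas],
        new_head: [alpha + [new_head] for alpha in alphas] + [[]],  # [] is epsilon
    }


# E -> E + T | T   becomes   E -> T E',  E' -> + T E' | epsilon
print(remove_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```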
The second method is to use the following algorithm, which eliminates all direct and
indirect left recursion.
START
Arrange non-terminals in some order like A1, A2, A3,…, An
for each i from 1 to n
{
for each j from 1 to i-1
{
replace each production of form Ai → Aj𝜸
with Ai → δ1𝜸 | δ2𝜸 | δ3𝜸 |…| δk𝜸
where Aj → δ1 | δ2 |…| δk are current Aj productions
}
eliminate immediate left-recursion among the Ai productions
}
END
Example
The production set
S → Aα | β
A → Sd
after applying the above algorithm, becomes
S → Aα | β
A → Aαd | βd
and then, remove immediate left recursion using the first technique.
A → βdA'
A' → αdA' | ε
Now none of the productions has either direct or indirect left recursion.
Left Factoring
If more than one production of a non-terminal has a common prefix string, then the top-down
parser cannot make a choice as to which of the productions it should take to parse the string
in hand.
Example
If a top-down parser encounters a production like
A → αβ | α𝜸 | …
then it cannot determine which production to follow to parse the string, as both
productions start with the same prefix α. To remove this confusion, we
use a technique called left factoring.
Left factoring transforms the grammar to make it useful for top-down parsers. In this
technique, we make one production for each common prefix and the rest of each alternative is
moved into new productions.
Example
The above productions can be written as
A → αA'
A' → β | 𝜸 | …
Now the parser has only one production per prefix which makes it easier to take decisions.
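One round of left factoring can be sketched as below. This is an illustrative version only: alternatives are grouped by their first symbol, the longest common prefix of each group is factored out, and new non-terminals are named by appending a prime. The function names and the sample grammar A → a b | a c | d are assumptions made for the example.

```python
from collections import defaultdict


def common_prefix(bodies):
    """Longest prefix shared by all symbol lists in bodies."""
    prefix = []
    for column in zip(*bodies):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


def left_factor(head, bodies):
    """One round of left factoring; bodies are lists of symbols, [] is epsilon."""
    groups = defaultdict(list)
    for body in bodies:
        groups[body[0] if body else None].append(body)  # group by first symbol
    result = {head: []}
    fresh = head
    for first, group in groups.items():
        if first is None or len(group) == 1:
            result[head].extend(group)  # nothing to factor in this group
            continue
        fresh += "'"  # name of the new non-terminal
        prefix = common_prefix(group)
        result[head].append(prefix + [fresh])
        result[fresh] = [body[len(prefix):] for body in group]
    return result


# A -> a b | a c | d   becomes   A -> a A' | d,  A' -> b | c
print(left_factor("A", [["a", "b"], ["a", "c"], ["d"]]))
```

If the new productions still share a prefix, the same function can be applied to them again.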
First and Follow Sets
An important part of parser table construction is to create FIRST and FOLLOW sets. These sets
tell the parser which terminals can appear at particular positions in a derivation. They are used to fill the
parsing table, where an entry T[A, t] = α means that the non-terminal A is replaced using the
production A → α when the next input terminal is t.
First Set
This set is created to know which terminal symbols can appear in the first position of a string
derived from a symbol. For example,
α → tβ
That is, α derives t (a terminal) in the very first position. So, t ∈ FIRST(α).
Algorithm for calculating First set
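Look at the definition of FIRST(α) set:
o if α is a terminal, then FIRST(α) = { α }.
o if α is a non-terminal and α → ε is a production, then ε is in FIRST(α).
o if α is a non-terminal and α → 𝜸1 𝜸2 𝜸3 … 𝜸n is a production, then every terminal t in
FIRST(𝜸1) is in FIRST(α). If 𝜸1 … 𝜸i-1 can all derive ε, then the terminals in FIRST(𝜸i) are
also in FIRST(α), and if every 𝜸i can derive ε, then ε is in FIRST(α).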
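The rules for FIRST can be applied repeatedly until no set changes. The following is a minimal sketch of that fixed-point computation, assuming a grammar stored as a dictionary from non-terminal to a list of bodies (lists of symbols, with the empty list standing for ε). The sample grammar S → A B, A → a | ε, B → b is hypothetical and chosen to exercise the ε rule.

```python
EPS = "ε"


def first_sets(grammar):
    """Compute FIRST for every non-terminal of grammar by fixed-point iteration."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                before = len(first[head])
                all_nullable = True
                for sym in body:
                    # FIRST of a terminal is the terminal itself.
                    sym_first = first[sym] if sym in grammar else {sym}
                    first[head] |= sym_first - {EPS}
                    if EPS not in sym_first:
                        all_nullable = False
                        break  # later symbols cannot contribute
                if all_nullable:
                    first[head].add(EPS)  # every symbol (or none) can vanish
                if len(first[head]) != before:
                    changed = True
    return first


sample = {"S": [["A", "B"]], "A": [["a"], []], "B": [["b"]]}
result = first_sets(sample)
print(sorted(result["S"]))
```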
Follow Set
Likewise, we calculate which terminal symbols can immediately follow a non-terminal α in
a derivation. We do not consider what the non-terminal can generate but instead, we see
which terminal symbols can come next after the string generated by the non-terminal.
Algorithm for calculating Follow set:
o if α is the start symbol, then $ is in FOLLOW(α).
o if α is a non-terminal and has a production α → AB, then everything in FIRST(B)
except ε is in FOLLOW(A).
o if α is a non-terminal and has a production α → AB, then everything in FOLLOW(α) is in
FOLLOW(B), and if B ⇒* ε it is also in FOLLOW(A).
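The FOLLOW rules can also be run to a fixed point. The sketch below is self-contained, so it recomputes FIRST with a small helper; the function names and the dictionary representation of the grammar are assumptions for illustration. It is applied to the expression grammar E → E + T | T, T → T * F | F, F → id.

```python
EPS, END = "ε", "$"


def first_of(seq, first, grammar):
    """FIRST of a sequence of symbols, given FIRST of each non-terminal."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in grammar else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    return out | {EPS}  # the whole sequence can derive epsilon


def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of(body, first, grammar)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def follow_sets(grammar, start):
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)  # rule 1: $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue  # FOLLOW is only defined for non-terminals
                    rest = first_of(body[i + 1:], first, grammar)
                    new = rest - {EPS}  # rule 2
                    if EPS in rest:
                        new |= follow[head]  # rule 3
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


expr = {"E": [["E", "+", "T"], ["T"]], "T": [["T", "*", "F"], ["F"]], "F": [["id"]]}
print({nt: sorted(s) for nt, s in follow_sets(expr, "E").items()})
```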
Limitations of Syntax Analyzers
Syntax analyzers receive their inputs, in the form of tokens, from lexical analyzers. Lexical
analyzers are responsible for the validity of the tokens supplied to the syntax analyzer. Syntax
analyzers have the following drawbacks:
o they cannot determine if a token is valid,
o they cannot determine if a token is declared before it is used,
o they cannot determine if a token is initialized before it is used,
o they cannot determine if an operation performed on a token type is valid or not.

Parser
The parser is the phase of the compiler that breaks the data coming from the lexical
analysis phase into smaller elements.
A parser takes input in the form of a sequence of tokens and produces output in the form of a parse
tree.
Parsing is of two types: top down parsing and bottom up parsing.

Top down parsing

o Top down parsing is also known as recursive descent parsing or predictive parsing.
o Top down parsing is used to construct a parse tree for an input string.
o In top down parsing, the parsing starts from the start symbol and transforms it into the
input string.
Parse Tree representation of input string "acdb" is as follows:


Bottom up parsing
o Bottom up parsing is also known as shift-reduce parsing.
o Bottom up parsing is used to construct a parse tree for an input string.
o In bottom up parsing, the parsing starts with the input symbols and constructs the parse
tree up to the start symbol by tracing out the rightmost derivation of the string in reverse.
Example
Production
E→T
T→T*F
T → id
F→T
F → id
Parse Tree representation of input string "id * id" is as follows:


Bottom up parsing is classified into the following types:

1. Shift-Reduce Parsing
2. Operator Precedence Parsing
3. Table Driven LR Parsing
a. LR( 0 )
b. SLR( 1 )
c. CLR ( 1 )
d. LALR( 1 )
Shift reduce parsing
o Shift reduce parsing is a process of reducing a string to the start symbol of a grammar.
o Shift reduce parsing uses a stack to hold grammar symbols and an input tape to hold the string.
o Shift reduce parsing performs two actions: shift and reduce. That's why it is known as
shift reduce parsing.
o At the shift action, the current symbol in the input string is pushed onto the stack.
o At each reduction, the symbols on top of the stack that match the right side of a production
are replaced by the non-terminal on the left side of that production.
Example:
Grammar:
S → S+S
S → S-S
S → (S)
S→a
Input string:
a1-(a2+a3)
Parsing table:
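The stack, input and action columns for this example can be regenerated with a small sketch. It assumes a simple strategy that is not stated in the notes: reduce whenever the top of the stack matches the right side of a production, otherwise shift. This resolves the grammar's ambiguity in favour of left association and is enough for this input, but it is not a general shift reduce parser. The tokens a1, a2 and a3 are all written as a.

```python
RULES = [
    (["S", "+", "S"], "S"),
    (["S", "-", "S"], "S"),
    (["(", "S", ")"], "S"),
    (["a"], "S"),
]


def shift_reduce(tokens):
    """Return (trace, accepted); trace rows are (stack, remaining input, action)."""
    stack, rest, trace = [], list(tokens), []
    while True:
        for body, head in RULES:
            if stack[-len(body):] == body:  # a handle is on top of the stack
                stack[-len(body):] = [head]
                action = f"reduce {head} -> {' '.join(body)}"
                trace.append((" ".join(stack), " ".join(rest), action))
                break
        else:
            if not rest:
                break  # nothing to reduce and nothing left to shift
            stack.append(rest.pop(0))
            trace.append((" ".join(stack), " ".join(rest), "shift"))
    return trace, stack == ["S"]


trace, accepted = shift_reduce(list("a-(a+a)"))
for row in trace:
    print(row)
print("accepted:", accepted)
```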


There are two main categories of shift reduce parsing as follows:


1. Operator-Precedence Parsing
2. LR-Parser
Operator precedence parsing
Operator precedence parsing is a kind of shift reduce parsing. It is applied to a
small class of operator grammars.
A grammar is said to be an operator precedence grammar if it has two properties:
o No R.H.S. of any production is ε.
o No two non-terminals are adjacent.
Operator precedence can only be established between the terminals of the grammar. It ignores the
non-terminals.
There are the three operator precedence relations:
a ⋗ b means that terminal "a" has the higher precedence than terminal "b".
a ⋖ b means that terminal "a" has the lower precedence than terminal "b".
a ≐ b means that the terminal "a" and "b" both have same precedence.
Precedence table:


Parsing Action
o Add the $ symbol at both ends of the given input string.
o Now scan the input string from left to right until the first ⋗ is encountered.
o Scan towards the left over all the equal precedences until the first ⋖ is encountered.
o Everything between that ⋖ and the ⋗ is a handle.
o $ on $ means parsing is successful.
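The parsing action above can be sketched with a table of relations between terminals. The table used here is an assumption: it encodes the usual relations for id, + and * (with * binding tighter than + and both left associative), written as "<" for ⋖ and ">" for ⋗. The stack holds terminals only, since non-terminals are ignored, so the sketch reports the order in which handles are reduced rather than building a tree.

```python
PREC = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}


def op_precedence_parse(tokens):
    """Return the terminals in the order their handles are reduced, or None on error."""
    stack = ["$"]
    tokens = list(tokens) + ["$"]
    pos, handles = 0, []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "$" and look == "$":
            return handles  # $ on $: success
        relation = PREC.get((top, look))
        if relation in ("<", "="):
            stack.append(look)  # shift
            pos += 1
        elif relation == ">":
            # Reduce: pop back to the nearest terminal related by < to the one popped.
            popped = stack.pop()
            while PREC.get((stack[-1], popped)) != "<":
                popped = stack.pop()
            handles.append(popped)
        else:
            return None  # no relation defined: syntax error


print(op_precedence_parse(["id", "+", "id", "*", "id"]))
```

For id + id * id the handle containing * is reduced before the one containing +, which reflects the higher precedence of *.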
Example
Grammar:
E → E + T | T
T → T * F | F
F → id
Given string:
w = id + id * id
Let us consider a parse tree for it as follows:

On the basis of above tree, we can design following operator precedence table:

Now let us process the string with the help of the above precedence table:


LR Parser
LR parsing is one type of bottom up parsing. It is used to parse a large class of
grammars.
In LR(k) parsing, "L" stands for left-to-right scanning of the input.
"R" stands for constructing a rightmost derivation in reverse.
"k" is the number of lookahead input symbols used to make parsing
decisions.
LR parsing is divided into four parts: LR (0) parsing, SLR parsing, CLR parsing and
LALR parsing.

LR algorithm:
The LR algorithm requires a stack, input, output and a parsing table. In all types of LR parsing, the input,
output and stack are the same but the parsing table is different.


Fig: Block diagram of LR parser


The input buffer contains the string to be parsed followed by a $ symbol, which indicates
the end of input.
The stack contains a sequence of grammar symbols with a $ at the bottom of the
stack.
The parsing table is a two dimensional array. It contains two parts: the Action part and the Go To
part.
LR (0) Parsing
Various steps involved in the LR (0) Parsing:
o For the given input string write a context free grammar.
o Check the ambiguity of the grammar.
o Add the augment production to the given grammar.
o Create the canonical collection of LR (0) items.
o Draw the deterministic finite automaton (DFA).
o Construct the LR (0) parsing table.
Augment Grammar
The augmented grammar G` is generated by adding one more production, S` → S, to the given
grammar G. It helps the parser to identify when to stop the parsing and announce the acceptance
of the input.
Example
Given grammar
S → AA
A → aA | b
The Augment grammar G` is represented by
S`→ S
S → AA
A → aA | b


Collection of LR(0) items


An LR (0) item is a production of G with a dot at some position on the right side of the
production.
LR (0) items indicate how much of the input has been scanned up to a
given point in the process of parsing.
In the LR (0) table, we place the reduce move in the entire row.
Example
Given grammar:
S → AA
A → aA | b
Add Augment Production and insert '•' symbol at the first position for every production in G
S` → •S
S → •AA
A → •aA
A → •b
I0 State:
Add Augment production to the I0 State and Compute the Closure
I0 = Closure (S` → •S)
Add all productions starting with S in to I0 State because "•" is followed by the non-terminal. So,
the I0 State becomes
I0 = S` → •S
S → •AA
Add all productions starting with "A" in modified I0 State because "•" is followed by the non-
terminal. So, the I0 State becomes.
I0= S` → •S
S → •AA
A → •aA
A → •b
I1= Go to (I0, S) = closure (S` → S•) = S` → S•
Here, the Production is reduced so close the State.
I1= S` → S•
I2= Go to (I0, A) = closure (S → A•A)
Add all productions starting with A in to I2 State because "•" is followed by the non-terminal.
So, the I2 State becomes
I2 =S→A•A
A → •aA
A → •b
Go to (I2,a) = Closure (A → a•A) = (same as I3)
Go to (I2, b) = Closure (A → b•) = (same as I4)
I3= Go to (I0,a) = Closure (A → a•A)
Add productions starting with A in I3.
A → a•A
A → •aA
A → •b

Go to (I3, a) = Closure (A → a•A) = (same as I3)
Go to (I3, b) = Closure (A → b•) = (same as I4)
I4= Go to (I0, b) = closure (A → b•) = A → b•
I5= Go to (I2, A) = Closure (S → AA•) = S → AA•
I6= Go to (I3, A) = Closure (A → aA•) = A → aA•
Drawing DFA:
The DFA contains the 7 states I0 to I6.
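The closure and go to computations above can be automated. The following is a minimal sketch for the augmented grammar S` → S, S → AA, A → aA | b, representing an LR (0) item as a pair (production index, dot position). The function names and the representation are assumptions for illustration; the state numbers it produces happen to match I0 to I6 above.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """Add X -> .gamma for every non-terminal X that appears right after a dot."""
    result, work = set(items), list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for k, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (k, 0) not in result:
                    result.add((k, 0))
                    work.append((k, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over symbol in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def canonical_collection():
    states, transitions = [closure({(0, 0)})], {}
    for state in states:  # the list grows while it is being scanned
        symbols = []
        for p, d in sorted(state):
            body = GRAMMAR[p][1]
            if d < len(body) and body[d] not in symbols:
                symbols.append(body[d])
        for sym in symbols:
            target = goto(state, sym)
            if target not in states:
                states.append(target)
            transitions[(states.index(state), sym)] = states.index(target)
    return states, transitions


states, transitions = canonical_collection()
print(len(states), "states")
for (source, symbol), target in sorted(transitions.items()):
    print(f"I{source} --{symbol}--> I{target}")
```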

LR(0) Table
o If a state goes to some other state on a terminal then it corresponds to a shift move.
o If a state goes to some other state on a variable then it corresponds to a go to move.
o If a state contains a final item then write the reduce move in its entire
row.

Explanation:
o I0 on S is going to I1 so write it as 1.
o I0 on A is going to I2 so write it as 2.
o I2 on A is going to I5 so write it as 5.

o I3 on A is going to I6 so write it as 6.
o I0, I2 and I3 on a are going to I3 so write it as S3 which means shift 3.
o I0, I2 and I3 on b are going to I4 so write it as S4 which means shift 4.
o I4, I5 and I6 all contain a final item because they contain • at the right most end.
So write the reduce move using the production number.
Productions are numbered as follows:
S → AA ... (1)
A → aA ... (2)
A → b ... (3)
o I1 contains the final item S` → S•, so action {I1, $} = Accept.
o I4 contains the final item A → b• and that production corresponds to
production number 3, so write r3 in the entire row.
o I5 contains the final item S → AA• and that production corresponds to
production number 1, so write r1 in the entire row.
o I6 contains the final item A → aA• and that production corresponds to
production number 2, so write r2 in the entire row.
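The table described by these rules can be driven by a short loop. This is a sketch under stated assumptions: the ACTION and GOTO entries are transcribed from the explanation above (shift entries S3 and S4, reduce entries r1 to r3 in entire rows, accept in state 1), the stack holds state numbers only, and the function name `lr_parse` is hypothetical.

```python
# Production number -> (left side, length of right side)
PRODUCTIONS = {1: ("S", 2), 2: ("A", 2), 3: ("A", 1)}

ACTION = {
    (0, "a"): "s3", (0, "b"): "s4",
    (1, "$"): "acc",
    (2, "a"): "s3", (2, "b"): "s4",
    (3, "a"): "s3", (3, "b"): "s4",
}
# LR(0): a reduce move fills the entire row of its state.
for state, rule in ((4, "r3"), (5, "r1"), (6, "r2")):
    for terminal in "ab$":
        ACTION[(state, terminal)] = rule

GOTO = {(0, "S"): 1, (0, "A"): 2, (2, "A"): 5, (3, "A"): 6}


def lr_parse(text):
    """Return the list of production numbers used, or None if text is rejected."""
    tokens = list(text) + ["$"]
    stack, pos, reductions = [0], 0, []
    while True:
        action = ACTION.get((stack[-1], tokens[pos]))
        if action is None:
            return None  # blank table entry: syntax error
        if action == "acc":
            return reductions
        if action[0] == "s":
            stack.append(int(action[1:]))  # shift: push the new state
            pos += 1
        else:
            number = int(action[1:])
            head, length = PRODUCTIONS[number]
            del stack[-length:]  # pop one state per right side symbol
            stack.append(GOTO[(stack[-1], head)])
            reductions.append(number)


print(lr_parse("abb"))
```

The returned numbers are the productions of a rightmost derivation in reverse order.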

SLR (1) Parsing


SLR (1) refers to simple LR parsing. It is the same as LR (0) parsing; the only difference is
in the parsing table. To construct the SLR (1) parsing table, we use the canonical collection of LR (0)
items.
In SLR (1) parsing, we place the reduce move only under the terminals in the FOLLOW set of the left hand side.
Various steps involved in the SLR (1) Parsing:
o For the given input string write a context free grammar
o Check the ambiguity of the grammar
o Add the augment production to the given grammar
o Create the canonical collection of LR (0) items
o Draw the deterministic finite automaton (DFA)
o Construct the SLR (1) parsing table
SLR (1) Table Construction
The steps used to construct the SLR (1) table are given below:
If a state (Ii) goes to some other state (Ij) on a terminal then it corresponds to a shift
move in the action part.
If a state (Ii) goes to some other state (Ij) on a variable then it corresponds to a go to
move in the Go to part.
If a state (Ii) contains a final item like A → ab•, which has no transition to a next
state, then the production is known as a reduce production. For all terminals X in FOLLOW (A),
write the reduce entry along with the production number.
Example
S → •Aa
A → αβ•
Follow (S) = {$}
Follow (A) = {a}

SLR ( 1 ) Grammar
S→E
E → E + T | T
T → T * F | F
F → id
Add Augment Production and insert '•' symbol at the first position for every production in G
S` → •E
E → •E + T
E → •T
T → •T * F
T → •F
F → •id
I0 State:
Add Augment production to the I0 State and Compute the Closure
I0 = Closure (S` → •E)
Add all productions starting with E in to I0 State because "." is followed by the non-terminal. So,
the I0 State becomes
I0 = S` → •E
E → •E + T
E → •T
Add all productions starting with T and F in modified I0 State because "." is followed by the
non-terminal. So, the I0 State becomes.
I0= S` → •E
E → •E + T
E → •T
T → •T * F
T → •F
F → •id
I1= Go to (I0, E) = closure (S` → E•, E → E• + T)
I2= Go to (I0, T) = closure (E → T•, T → T• * F)
I3= Go to (I0, F) = Closure ( T → F• ) = T → F•
I4= Go to (I0, id) = closure ( F → id•) = F → id•
I5= Go to (I1, +) = Closure (E → E +•T)
Add all productions starting with T and F in I5 State because "." is followed by the non-terminal.
So, the I5 State becomes
I5 = E → E +•T
T → •T * F
T → •F
F → •id
Go to (I5, F) = Closure (T → F•) = (same as I3)
Go to (I5, id) = Closure (F → id•) = (same as I4)
I6= Go to (I2, *) = Closure (T → T * •F)


Add all productions starting with F in I6 State because "." is followed by the non-terminal. So,
the I6 State becomes
I6 = T → T * •F
F → •id
Go to (I6, id) = Closure (F → id•) = (same as I4)
I7= Go to (I5, T) = Closure (E → E + T•, T → T• * F)
Go to (I7, *) = Closure (T → T * •F) = (same as I6)
I8= Go to (I6, F) = Closure (T → T * F•) = T → T * F•
Drawing DFA:

SLR (1) Table

Explanation:
First (E) = First (E + T) ∪ First (T)
First (T) = First (T * F) ∪ First (F)
First (F) = {id}

First (T) = {id}
First (E) = {id}
Follow (E) = First (+T) ∪ {$} = {+, $}
Follow (T) = First (*F) ∪ Follow (E)
= {*, +, $}
Follow (F) = Follow (T) = {*, +, $}
The productions are numbered E → E + T (1), E → T (2), T → T * F (3), T → F (4), F → id (5).
o I1 contains the final item S` → E• and Follow (S`) = {$}, so action {I1, $} =
Accept
o I2 contains the final item E → T• and Follow (E) = {+, $}, so action {I2, +}
= R2, action {I2, $} = R2
o I3 contains the final item T → F• and Follow (T) = {+, *, $}, so action {I3,
+} = R4, action {I3, *} = R4, action {I3, $} = R4
o I4 contains the final item F → id• and Follow (F) = {+, *, $}, so action {I4,
+} = R5, action {I4, *} = R5, action {I4, $} = R5
o I7 contains the final item E → E + T• and Follow (E) = {+, $}, so action {I7,
+} = R1, action {I7, $} = R1
o I8 contains the final item T → T * F• and Follow (T) = {+, *, $}, so action
{I8, +} = R3, action {I8, *} = R3, action {I8, $} = R3.

CLR (1) Parsing


CLR refers to canonical LR. CLR parsing uses the canonical collection of LR (1)
items to build the CLR (1) parsing table. The CLR (1) parsing table has more
states than the SLR (1) parsing table.
In CLR (1), we place the reduce move only under the lookahead symbols.
Various steps involved in the CLR (1) Parsing:
o For the given input string write a context free grammar
o Check the ambiguity of the grammar
o Add the augment production to the given grammar
o Create the canonical collection of LR (1) items
o Draw the deterministic finite automaton (DFA)
o Construct the CLR (1) parsing table
LR (1) item
An LR (1) item is an LR (0) item together with a lookahead symbol.
LR (1) item = LR (0) item + look ahead
The lookahead determines where the reduce move for a final item is placed.
The augmented production always has $ as its lookahead.
Example
CLR ( 1 ) Grammar
S → AA
A → aA
A→b


Add Augment Production, insert '•' symbol at the first position for every production in G and
also add the lookahead.
S` → •S, $
S → •AA, $
A → •aA, a/b
A → •b, a/b
I0 State:
Add Augment production to the I0 State and Compute the Closure
I0 = Closure (S` → •S, $)
Add all productions starting with S in to I0 State because "." is followed by the non-terminal. So,
the I0 State becomes
I0 = S` → •S, $
S → •AA, $
Add all productions starting with A in modified I0 State because "." is followed by the non-
terminal. So, the I0 State becomes.
I0= S` → •S, $
S → •AA, $
A → •aA, a/b
A → •b, a/b
I1= Go to (I0, S) = closure (S` → S•, $) = S` → S•, $
I2= Go to (I0, A) = closure ( S → A•A, $ )
Add all productions starting with A in I2 State because "." is followed by the non-terminal. So,
the I2 State becomes
I2= S → A•A, $
A → •aA, $
A → •b, $
I3= Go to (I0, a) = Closure ( A → a•A, a/b )
Add all productions starting with A in I3 State because "." is followed by the non-terminal. So,
the I3 State becomes
I3= A → a•A, a/b
A → •aA, a/b
A → •b, a/b
Go to (I3, a) = Closure (A → a•A, a/b) = (same as I3)
Go to (I3, b) = Closure (A → b•, a/b) = (same as I4)
I4= Go to (I0, b) = closure ( A → b•, a/b) = A → b•, a/b
I5= Go to (I2, A) = Closure (S → AA•, $) =S → AA•, $
I6= Go to (I2, a) = Closure (A → a•A, $)
Add all productions starting with A in I6 State because "." is followed by the non-terminal. So,
the I6 State becomes


I6 = A → a•A, $
A → •aA, $
A → •b, $
Go to (I6, a) = Closure (A → a•A, $) = (same as I6)
Go to (I6, b) = Closure (A → b•, $) = (same as I7)
I7= Go to (I2, b) = Closure (A → b•, $) = A → b•, $
I8= Go to (I3, A) = Closure (A → aA•, a/b) = A → aA•, a/b
I9= Go to (I6, A) = Closure (A → aA•, $) = A → aA•, $
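The LR (1) collection can be computed in the same way as the LR (0) collection, with a lookahead attached to every item. The sketch below represents an item as (production index, dot position, lookahead) and assumes the grammar has no ε productions, which keeps the FIRST computation short. The function names are hypothetical, and the states are discovered in a different order than I0 to I9 above, so only their number and contents are comparable.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]
NONTERMINALS = {head for head, _ in GRAMMAR}


def first(symbol, seen=frozenset()):
    """FIRST of one symbol; assumes there are no epsilon productions."""
    if symbol not in NONTERMINALS:
        return {symbol}
    seen = seen | {symbol}
    out = set()
    for head, body in GRAMMAR:
        if head == symbol and body[0] not in seen:
            out |= first(body[0], seen)
    return out


def closure(items):
    result, work = set(items), list(items)
    while work:
        prod, dot, look = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            # New lookaheads: FIRST of what follows the non-terminal, else the old one.
            lookaheads = first(body[dot + 1]) if dot + 1 < len(body) else {look}
            for k, (head, _) in enumerate(GRAMMAR):
                if head != body[dot]:
                    continue
                for la in lookaheads:
                    if (k, 0, la) not in result:
                        result.add((k, 0, la))
                        work.append((k, 0, la))
    return frozenset(result)


def goto(items, symbol):
    return closure({(p, d + 1, la) for p, d, la in items
                    if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol})


def canonical_collection():
    states = [closure({(0, 0, "$")})]
    for state in states:  # the list grows while it is being scanned
        symbols = sorted({GRAMMAR[p][1][d] for p, d, _ in state
                          if d < len(GRAMMAR[p][1])})
        for sym in symbols:
            target = goto(state, sym)
            if target not in states:
                states.append(target)
    return states


states = canonical_collection()
print(len(states), "LR(1) states")
```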
Drawing DFA:

CLR (1) Parsing table:


Productions are numbered as follows:


S → AA ... (1)
A → aA ....(2)
A → b ... (3)
The placement of shift moves in the CLR (1) parsing table is the same as in the SLR (1) parsing
table. The only difference is in the placement of reduce moves.
I4 contains the final item ( A → b•, a/b), so action {I4, a} = R3, action {I4, b} =
R3.
I5 contains the final item ( S → AA•, $), so action {I5, $} = R1.
I7 contains the final item ( A → b•, $), so action {I7, $} = R3.
I8 contains the final item ( A → aA•, a/b), so action {I8, a} = R2, action {I8, b} =
R2.
I9 contains the final item ( A → aA•, $), so action {I9, $} = R2.

LALR (1) Parsing:


LALR refers to lookahead LR. To construct the LALR (1) parsing table, we use the
canonical collection of LR (1) items.
In LALR (1) parsing, the sets of LR (1) items which have the same productions (the same
LR (0) items) but different lookaheads are combined to form a single set of items.
LALR (1) parsing is the same as CLR (1) parsing; the only difference is in the parsing table.
Example
LALR ( 1 ) Grammar
S → AA
A → aA
A→b
Add Augment Production, insert '•' symbol at the first position for every production in G and
also add the look ahead.
S` → •S, $
S → •AA, $
A → •aA, a/b
A → •b, a/b
I0 State:
Add Augment production to the I0 State and Compute the Closure
I0 = Closure (S` → •S, $)
Add all productions starting with S in to I0 State because "•" is followed by the non-terminal. So,
the I0 State becomes
I0 = S` → •S, $
S → •AA, $
Add all productions starting with A in modified I0 State because "•" is followed by the non-
terminal. So, the I0 State becomes.
I0= S` → •S, $
S → •AA, $
A → •aA, a/b
A → •b, a/b


I1= Go to (I0, S) = closure (S` → S•, $) = S` → S•, $


I2= Go to (I0, A) = closure ( S → A•A, $ )
Add all productions starting with A in I2 State because "•" is followed by the non-terminal. So,
the I2 State becomes
I2= S → A•A, $
A → •aA, $
A → •b, $
I3= Go to (I0, a) = Closure ( A → a•A, a/b )
Add all productions starting with A in I3 State because "•" is followed by the non-terminal. So,
the I3 State becomes
I3= A → a•A, a/b
A → •aA, a/b
A → •b, a/b
Go to (I3, a) = Closure (A → a•A, a/b) = (same as I3)
Go to (I3, b) = Closure (A → b•, a/b) = (same as I4)
I4= Go to (I0, b) = closure ( A → b•, a/b) = A → b•, a/b
I5= Go to (I2, A) = Closure (S → AA•, $) =S → AA•, $
I6= Go to (I2, a) = Closure (A → a•A, $)
Add all productions starting with A in I6 State because "•" is followed by the non-terminal. So,
the I6 State becomes
I6 = A → a•A, $
A → •aA, $
A → •b, $
Go to (I6, a) = Closure (A → a•A, $) = (same as I6)
Go to (I6, b) = Closure (A → b•, $) = (same as I7)
I7= Go to (I2, b) = Closure (A → b•, $) = A → b•, $
I8= Go to (I3, A) = Closure (A → aA•, a/b) = A → aA•, a/b
I9= Go to (I6, A) = Closure (A → aA•, $) = A → aA•, $
On analysis, the LR (0) items of I3 and I6 are the same; they differ only in their lookahead.
I3 = { A → a•A, a/b
A → •aA, a/b
A → •b, a/b
}
I6= { A → a•A, $
A → •aA, $
A → •b, $
}
Clearly I3 and I6 are the same in their LR (0) items but differ in their lookahead, so we can combine
them and call the result I36.
I36 = { A → a•A, a/b/$
A → •aA, a/b/$
A → •b, a/b/$
}
I4 and I7 are the same but differ only in their lookahead, so we can combine them and
call the result I47.
I47 = {A → b•, a/b/$}
I8 and I9 are the same but differ only in their lookahead, so we can combine them and
call the result I89.
I89 = {A → aA•, a/b/$}
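The merging step can be expressed as grouping LR (1) states by their LR (0) items. The sketch below copies the ten states from the example into a dictionary, writing each item as a plain string with "." for the dot, and then merges the states whose items agree once lookaheads are ignored. The helper names `items` and `merge_by_core` are hypothetical.

```python
from collections import defaultdict


def items(cores, lookaheads):
    """Pair every LR(0) item in cores with every lookahead character."""
    return {(core, la) for core in cores for la in lookaheads}


LR1_STATES = {
    "I0": items(["S'->.S", "S->.AA"], "$") | items(["A->.aA", "A->.b"], "ab"),
    "I1": items(["S'->S."], "$"),
    "I2": items(["S->A.A", "A->.aA", "A->.b"], "$"),
    "I3": items(["A->a.A", "A->.aA", "A->.b"], "ab"),
    "I4": items(["A->b."], "ab"),
    "I5": items(["S->AA."], "$"),
    "I6": items(["A->a.A", "A->.aA", "A->.b"], "$"),
    "I7": items(["A->b."], "$"),
    "I8": items(["A->aA."], "ab"),
    "I9": items(["A->aA."], "$"),
}


def merge_by_core(states):
    """Union the LR(1) states that share the same set of LR(0) items."""
    groups = defaultdict(list)
    for name, state in states.items():
        core = frozenset(item for item, _ in state)  # drop the lookaheads
        groups[core].append(name)
    merged = {}
    for names in groups.values():
        new_name = "I" + "".join(name[1:] for name in names)  # I3 + I6 -> I36
        merged[new_name] = set().union(*(states[name] for name in names))
    return merged


lalr = merge_by_core(LR1_STATES)
print(sorted(lalr))
```

The ten CLR (1) states collapse to seven LALR (1) states, the same number of states as the LR (0) collection for this grammar.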
Drawing DFA:

LALR (1) Parsing table:
