CD ch2
LEXICAL ANALYSIS- INTRODUCTION
[Figure: the lexical analyzer and its link to the symbol table]
ROLES OF THE LEXICAL ANALYSER
TOKENS, LEXEMES AND PATTERNS
Pattern: a rule that describes the set of input strings (lexemes) for which the same token is produced as output. The pattern is associated with that token.

Token     Lexeme (element of a kind)   Pattern
ID        x  y  n_0                    letter followed by letters and digits
NUM       -123  1.456e-5               any numeric constant
IF        if                           if
LPAREN    (                            (
LITERAL   ``Hello''                    any string of characters (except ``) between `` and ``
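Concatenation:
L1L2 = { s1s2 | s1 ∈ L1 and s2 ∈ L2 }
Union:
L1 ∪ L2 = { s | s ∈ L1 or s ∈ L2 }
Exponentiation:
L^0 = {ε}    L^1 = L    L^2 = LL
Kleene Closure:
L* = L^0 ∪ L^1 ∪ L^2 ∪ …   (the union of L^i for all i ≥ 0)
Positive Closure:
L+ = L^1 ∪ L^2 ∪ L^3 ∪ …   (the union of L^i for all i ≥ 1)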
EXAMPLE
L1 = {a,b,c,d} L2 = {1,2}
L1L2 = {a1,a2,b1,b2,c1,c2,d1,d2}
L1 ∪ L2 = {a,b,c,d,1,2}
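The language operations in the example above can be tried out on small finite sets. The following is a minimal sketch in Python; the helper names `concat`, `power` and `closure_up_to` are illustrative only, and because L* is infinite the closure is only approximated up to a chosen power.

```python
def concat(l1, l2):
    """Concatenation: every string of l1 followed by every string of l2."""
    return {s1 + s2 for s1 in l1 for s2 in l2}

def power(lang, n):
    """L^n: L concatenated with itself n times; L^0 is {""} (the empty string)."""
    result = {""}
    for _ in range(n):
        result = concat(result, lang)
    return result

def closure_up_to(lang, max_power, positive=False):
    """Finite approximation of L* (or L+ when positive=True), up to L^max_power."""
    start = 1 if positive else 0
    result = set()
    for i in range(start, max_power + 1):
        result |= power(lang, i)
    return result

L1 = {"a", "b", "c", "d"}
L2 = {"1", "2"}
print(sorted(concat(L1, L2)))  # ['a1', 'a2', 'b1', 'b2', 'c1', 'c2', 'd1', 'd2']
print(sorted(L1 | L2))         # ['1', '2', 'a', 'b', 'c', 'd']
```

The only difference between the two closures in this sketch is whether L^0, and therefore the empty string, is included.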
(r)+ = (r)(r)*
(r)? = (r) | ε
REGULAR EXPRESSIONS (CONT.)
We may remove parentheses by using precedence rules.
* highest
concatenation next
| lowest
ab*|c means (a(b)*)|(c)
Ex:
Σ = {0,1}
0|1 => {0,1}
(0|1)(0|1) => {00,01,10,11}
0* => {ε,0,00,000,0000,....}
(0|1)* => all strings of 0s and 1s, including the empty string
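The precedence rules and the small examples above can be checked with Python's standard `re` module, whose `*`, concatenation and `|` follow the same precedence order. This is only a sketch: the helper `language` is a made-up name, and it enumerates strings up to a length bound instead of producing the full (possibly infinite) language.

```python
import re
from itertools import product

def language(regex, alphabet, max_len):
    """All strings over `alphabet` of length <= max_len that the regex matches in full."""
    pattern = re.compile(regex)
    words = ("".join(p) for n in range(max_len + 1)
             for p in product(alphabet, repeat=n))
    return [w for w in words if pattern.fullmatch(w)]

print(language(r"(0|1)(0|1)", "01", 3))  # ['00', '01', '10', '11']
print(language(r"0*", "01", 3))          # ['', '0', '00', '000']
print(language(r"ab*|c", "abc", 2))      # ['a', 'c', 'ab']
```

The last line illustrates that ab*|c is read as (a(b)*)|(c): the string "ac" is not in the result.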
REGULAR DEFINITIONS
Writing a regular expression for some languages can be difficult,
because the expression can become quite complex. In those cases,
we may use regular definitions.
We give names to regular expressions and then use these names
as symbols to define other regular expressions.
r+ = rr*
r? = r│ε
[a-z] = a │ b │ c │ … │ z
Examples:
digit → [0-9]
digits → digit+
optional_fraction → (. digits)?
optional_exponent → ( E (+ │ -)? digit+ )?
num → digits optional_fraction optional_exponent
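One way to see how named definitions compose is to build the `num` pattern from string fragments. The sketch below mirrors the definitions above using Python's `re` syntax; the variable names follow the slide, while `is_num` is a hypothetical helper added for illustration.

```python
import re

# Each regular definition becomes a named regex fragment.
digit = r"[0-9]"
digits = rf"{digit}+"
optional_fraction = rf"(\.{digits})?"
optional_exponent = rf"(E[+-]?{digit}+)?"
num = rf"{digits}{optional_fraction}{optional_exponent}"

NUM = re.compile(num)

def is_num(text):
    """True if the whole text matches the `num` definition."""
    return NUM.fullmatch(text) is not None

for lexeme in ["123", "1.456E-5", "6E23", "1.", "E5"]:
    print(lexeme, is_num(lexeme))
```

As written, the definition accepts only an uppercase E and no leading sign, so lexemes such as -123 or 1.456e-5 would need an extended definition.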
RECOGNITION OF TOKENS
e.g. Grammar
stmt → if expr then stmt
     │ if expr then stmt else stmt
     │ ε
expr → term relop term
     │ term
term → id
     │ num
Regular Definitions
if → if
then → then
else → else
relop → < │ <= │ = │ <> │ > │ >=
id → letter (letter │ digit)*
num → digits optional_fraction optional_exponent
Assumptions
delim → blank │ tab │ newline
TRANSITION DIAGRAMS
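relop → < │ <= │ <> │ > │ >= │ =
Transition diagram for relop (start state 0):
0 --<--> 1
1 --=--> 2            return(relop, LE)
1 -->--> 3            return(relop, NE)
1 --other--> 4 *      return(relop, LT)
0 --=--> 5            return(relop, EQ)
0 -->--> 6
6 --=--> 7            return(relop, GE)
6 --other--> 8 *      return(relop, GT)
* marks states where the input is retracted by one character, because the "other" character is not part of the lexeme.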
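The relop transition diagram can be turned into code almost state by state. The following Python sketch is one possible hand-coded version, not the only way to implement it; the function name `next_relop` and the convention of returning the token together with the new input position are assumptions of this example.

```python
def next_relop(text, pos=0):
    """Follow the relop transition diagram starting at text[pos].

    Returns (("relop", attribute), new_pos), or None if no relop starts at pos.
    """
    def peek(i):
        # "" stands for "other" when the input is exhausted
        return text[i] if i < len(text) else ""

    c = peek(pos)
    if c == "<":                                  # state 0 -> state 1
        nxt = peek(pos + 1)
        if nxt == "=":
            return ("relop", "LE"), pos + 2       # state 2
        if nxt == ">":
            return ("relop", "NE"), pos + 2       # state 3
        return ("relop", "LT"), pos + 1           # state 4*: lookahead not consumed
    if c == "=":
        return ("relop", "EQ"), pos + 1           # state 5
    if c == ">":                                  # state 0 -> state 6
        if peek(pos + 1) == "=":
            return ("relop", "GE"), pos + 2       # state 7
        return ("relop", "GT"), pos + 1           # state 8*: lookahead not consumed
    return None

print(next_relop("<= 10"))  # (('relop', 'LE'), 2)
print(next_relop("<x"))     # (('relop', 'LT'), 1)
```

In the starred states the sketch simply does not advance past the lookahead character, which has the same effect as reading it and then retracting.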
id → letter ( letter │ digit )*
[Figure: transition diagram for id, with a self loop labelled "letter or digit"]
lex
lex source program (lex.l) → lex or flex compiler → lex.yy.c
lex.yy.c → C compiler → a.out
input stream → a.out → sequence of tokens
LEX SPECIFICATION
[Figure: m/c for a+, with states q0 and qf and a transition on a]
CONCATENATION OPERATION
Concatenation means joining (a.b)
Important Note: a.b ≠ b.a, i.e. the order of joining changes the design of the automaton.
m/c for a:    q0 --a--> qf
m/c for b:    q0 --b--> qf
m/c for a.b:  q0 --a--> q1 --b--> qf
m/c for b.a:  q0 --b--> q1 --a--> qf
OR OPERATION
m/c for a:    q0 --a--> qf
m/c for b:    q0 --b--> qf
m/c for a/b:  q0 --a--> qf
              q0 --b--> qf
SECTION 1.2
INTRODUCTION TO FINITE AUTOMATA
FINITE AUTOMATA
Automata means machines (singular: automaton).
A finite automaton is a 5-tuple:
M = (Q, Σ, δ, q0, F)
Q   A finite set of states
Σ   A finite input alphabet
δ   A transition function
q0  The initial/starting state; q0 is in Q
F   A set of final/accepting states, which is a subset of Q
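The 5-tuple maps directly onto a small data structure. The sketch below simulates a deterministic finite automaton in Python; the example machine (strings over {a, b} that end in "ab") and the function name `accepts` are made up for illustration and do not come from the slides.

```python
def accepts(machine, word):
    """Run the automaton M = (Q, sigma, delta, q0, F) on word."""
    Q, sigma, delta, q0, F = machine
    state = q0
    for ch in word:
        if ch not in sigma:
            return False            # symbol outside the input alphabet
        state = delta[(state, ch)]  # exactly one move per (state, symbol)
    return state in F

# Hypothetical example: accepts strings over {a, b} that end in "ab".
Q = {"q0", "q1", "q2"}
SIGMA = {"a", "b"}
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q1", ("q2", "b"): "q0",
}
M = (Q, SIGMA, DELTA, "q0", {"q2"})

print(accepts(M, "bbaab"))  # True
print(accepts(M, "aba"))    # False
```

Storing δ as a dictionary keyed by (state, symbol) makes the determinism explicit: each key has a single target state.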
TYPES OF AUTOMATA
DETERMINISTIC FINITE AUTOMATA
Here Σ = {a, b} and every state has exactly one outgoing transition (O/P) on 'a' and one on 'b'. No state has more than one outgoing transition for a or for b.
[Figure: DFA with states q0, q1, q2 and qf over {a, b}]
NON-DETERMINISTIC FINITE AUTOMATA
Here state q0 has two moves on 'a', one to q1 and the other to q2; likewise, state q2 has two moves on 'b', one self loop to q1 and another to qf.
[Figure: NFA with states q0, q1, q2 and qf over {a, b}]
TYPES OF NFA
There are two types of NFA: NFA without ε-moves and NFA with ε-moves.
[Figure: example NFA with states q0, q1 and qf that includes an ε-move]
DIFFERENCE BETWEEN DFA AND NFA
Deterministic Finite Automata    Non-Deterministic Finite Automata
i) Star operation
ii) Concatenation
iii) OR operation
For each operation there is a defined rule to build an NFA with ε-moves.
Thompson’s Construction for Star Operation
NFA for a*: [Figure: NFA with the single state qf]
NFA for a* using Thompson’s Construction:
q0 --ε--> q1 --a--> q2 --ε--> qf
q2 --ε--> q1   (loop back, to repeat a)
q0 --ε--> qf   (bypass, for zero a’s)
Zero a’s:         q0→qf
Single a:         q0→q1→q2→qf
N number of a’s:  q0→q1→q2→q1→q2→qf, where q1→q2→q1 loops for N times and N varies from 2 to ∞
THOMPSON’S CONSTRUCTION FOR CONCATENATION OPERATION
NFA for a:    q0 --a--> qf
NFA for b:    q0 --b--> qf
NFA for a.b:  q0 --a--> q1 --b--> qf
THOMPSON’S CONSTRUCTION FOR OR OPERATION
NFA for a:    q0 --a--> qf
NFA for b:    q0 --b--> qf
NFA for a/b:  q0 --ε--> q1 --a--> q2 --ε--> qf
              q0 --ε--> q3 --b--> q4 --ε--> qf
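The three rules above (star, concatenation, OR) can be sketched as small functions that combine NFA fragments. This is an illustrative Python sketch rather than a definitive implementation: the fragment layout `(start, final, edges)`, the use of `None` as the ε label, and all function names are assumptions made for this example. Concatenation merges the final state of the first fragment with the start state of the second, as in the diagrams above.

```python
from itertools import count

EPS = None        # label used for ε-moves (a convention of this sketch)
_ids = count()    # source of fresh state numbers

def symbol(a):
    """NFA for a single symbol: start --a--> final."""
    s, f = next(_ids), next(_ids)
    return s, f, [(s, a, f)]

def concat(n1, n2):
    """Concatenation: the start of n2 is merged into the final state of n1."""
    s1, f1, e1 = n1
    s2, f2, e2 = n2
    ren = lambda q: f1 if q == s2 else q
    return s1, ren(f2), e1 + [(ren(p), a, ren(q)) for p, a, q in e2]

def union(n1, n2):
    """OR: a new start and final state joined to both branches by ε-moves."""
    (s1, f1, e1), (s2, f2, e2) = n1, n2
    s, f = next(_ids), next(_ids)
    return s, f, e1 + e2 + [(s, EPS, s1), (s, EPS, s2), (f1, EPS, f), (f2, EPS, f)]

def star(n):
    """Star: ε into and out of the fragment, plus loop-back and bypass ε-moves."""
    s1, f1, e1 = n
    s, f = next(_ids), next(_ids)
    return s, f, e1 + [(s, EPS, s1), (f1, EPS, f), (f1, EPS, s1), (s, EPS, f)]

def eps_closure(states, edges):
    """All states reachable from `states` using ε-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        p = stack.pop()
        for src, label, dst in edges:
            if src == p and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def matches(nfa, word):
    """Simulate the NFA on word by tracking the set of reachable states."""
    start, final, edges = nfa
    current = eps_closure({start}, edges)
    for ch in word:
        moved = {dst for src, label, dst in edges if src in current and label == ch}
        current = eps_closure(moved, edges)
    return final in current

# aa*b, built bottom-up from the three rules
nfa = concat(concat(symbol("a"), star(symbol("a"))), symbol("b"))
print(matches(nfa, "aaab"), matches(nfa, "b"))  # True False
```

With this merging convention the fragment for aa*b ends up with six states, the same count as a hand-drawn Thompson NFA for that expression.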
THOMPSON’S CONSTRUCTION FOR aa*b
Question 1
Thompson’s for a:   q0 --a--> qf
Thompson’s for b:   q0 --b--> qf
Thompson’s for a*:  q0 --ε--> q1 --a--> q2 --ε--> qf
                    q2 --ε--> q1
                    q0 --ε--> qf
NFA using Thompson’s Construction for aa*b:
q0 --a--> q1 --ε--> q2 --a--> q3 --ε--> q4 --b--> qf
q3 --ε--> q2
q1 --ε--> q4
NFA without Thompson’s:
q0 --a--> q1 --b--> qf
q1 --a--> q1
THOMPSON’S CONSTRUCTION FOR a*b(a/b)
Question 2
Thompson’s for a*:   q0 --ε--> q1 --a--> q2 --ε--> qf
                     q2 --ε--> q1
                     q0 --ε--> qf
Thompson’s for b:    q0 --b--> qf
Thompson’s for a/b:  q0 --ε--> q1 --a--> q2 --ε--> qf
                     q0 --ε--> q3 --b--> q4 --ε--> qf
NFA using Thompson’s Construction:
q0 --ε--> q1 --a--> q2 --ε--> q3 --b--> q4
q2 --ε--> q1
q0 --ε--> q3
q4 --ε--> q5 --a--> q6 --ε--> qf
q4 --ε--> q7 --b--> q8 --ε--> qf
NFA without Thompson’s:
q0 --a--> q0
q0 --b--> q1 --a,b--> qf
[Figure: Final Output, an NFA with ε-moves over states q0 to q8 and qf]
THOMPSON’S CONSTRUCTION FOR ab(a/b)*
Question 4
Thompson’s for a:    q0 --a--> qf
Thompson’s for b:    q0 --b--> qf
Thompson’s for a/b:  q0 --ε--> q1 --a--> q2 --ε--> qf
                     q0 --ε--> q3 --b--> q4 --ε--> qf
NFA using Thompson’s Construction:
q0 --a--> q1 --b--> q2 --ε--> q3
q3 --ε--> q4 --a--> q5 --ε--> q8
q3 --ε--> q6 --b--> q7 --ε--> q8
q8 --ε--> qf
q8 --ε--> q3   (loop back)
q2 --ε--> qf   (bypass)
NFA without Thompson’s:
q0 --a--> q1 --b--> qf
qf --a,b--> qf
➢ The first step is to take the ε-closure of the start state; e.g. if the start
state is 0, take ε-closure(0).
➢ Most important: the ε-closure of a state always includes that state itself,
i.e. ε-closure(n) includes n in its set of states.
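The ε-closure can be computed with a simple worklist search over ε-moves. The Python sketch below uses the ε-moves of the Thompson NFA for (a/b)*ab from the subset construction example (states 0 to 9); the dictionary layout and the function name `eps_closure` are assumptions made for illustration.

```python
# ε-moves of the NFA for (a/b)*ab (states 0..9); states without ε-moves are omitted.
EPS_MOVES = {
    0: {1, 7},
    1: {2, 4},
    3: {6},
    5: {6},
    6: {1, 7},
}

def eps_closure(state):
    """Set of states reachable from `state` by ε-moves alone, including itself."""
    seen = {state}          # the state itself is always part of its closure
    stack = [state]
    while stack:
        current = stack.pop()
        for nxt in EPS_MOVES.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

print(sorted(eps_closure(0)))  # [0, 1, 2, 4, 7]
print(sorted(eps_closure(3)))  # [1, 2, 3, 4, 6, 7]
print(sorted(eps_closure(8)))  # [8]
```

State 8 has no ε-moves, so its closure contains only the state itself.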
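SUBSET CONSTRUCTION FOR (a/b)*ab
NFA from Thompson’s construction (start state 0, final state 9):
0 --ε--> 1    0 --ε--> 7
1 --ε--> 2    1 --ε--> 4
2 --a--> 3    4 --b--> 5
3 --ε--> 6    5 --ε--> 6
6 --ε--> 1    6 --ε--> 7
7 --a--> 8    8 --b--> 9
Start with the start state: state 0
ε-closure(0) = {0,1,2,4,7} = A

State            a    b
A {0,1,2,4,7}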
(A, a) = ε-closure(3) ∪ ε-closure(8)
       = {1,2,3,4,6,7} ∪ {8}
       = {1,2,3,4,6,7,8} = B

State            a                    b
A {0,1,2,4,7}    B {1,2,3,4,6,7,8}
(A, b) = ({0,1,2,4,7}, b)
       = (0,b) ∪ (1,b) ∪ (2,b) ∪ (4,b) ∪ (7,b)
       = Φ ∪ Φ ∪ Φ ∪ {5} ∪ Φ
       = ε-closure(5)
       = {1,2,4,5,6,7} = C

State            a                    b
A {0,1,2,4,7}    B {1,2,3,4,6,7,8}    C {1,2,4,5,6,7}
(B, a) = ε-closure(3) ∪ ε-closure(8) = {1,2,3,4,6,7,8} = B
(B, b) = ε-closure(5) ∪ ε-closure(9) = {1,2,4,5,6,7,9} = D

State                a    b
B {1,2,3,4,6,7,8}    B    D {1,2,4,5,6,7,9}
(C, a) = B    (C, b) = C
(D, a) = B    (D, b) = C

Final Output
State                a    b
A {0,1,2,4,7}        B    C
B {1,2,3,4,6,7,8}    B    D
C {1,2,4,5,6,7}      B    C
D {1,2,4,5,6,7,9}    B    C
➢ Here state A is the start state, since set A contains state 0, which is the
start state of the NFA built with Thompson’s construction.
➢ D is the final state, since set D contains state 9, which is the final state
of the NFA built with Thompson’s construction.
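The whole subset construction for (a/b)*ab can be replayed mechanically. The Python sketch below is one possible worklist formulation under stated assumptions: the NFA is encoded as a dictionary keyed by (state, label), the string "ε" is used as the ε label, and DFA states are named A, B, C, ... in the order they are discovered.

```python
from collections import deque

EPS = "ε"
# Thompson NFA for (a/b)*ab: (state, label) -> set of target states
NFA = {
    (0, EPS): {1, 7}, (1, EPS): {2, 4}, (2, "a"): {3}, (3, EPS): {6},
    (4, "b"): {5}, (5, EPS): {6}, (6, EPS): {1, 7}, (7, "a"): {8}, (8, "b"): {9},
}
START, FINAL, ALPHABET = 0, 9, "ab"

def eps_closure(states):
    """ε-closure of a set of NFA states (always includes the states themselves)."""
    stack, seen = list(states), set(states)
    while stack:
        for q in NFA.get((stack.pop(), EPS), ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return frozenset(seen)

def move(states, symbol):
    """NFA states reachable from `states` on one `symbol` move."""
    return {q for s in states for q in NFA.get((s, symbol), ())}

def subset_construction():
    start = eps_closure({START})
    names = {start: "A"}                  # subset of NFA states -> DFA state name
    table, queue = {}, deque([start])
    while queue:
        current = queue.popleft()
        row = {}
        for symbol in ALPHABET:
            target = eps_closure(move(current, symbol))
            if target not in names:       # a new DFA state was discovered
                names[target] = chr(ord("A") + len(names))
                queue.append(target)
            row[symbol] = names[target]
        table[names[current]] = row
    accepting = {name for subset, name in names.items() if FINAL in subset}
    return names, table, accepting

names, table, accepting = subset_construction()
for state, row in table.items():
    print(state, row["a"], row["b"])
print("accepting:", accepting)
```

For this NFA every subset has a move on both a and b, so no empty (dead) subset appears; in general the empty set would show up as one more DFA state.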
Ε-CLOSURE(T)
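[Figure: syntax tree for (a|b)*abb#. The root is a chain of concatenation nodes; the closure node * sits above the alternation node |. Position numbers (for leaves): a = 1, b = 2, a = 3, b = 4, b = 5, # = 6.]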
FROM REGULAR EXPRESSION TO DFA DIRECTLY:
ANNOTATING THE TREE
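[Figure: syntax tree annotated with firstpos and lastpos; for the | node, firstpos = {1, 2} and lastpos = {1, 2}]
Node   followpos
1      {1, 2, 3}
2      {1, 2, 3}
3      {4}
4      {5}
5      {6}
6      -
Resulting DFA (start state {1,2,3}):
{1,2,3}   --a--> {1,2,3,4}    {1,2,3}   --b--> {1,2,3}
{1,2,3,4} --a--> {1,2,3,4}    {1,2,3,4} --b--> {1,2,3,5}
{1,2,3,5} --a--> {1,2,3,4}    {1,2,3,5} --b--> {1,2,3,6}
{1,2,3,6} --a--> {1,2,3,4}    {1,2,3,6} --b--> {1,2,3}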
DIFFERENT DFA’S FOR (a/b)*abb
DFA from subset construction (five states):
State   a   b
A       B   C
B       B   D
C       B   C
D       B   E
E       B   C
DFA built directly from the regular expression (four states):
State          a   b
A {1,2,3}      B   A
B {1,2,3,4}    B   C
C {1,2,3,5}    B   D
D {1,2,3,6}    B   A
FROM REGULAR EXPRESSION TO DFA DIRECTLY:
FOLLOWPOS
for each node n in the tree do
    if n is a cat-node with left child c1 and right child c2 then
        for each i in lastpos(c1) do
            followpos(i) := followpos(i) ∪ firstpos(c2)
        end do
    else if n is a star-node then
        for each i in lastpos(n) do
            followpos(i) := followpos(i) ∪ firstpos(n)
        end do
    end if
end do
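The followpos rules above can be exercised on the tree for (a|b)*abb#. The Python sketch below is an illustration under stated assumptions: the tree is written as nested tuples ("leaf", "or", "cat", "star"), and nullable, firstpos and lastpos are computed in the same recursive pass using the usual rules for those functions, which this section does not spell out.

```python
def analyze(node, followpos):
    """Return (nullable, firstpos, lastpos) for node and fill in followpos."""
    kind = node[0]
    if kind == "leaf":
        _, _symbol, pos = node
        followpos.setdefault(pos, set())
        return False, {pos}, {pos}
    if kind == "or":
        n1, f1, l1 = analyze(node[1], followpos)
        n2, f2, l2 = analyze(node[2], followpos)
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":
        n1, f1, l1 = analyze(node[1], followpos)
        n2, f2, l2 = analyze(node[2], followpos)
        for i in l1:                      # cat rule: lastpos(c1) is followed by firstpos(c2)
            followpos[i] |= f2
        first = f1 | f2 if n1 else f1
        last = l1 | l2 if n2 else l2
        return n1 and n2, first, last
    if kind == "star":
        _n, f, l = analyze(node[1], followpos)
        for i in l:                       # star rule: lastpos(n) is followed by firstpos(n)
            followpos[i] |= f
        return True, f, l
    raise ValueError(f"unknown node kind: {kind}")

def leaf(symbol, pos):
    return ("leaf", symbol, pos)

# (a|b)*abb# with leaf positions 1..6
tree = ("cat",
        ("cat",
         ("cat",
          ("cat", ("star", ("or", leaf("a", 1), leaf("b", 2))), leaf("a", 3)),
          leaf("b", 4)),
         leaf("b", 5)),
        leaf("#", 6))

followpos = {}
nullable, firstpos, lastpos = analyze(tree, followpos)
for pos in sorted(followpos):
    print(pos, sorted(followpos[pos]))
print("firstpos(root) =", sorted(firstpos))
```

The firstpos of the root is the start state of the directly constructed DFA, and the followpos sets drive its transitions.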
FROM REGULAR EXPRESSION TO DFA DIRECTLY:
ALGORITHM
[Figure: DFA with states A, B, C, D and E over {a, b}]
USING FINAL AND NON FINAL STATE
Divide the entire set of states into two subsets: the set of final
states and the set of non-final states.
State   a   b
→ A     B   C
  B     B   D
  C     B   C
  D     B   E
* E     B   C
(→ marks the start state, * marks the final state)
Set of non Final States (NF): {A,B,C,D}
Set of Final States (F): {E}
DFA MINIMIZATION USING PARTITIONING METHOD
Question 1
NF = {A,B,C,D}
F = {E}
Check O/P of all clubbed states (A,B,C,D) with Σ=b:
A --b--> C    B --b--> D    C --b--> C    D --b--> E
Split into two, since A, B and C go to states inside the non-final set
while state D goes to state E in the final set:
NF = ({A,B,C}, {D})
F = {E}
Check O/P of all clubbed states (A,B,C) with Σ=b:
A --b--> C    B --b--> D    C --b--> C
Split into two, since {A,C} goes to state C while B goes to
state D, which is already separated:
NF = ({A,C}, {B}, {D})
F = {E}
Check O/P of the clubbed states (A,C): both go to B on a and to C on b.
NO SPLIT
Final Output
State      a    b
→ {A,C}    B    {A,C}
  B        B    D
  D        B    E
* E        B    {A,C}
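The partitioning method used in Question 1 can be sketched as a loop that keeps splitting groups until nothing changes. The Python version below is illustrative only: the function name `minimize_partition` is made up, and unlike the slides it compares the target groups for all input symbols in one pass instead of checking one symbol at a time, which reaches the same final partition.

```python
def minimize_partition(states, alphabet, delta, finals):
    """Refine {non-final, final} until no group can be split any further."""
    partition = [g for g in (set(states) - set(finals), set(finals)) if g]
    while True:
        # index[s] = number of the group that currently contains state s
        index = {s: i for i, group in enumerate(partition) for s in group}
        refined = []
        for group in partition:
            buckets = {}
            for s in sorted(group):
                # states stay together only if they go to the same groups on every symbol
                key = tuple(index[delta[s][a]] for a in alphabet)
                buckets.setdefault(key, set()).add(s)
            refined.extend(buckets.values())
        if len(refined) == len(partition):
            return refined                # no split happened in this round
        partition = refined

# DFA of Question 1
DELTA = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
groups = minimize_partition(DELTA.keys(), "ab", DELTA, {"E"})
print(sorted(sorted(g) for g in groups))  # [['A', 'C'], ['B'], ['D'], ['E']]
```

Each group of the returned partition becomes one state of the minimized DFA, so A and C are merged while B, D and E stay separate.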
Question 2
MINIMIZE THE FOLLOWING DFA, IF POSSIBLE
[Figure: DFA with states A, B, C, D, E, F, G and H over {a, b}]
DFA MINIMIZATION USING PARTITIONING METHOD
Question 2
State   a   b
→ A     B   F
  B     G   C
* C     A   C
  D     C   G
  E     H   F
  F     C   G
  G     G   E
  H     G   C
NF = {A,B,D,E,F,G,H}
F = {C}
Check O/P of all clubbed states (A,B,D,E,F,G,H) with Σ=a:
D and F go to the final state C, while the others stay inside the non-final set, so split:
NF = {A,B,E,G,H},{D,F}
F = {C}: NO SPLIT
Check O/P of all clubbed states (A,B,E,G,H) with Σ=b:
A --b--> F    E --b--> F    (both go to {D,F})
B --b--> C    H --b--> C    (both go to {C})
G --b--> E                  (stays in {A,B,E,G,H})
NF = {A,E},{G},{B,H},{D,F}
Check O/P of all clubbed states (A,E) with Σ=a: A → B, E → H, both in {B,H}. NO SPLIT
Check O/P of all clubbed states (A,E) with Σ=b: A → F, E → F. NO SPLIT
Check O/P of all clubbed states (B,H) with Σ=a: B → G, H → G. NO SPLIT
Check O/P of all clubbed states (B,H) with Σ=b: B → C, H → C. NO SPLIT
Check O/P of all clubbed states (D,F) with Σ=a: D → C, F → C. NO SPLIT
Check O/P of all clubbed states (D,F) with Σ=b: D → G, F → G. NO SPLIT
NF = {A,E},{G},{B,H},{D,F}
F = {C}
Final Output
State      a        b
→ {A,E}    {B,H}    {D,F}
  {B,H}    G        C
* C        {A,E}    C
  {D,F}    C        G
  G        G        {A,E}