
CD ch2


SECTION 1.1
LEXICAL ANALYSIS - INTRODUCTION
LEXICAL ANALYZER

• The lexical analyzer reads the source program character by character to produce tokens.
• Normally a lexical analyzer doesn’t return a list of tokens in one shot; it returns a token each time the parser asks for one.

[Figure: source program --> Lexical Analyzer --token--> Parser; the Parser sends "get next token" back to the Lexical Analyzer; both use the Symbol Table]
ROLES OF THE LEXICAL ANALYSER

The lexical analyzer performs the following tasks:

• Reads input characters from the source program
• Identifies tokens and helps enter them into the symbol table
• Removes white space and comments from the source program
• Correlates error messages with the source program
• Expands macros if any are found in the source program


TOKENS, LEXEMES AND PATTERNS

• Token: A token is a sequence of characters that can be treated as a single logical entity. Typical tokens are:
1) identifiers 2) keywords 3) operators 4) special symbols 5) constants

• Lexeme: A lexeme is a sequence of characters in the source program that is matched by the pattern for a token.

• Pattern: A set of strings in the input for which the same token is produced as output. This set of strings is described by a rule called a pattern associated with the token.
TOKENS, LEXEMES AND PATTERNS
Token     Lexeme (element of a kind)   Pattern
ID        x  y  n_0                    letter followed by letters and digits
NUM       -123  1.456e-5               any numeric constant
IF        if                           if
LPAREN    (                            (
LITERAL   ``Hello''                    any string of characters (except ``) between `` and ``

 Regular expressions are widely used to specify patterns.


EXAMPLE
#include <stdio.h>
int maximum(int x, int y){
// This will compare 2 numbers

Tokens Generated
Lexeme     Token
int        Keyword
maximum    Identifier
(          Operator
int        Keyword
x          Identifier
,          Operator
int        Keyword
y          Identifier
)          Operator
{          Operator

Non-Tokens
Type                      Examples
Comment                   // This will compare 2 numbers
Pre-processor directive   #include <stdio.h>
Whitespace                \n \b \t
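The token and non-token split in the example above can be reproduced with a small regular-expression scanner. This is only a sketch under stated assumptions: the keyword list, the group names and the helper `tokenize` are hypothetical choices for illustration, and the classification of punctuation as "Operator" simply follows the table on the slide.

```python
import re

# Hypothetical, deliberately tiny keyword list for this example.
KEYWORDS = {"int", "if", "else", "return"}

# One named group per class; non-token classes come first so that
# "//" is seen as a comment before anything else is tried.
TOKEN_RE = re.compile(r"""
    (?P<PREPROCESSOR>\#[^\n]*)           # pre-processor directive
  | (?P<COMMENT>//[^\n]*)                # line comment
  | (?P<WHITESPACE>\s+)                  # blanks, tabs, newlines
  | (?P<IDENT>[A-Za-z_][A-Za-z_0-9]*)    # identifiers and keywords
  | (?P<NUMBER>\d+)                      # integer constants
  | (?P<OPERATOR>[(){},;])               # punctuation the slide calls operators
""", re.VERBOSE)

NON_TOKENS = {"PREPROCESSOR", "COMMENT", "WHITESPACE"}


def tokenize(source):
    """Return (lexeme, token) pairs, discarding the non-tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind in NON_TOKENS:
            continue
        if kind == "IDENT":
            kind = "Keyword" if lexeme in KEYWORDS else "Identifier"
        elif kind == "NUMBER":
            kind = "Constant"
        else:
            kind = "Operator"
        tokens.append((lexeme, kind))
    return tokens


SOURCE = """#include <stdio.h>
int maximum(int x, int y){
// This will compare 2 numbers
"""

for lexeme, token in tokenize(SOURCE):
    print(f"{lexeme:10} {token}")
```

The scanner returns the whole list at once for brevity; a real lexical analyzer would hand tokens to the parser one at a time, as described earlier.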
TERMINOLOGY OF LANGUAGES

• Alphabet: a finite set of symbols (e.g. ASCII characters)

• String:
  - Finite sequence of symbols on an alphabet
  - Sentence and word are also used in terms of string
  - ε is the empty string
  - |s| is the length of string s.
• Language: a set of strings over some fixed alphabet
  - ∅, the empty set, is a language.
  - {ε}, the set containing the empty string, is a language
  - The set of well-formed C programs is a language
  - The set of all possible identifiers is a language.
• Operators on Strings:
  - Concatenation: xy represents the concatenation of strings x and y.
OPERATIONS ON LANGUAGES

• Concatenation:
  - L1L2 = { s1s2 | s1 ∈ L1 and s2 ∈ L2 }
• Union:
  - L1 ∪ L2 = { s | s ∈ L1 or s ∈ L2 }
• Exponentiation:
  - L^0 = {ε}   L^1 = L   L^2 = LL
• Kleene Closure:
  - L* = ⋃_{i=0}^{∞} L^i
• Positive Closure:
  - L+ = ⋃_{i=1}^{∞} L^i
EXAMPLE
• L1 = {a,b,c,d}   L2 = {1,2}

• L1L2 = {a1,a2,b1,b2,c1,c2,d1,d2}

• L1 ∪ L2 = {a,b,c,d,1,2}

• L1^3 = all strings of length three (using a,b,c,d)

• L1* = all strings using the letters a,b,c,d, including the empty string

• L1+ = the same strings as L1*, but without the empty string
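The operations above can be checked on the same L1 and L2 with Python sets. This is a sketch for illustration: the function names are hypothetical, and because L* is infinite the closures are only approximated up to a chosen maximum power.

```python
from itertools import product


def concat(l1, l2):
    """L1L2 = { s1s2 | s1 in L1 and s2 in L2 }."""
    return {s1 + s2 for s1, s2 in product(l1, l2)}


def power(lang, n):
    """L^n, with L^0 = {ε} (the empty string is "")."""
    result = {""}
    for _ in range(n):
        result = concat(result, lang)
    return result


def kleene_upto(lang, max_power):
    """Finite approximation of L*: the union of L^0 .. L^max_power."""
    out = set()
    for i in range(max_power + 1):
        out |= power(lang, i)
    return out


def positive_upto(lang, max_power):
    """Finite approximation of L+: the union of L^1 .. L^max_power."""
    out = set()
    for i in range(1, max_power + 1):
        out |= power(lang, i)
    return out


L1 = {"a", "b", "c", "d"}
L2 = {"1", "2"}

print(sorted(concat(L1, L2)))   # the eight strings a1, a2, ..., d2
print(sorted(L1 | L2))          # union is the built-in set union
print(len(power(L1, 3)))        # number of strings of length three
```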


REGULAR EXPRESSIONS

• We use regular expressions to describe tokens of a programming language.

• A regular expression is built up of simpler regular expressions (using defining rules).

• Each regular expression denotes a language.

• A language denoted by a regular expression is called a regular set.
REGULAR EXPRESSIONS (RULES)
Regular expressions over alphabet Σ

Reg. Expr      Language it denotes
ε              {ε}
a ∈ Σ          {a}
(r1) | (r2)    L(r1) ∪ L(r2)
(r1) (r2)      L(r1) L(r2)
(r)*           (L(r))*
(r)            L(r)

• (r)+ = (r)(r)*
• (r)? = (r) | ε
REGULAR EXPRESSIONS (CONT.)
• We may remove parentheses by using precedence rules.
  - * highest
  - concatenation next
  - | lowest
• ab*|c means (a(b)*)|(c)

• Ex:
  - Σ = {0,1}
  - 0|1 => {0,1}
  - (0|1)(0|1) => {00,01,10,11}
  - 0* => {ε,0,00,000,0000,....}
  - (0|1)* => all strings of 0s and 1s, including the empty string
REGULAR DEFINITIONS
• Writing a regular expression for some languages can be difficult, because the expression can become quite complex. In those cases, we may use regular definitions.
• We can give names to regular expressions and use these names as symbols to define other regular expressions.

• A regular definition is a sequence of definitions of the form:

d1 → r1
d2 → r2
...
dn → rn

where each di is a distinct name and each ri is a regular expression over the symbols in Σ ∪ {d1,d2,...,di-1}, that is, the basic symbols and the previously defined names.
REGULAR DEFINITIONS (CONT.)

 Ex: Identifiers in Pascal


letter → A | B | ... | Z | a | b | ... | z
digit → 0 | 1 | ... | 9
id → letter (letter | digit ) *
• If we try to write the regular expression for identifiers without using regular definitions, the expression becomes complex:
(A|...|Z|a|...|z) ( (A|...|Z|a|...|z) | (0|...|9) ) *

 Ex: Unsigned numbers in Pascal


digit → 0 | 1 | ... | 9
digits → digit +
opt-fraction → ( . digits ) ?
opt-exponent → ( E (+|-)? digits ) ?
unsigned-num → digits opt-fraction opt-exponent
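The way regular definitions build on earlier names can be mirrored with Python's `re` module by composing pattern strings. This is an illustrative sketch, not part of the slides: the variable names follow the definitions above, while `is_unsigned_num` and `is_identifier` are hypothetical helpers.

```python
import re

# Each name mirrors one regular definition; later ones reuse earlier ones.
digit = r"[0-9]"
digits = rf"{digit}+"
opt_fraction = rf"(\.{digits})?"
opt_exponent = rf"(E[+-]?{digits})?"
unsigned_num = re.compile(rf"{digits}{opt_fraction}{opt_exponent}")

letter = r"[A-Za-z]"
ident = re.compile(rf"{letter}({letter}|{digit})*")


def is_unsigned_num(text):
    """True if the whole string matches unsigned-num."""
    return unsigned_num.fullmatch(text) is not None


def is_identifier(text):
    """True if the whole string matches id."""
    return ident.fullmatch(text) is not None


for sample in ["42", "3.14", "6.02E23", "1E-5", ".5", "1."]:
    print(sample, is_unsigned_num(sample))
```

Note that the definition requires digits on both sides of the decimal point and an upper-case E, so `.5`, `1.` and `1e5` are rejected.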
NOTATIONAL SHORTHAND
 The following shorthand are often used:

r+ = rr*
r? = r│ε
[a-z] = a │ b │ c │ … │ z

 Examples:
digit → [0-9]
digits → digit+
optional_fraction → (. digits)?
optional_exponent → ( E (+ │ -)? digit+ )?
num → digits optional_fraction optional_exponent
RECOGNITION OF TOKENS
• e.g.

Grammar:
stmt → if expr then stmt
     | if expr then stmt else stmt
     | ε
expr → term relop term
     | term
term → id
     | num

Regular Definitions:
if → if
then → then
else → else
relop → < | <= | = | <> | > | >=
id → letter (letter | digit)*
num → digits optional_fraction optional_exponent

Assumptions
delim → blank | tab | newline
TRANSITION DIAGRAMS

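relop → < | <= | <> | > | >= | =

[Transition diagram for relop; * marks a state in which the last character read is given back to the input]
state 0 (start): on < go to 1; on = go to 5; on > go to 6
state 1: on = go to 2; on > go to 3; on other go to 4
state 2: return(relop, LE)
state 3: return(relop, NE)
state 4 *: return(relop, LT)
state 5: return(relop, EQ)
state 6: on = go to 7; on other go to 8
state 7: return(relop, GE)
state 8 *: return(relop, GT)

id → letter ( letter | digit )*

[Transition diagram for id]
state 9 (start): on letter go to 10
state 10: on letter or digit stay in 10; on other go to 11
state 11 *: return(gettoken(), install_id())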
TRANSITION DIAGRAMS: CODE
token nexttoken()
{ while (1) {
    switch (state) {
    case 0: c = nextchar();
        if (c==blank || c==tab || c==newline) {
            state = 0;
            lexeme_beginning++;
        }
        else if (c=='<') state = 1;
        else if (c=='=') state = 5;
        else if (c=='>') state = 6;
        else state = fail();
        break;
    case 1:
        ...
    case 9: c = nextchar();
        if (isletter(c)) state = 10;
        else state = fail();
        break;
    case 10: c = nextchar();
        if (isletter(c)) state = 10;
        else if (isdigit(c)) state = 10;
        else state = 11;
        break;
    ...

/* Decides the next start state to check */
int fail()
{ forward = token_beginning;
    switch (start) {
    case 0: start = 9; break;
    case 9: start = 12; break;
    case 12: start = 20; break;
    case 20: start = 25; break;
    case 25: recover(); break;
    default: /* error */
    }
    return start;
}
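The relop and id transition diagrams can also be traced in a compact, runnable form. The following Python sketch is an illustration under stated assumptions: `next_token` and `scan_all` are hypothetical names, the state numbers in the comments refer to the diagrams, and Python's `str.isalpha` / `str.isalnum` stand in for `isletter` / `isdigit` (they also accept non-ASCII letters).

```python
def next_token(text, pos):
    """Scan one token starting at text[pos]; return (token, attribute, new_pos).

    States 4, 8 and 11 are the starred states of the diagrams: the lookahead
    character is not consumed, which is what retracting the input achieves.
    """
    # State 0: skip delimiters (blank, tab, newline).
    while pos < len(text) and text[pos] in " \t\n":
        pos += 1
    if pos == len(text):
        return ("eof", None, pos)

    def peek(i):
        # "" plays the role of "other" at end of input.
        return text[i] if i < len(text) else ""

    c = text[pos]
    if c == "<":                                  # state 1
        nxt = peek(pos + 1)
        if nxt == "=":
            return ("relop", "LE", pos + 2)       # state 2
        if nxt == ">":
            return ("relop", "NE", pos + 2)       # state 3
        return ("relop", "LT", pos + 1)           # state 4 *
    if c == "=":
        return ("relop", "EQ", pos + 1)           # state 5
    if c == ">":                                  # state 6
        if peek(pos + 1) == "=":
            return ("relop", "GE", pos + 2)       # state 7
        return ("relop", "GT", pos + 1)           # state 8 *
    if c.isalpha():                               # state 9 -> 10
        end = pos + 1
        while end < len(text) and text[end].isalnum():
            end += 1                              # loop in state 10
        return ("id", text[pos:end], end)         # state 11 *
    raise ValueError(f"no diagram accepts {c!r} at offset {pos}")


def scan_all(text):
    """Collect (token, attribute) pairs until end of input."""
    pos, out = 0, []
    while True:
        token, attribute, pos = next_token(text, pos)
        if token == "eof":
            return out
        out.append((token, attribute))


print(scan_all("a1 <= b <> c"))
```

Instead of a `fail()` routine that switches to the next start state, this sketch simply tries the diagrams in order inside one function; the effect is the same for these two diagrams.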

THE LEX AND FLEX SCANNER GENERATORS

• Lex and its newer cousin flex are scanner generators

• They systematically translate regular definitions into C source code for efficient scanning

• The generated code is easy to integrate into C applications


CREATING A LEXICAL ANALYZER WITH LEX AND FLEX

lex source program (lex.l) --> [lex or flex compiler] --> lex.yy.c

lex.yy.c --> [C compiler] --> a.out

input stream --> [a.out] --> sequence of tokens
LEX SPECIFICATION

 A lex specification consists of three parts:


regular definitions, C declarations in %{ %}
%%
translation rules
%%
user-defined auxiliary procedures
 The translation rules are of the form:
p1 { action1 }
p2 { action2 }

pn { actionn }
REGULAR EXPRESSIONS IN LEX
x          match the character x
\.         match the character .
"string"   match contents of string of characters
.          match any character except newline
^          match beginning of a line
$          match the end of a line
[xyz]      match one character x, y, or z (use \ to escape -)
[^xyz]     match any character except x, y, and z
[a-z]      match one of a to z
r*         closure (match zero or more occurrences)
r+         positive closure (match one or more occurrences)
r?         optional (match zero or one occurrence)
r1r2       match r1 then r2 (concatenation)
r1|r2      match r1 or r2 (union)
(r)        grouping
r1/r2      match r1 when followed by r2
{d}        match the regular expression defined by d
STAR OPERATION (KLEENE CLOSURE)
a* = {a^0, a^1, a^2, a^3, a^4, …} = {ε, a, aa, aaa, aaaa, …}
Important Characteristics
➢ The exponent of * ranges from 0 upward without bound, i.e. the elements of the set a* include a^0, a^1, a^2, a^3, a^4, a^5, …
➢ a^0 means zero a’s, and this is represented by ε.
➢ * is represented in a finite automaton by a loop on that particular state; for a^3 the loop iterates 3 times.
➢ For a^0 the loop does not iterate at all.

[Figure: m/c for a*: a single state qf that is both start and final, with a self-loop on a]


POSITIVE CLOSURE
a+ = {a^1, a^2, a^3, a^4, …} = {a, aa, aaa, aaaa, …}
Important Characteristics
➢ The exponent of + ranges from 1 upward without bound, i.e. the elements of the set a+ include a^1, a^2, a^3, a^4, a^5, …
➢ There is no a^0 move, i.e. ε is not part of this set.
➢ At least one a must occur, which can be followed by zero or more a’s.
➢ Please remember: a+ = a.a*

[Figure: m/c for a+: q0 --a--> qf, with a self-loop on a at qf]
CONCATENATION OPERATION
Concatenation means joining (a.b)
Important Note: a.b ≠ b.a, i.e. the order of joining changes the design of the automaton

[Figure: m/c for a: q0 --a--> qf]
[Figure: m/c for b: q0 --b--> qf]
[Figure: m/c for a.b: q0 --a--> q1 --b--> qf]
[Figure: m/c for b.a: q0 --b--> q1 --a--> qf]


OR OPERATION

[Figure: m/c for a: q0 --a--> qf]
[Figure: m/c for b: q0 --b--> qf]

NFA for a+b (a/b)

[Figure: m/c for a/b: q0 --a--> qf and q0 --b--> qf, each branch ending in its own final state]
SECTION 1.2
INTRODUCTION TO FINITE AUTOMATA
FINITE AUTOMATA
Automata means machines
A finite automaton is a 5-tuple:
M = (Q, Σ, δ, q0, F)
Q    A finite set of states
Σ    A finite input alphabet
δ    A transition function
q0   The initial/starting state, q0 is in Q
F    A set of final/accepting states, which is a subset of Q
TYPES OF AUTOMATA

There are two types of finite Automata:

➢ Deterministic Finite Automata (DFA)

➢ Non-deterministic finite Automata (NFA)


DETERMINISTIC FINITE AUTOMATA
A Deterministic Finite Automaton is a machine where, corresponding to every input symbol of Σ, there is exactly one output (outgoing transition) from every state.

[Figure: DFA with states q0, q1, q2 and qf over Σ = {a, b}]

Here Σ = {a, b} and at every state there is one O/P from ‘a’ and one O/P from ‘b’. None of the states has more than one output corresponding to a or b.
NON-DETERMINISTIC FINITE AUTOMATA

A Non-Deterministic Finite Automaton is a machine where, corresponding to a single input symbol of Σ (a,b), there can be more than one output (outgoing transition) from a particular state.

[Figure: NFA with states q0, q1, q2 and qf over Σ = {a, b}]

Here state q0 has two moves on a, one to q1 and the other to q2; likewise state q2 has two moves on ‘b’, one self loop to q1 and another to qf.
TYPES OF NFA
There are two types of NFA

i. NFA without ε -move

ii. NFA with ε -move


NFA WITH Ε-MOVE

Consider the following NFA; here, corresponding to q1, there is an ε-move.

[Figure: NFA with states q0, q1 and qf, moves on a and b, and one ε-move at q1]
DIFFERENCE BETWEEN DFA AND NFA
Deterministic Finite Automata
• A Deterministic Finite Automaton is a machine where, corresponding to every input symbol of Σ, there can be only one output from every state.
• A DFA will not have an ε-move

Non-Deterministic Finite Automata
• A Non-Deterministic Finite Automaton is a machine where, corresponding to a single input symbol of Σ (a,b), there can be more than one output from a particular state.
• An NFA can have an ε-move
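The difference shows up directly in how the two machines are simulated: a DFA tracks one current state, an NFA tracks a set of states and must follow ε-moves. The sketch below is illustrative only; both example machines (accepting strings over {a, b} that end in "ab") and all function names are assumptions made for this example, not automata from the slides.

```python
from itertools import product


def run_dfa(delta, start, finals, text):
    """A DFA has exactly one successor per (state, symbol)."""
    state = start
    for ch in text:
        state = delta[(state, ch)]
    return state in finals


def eps_closure(eps, states):
    """All states reachable from `states` through ε-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def run_nfa(delta, eps, start, finals, text):
    """An NFA may be in several states at once."""
    current = eps_closure(eps, {start})
    for ch in text:
        moved = set()
        for s in current:
            moved |= delta.get((s, ch), set())
        current = eps_closure(eps, moved)
    return bool(current & finals)


def all_strings(alphabet, max_len):
    """Every string over the alphabet up to the given length."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


# Hypothetical DFA: one move on a and one on b from every state.
DFA = {("q0", "a"): "q1", ("q0", "b"): "q0",
       ("q1", "a"): "q1", ("q1", "b"): "q2",
       ("q2", "a"): "q1", ("q2", "b"): "q0"}

# Hypothetical NFA: q0 has two moves on a, and q2 has an ε-move to qf.
NFA = {("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q0"}, ("q1", "b"): {"q2"}}
NFA_EPS = {"q2": {"qf"}}

for word in ["ab", "aab", "aba", ""]:
    print(repr(word), run_dfa(DFA, "q0", {"q2"}, word),
          run_nfa(NFA, NFA_EPS, "q0", {"qf"}, word))
```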
SECTION 1.3
THOMPSON’S CONSTRUCTION
THOMPSON’S CONSTRUCTION
We have three operations on Regular Expressions:

i) Star operation

ii) Concatenation

iii) OR operation

For each operation we have defined rules to build an NFA with ε-moves
Thompson’s Construction for Star Operation

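a* = {ε, a, aa, aaa, aaaa, …}

[Figure: NFA for a*: a single state qf with a self-loop on a]

NFA for a* using Thompson’s Construction:
q0 --ε--> q1 --a--> q2 --ε--> qf, plus q2 --ε--> q1 (loop back) and q0 --ε--> qf (bypass)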
Thompson’s Construction for Star Operation

Paths through the Thompson NFA for a* (q0 --ε--> q1 --a--> q2 --ε--> qf, with the ε-edges q2 → q1 and q0 → qf):

Only ε: q0→qf
Single a: q0→q1→q2→qf
Two a’s: q0→q1→q2→q1→q2→qf
N number of a’s: q0→q1→q2→ … →q1→q2→qf, where the loop q2→q1 is taken N−1 times, for N from 2 upward
THOMPSON’S CONSTRUCTION FOR CONCATENATION
OPERATION

NFA for a: q0 --a--> qf

NFA for b: q0 --b--> qf

NFA for ab using Thompson’s Construction:
q0 --a--> q1 --b--> qf
THOMPSON’S CONSTRUCTION FOR OR OPERATION

NFA for a: q0 --a--> qf

NFA for b: q0 --b--> qf

NFA for a+b (a/b) using Thompson’s Construction:
q0 --ε--> q1 --a--> q2 --ε--> qf
q0 --ε--> q3 --b--> q4 --ε--> qf
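The three rules (star, concatenation, OR) compose mechanically, which a short program can make concrete. The following Python sketch is a minimal illustration, assuming a fragment representation invented for this example (`Frag`, `symbol`, `concat`, `union`, `star`, `accepts` are all hypothetical names); states are plain integers rather than the q-names used on the slides.

```python
from itertools import count

EPS = None  # label used for ε-moves in this sketch

_ids = count()


def new_state():
    return next(_ids)


class Frag:
    """An NFA fragment with one start state, one final state and a list of
    (source, label, destination) edges."""

    def __init__(self, start, final, edges):
        self.start, self.final, self.edges = start, final, edges

    def states(self):
        return {s for s, _, _ in self.edges} | {d for _, _, d in self.edges}


def symbol(a):
    s, f = new_state(), new_state()
    return Frag(s, f, [(s, a, f)])


def concat(x, y):
    # The final state of x and the start state of y become one state.
    def rename(q):
        return x.final if q == y.start else q
    edges = x.edges + [(rename(s), l, rename(d)) for s, l, d in y.edges]
    return Frag(x.start, rename(y.final), edges)


def union(x, y):
    s, f = new_state(), new_state()
    extra = [(s, EPS, x.start), (s, EPS, y.start),
             (x.final, EPS, f), (y.final, EPS, f)]
    return Frag(s, f, x.edges + y.edges + extra)


def star(x):
    s, f = new_state(), new_state()
    extra = [(s, EPS, x.start), (x.final, EPS, f),
             (x.final, EPS, x.start),   # loop back for repetition
             (s, EPS, f)]               # bypass for zero occurrences
    return Frag(s, f, x.edges + extra)


def accepts(frag, text):
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for s, l, d in frag.edges:
                if s == q and l is EPS and d not in seen:
                    seen.add(d)
                    stack.append(d)
        return seen

    current = closure({frag.start})
    for ch in text:
        current = closure({d for s, l, d in frag.edges
                           if s in current and l == ch})
    return frag.final in current


# ab(a/b)*
nfa = concat(concat(symbol("a"), symbol("b")),
             star(union(symbol("a"), symbol("b"))))
print(len(nfa.states()), accepts(nfa, "abba"), accepts(nfa, "ba"))
```

With these rules a star fragment has four states and an OR fragment has six, matching the state counts drawn on the slides.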
THOMPSON’S CONSTRUCTION FOR aa*b
Question 1

Thompson’s for a: q0 --a--> qf

Thompson’s for b: q0 --b--> qf

Thompson’s for a*: q0 --ε--> q1 --a--> q2 --ε--> qf, plus q2 --ε--> q1 (loop back) and q0 --ε--> qf (bypass)

Thompson’s Construction for aa*b (NFA using Thompson’s Construction):
q0 --a--> q1 --ε--> q2 --a--> q3 --ε--> q4 --b--> qf, plus q3 --ε--> q2 (loop back) and q1 --ε--> q4 (bypass)

NFA without Thompson’s:
q0 --a--> q1 --b--> qf, with a self-loop on a at q1
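THOMPSON’S CONSTRUCTION FOR a*b(a/b)
Question 2

Thompson’s for a*: q0 --ε--> q1 --a--> q2 --ε--> qf, plus q2 --ε--> q1 (loop back) and q0 --ε--> qf (bypass)

Thompson’s for b: q0 --b--> qf

Thompson’s for a/b:
q0 --ε--> q1 --a--> q2 --ε--> qf
q0 --ε--> q3 --b--> q4 --ε--> qf

NFA using Thompson’s Construction:
q0 --ε--> q1 --a--> q2 --ε--> q3 --b--> q4, plus q2 --ε--> q1 (loop back) and q0 --ε--> q3 (bypass)
q4 --ε--> q5 --a--> q6 --ε--> qf
q4 --ε--> q7 --b--> q8 --ε--> qf

NFA without Thompson’s:
q0 --b--> q1 --a,b--> qf, with a self-loop on a at q0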


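THOMPSON’S CONSTRUCTION FOR (a/b/c)
Question 3

[Figure, not allowed: q0 has three ε-moves, one to each of the branches q1 --a--> q2, q3 --b--> q4 and q5 --c--> q6, and each branch has an ε-move to qf]
Three ε out-moves from a single state are not allowed.

[Figure, Final Output: q0 has two ε-moves, one to the a-branch q1 --a--> q2 and one to q3; q3 has two ε-moves, one to the b-branch q4 --b--> q6 and one to the c-branch q5 --c--> q7; the b- and c-branches rejoin by ε-moves at q8; q2 and q8 each have an ε-move to qf]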
THOMPSON’S CONSTRUCTION FOR ab(a/b)*
Question 4

Thompson’s for a: q0 --a--> qf

Thompson’s for b: q0 --b--> qf

Thompson’s for a/b:
q0 --ε--> q1 --a--> q2 --ε--> qf
q0 --ε--> q3 --b--> q4 --ε--> qf

Thompson’s for (a/b)*:
q0 --ε--> q1
q1 --ε--> q2 --a--> q3 --ε--> q6
q1 --ε--> q4 --b--> q5 --ε--> q6
q6 --ε--> qf, plus q6 --ε--> q1 (loop back) and q0 --ε--> qf (bypass)

NFA using Thompson’s Construction:
q0 --a--> q1 --b--> q2 --ε--> q3
q3 --ε--> q4 --a--> q5 --ε--> q8
q3 --ε--> q6 --b--> q7 --ε--> q8
q8 --ε--> qf, plus q8 --ε--> q3 (loop back) and q2 --ε--> qf (bypass)

NFA without Thompson’s:
q0 --a--> q1 --b--> qf, with a self-loop on a,b at qf


SECTION 1.4
SUBSET CONSTRUCTION
HOW TO WORK WITH Ε-CLOSURE FUNCTION

Steps for ε-Closure function:

➢ The first step is to take the ε-Closure of the start state, e.g. if the start state is 0, take ε-Closure(0).

➢ ε-Closure(n) includes the set of all states that can be reached from state n without consuming any input, i.e. through ε-moves only.

➢ Most Imp.- “The ε-Closure of a state includes that state itself in the set”, i.e. ε-Closure(n) includes n in its set of states.
SUBSET CONSTRUCTION FOR (a/b)*ab

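[Figure: Thompson NFA for (a/b)*ab with states 0 to 9. ε-moves: 0→1, 0→7, 1→2, 1→4, 3→6, 5→6, 6→1, 6→7. Moves on input: 2 --a--> 3, 4 --b--> 5, 7 --a--> 8, 8 --b--> 9. Start state 0, final state 9.]

Start with the start state: state 0
ε-closure(0): {0,1,2,4,7} = A

State                 a    b
A (0,1,2,4,7)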
SUBSET CONSTRUCTION FOR (a/b)*ab

Start with the start state:
ε-closure(0): {0,1,2,4,7} = A

(A, a) = ({0,1,2,4,7}, a) = {0,a} ⋃ {1,a} ⋃ {2,a} ⋃ {4,a} ⋃ {7,a}
= Φ ⋃ Φ ⋃ {3} ⋃ Φ ⋃ {8}
= ε-closure(3) ⋃ ε-closure(8)

State                 a    b
A (0,1,2,4,7)


SUBSET CONSTRUCTION FOR (a/b)*ab

(A, a) = ε-closure(3) ⋃ ε-closure(8)
= {1,2,3,4,6,7} ⋃ {8}
= {1,2,3,4,6,7,8} = B

State                 a    b
A (0,1,2,4,7)         B
SUBSET CONSTRUCTION FOR (a/b)*ab

(A, b) = ({0,1,2,4,7}, b)
= {0,b} ⋃ {1,b} ⋃ {2,b} ⋃ {4,b} ⋃ {7,b}
= Φ ⋃ Φ ⋃ Φ ⋃ {5} ⋃ Φ
= ε-closure(5)

State                 a    b
A (0,1,2,4,7)         B
SUBSET CONSTRUCTION FOR (a/b)*ab

(A, b) = ε-closure(5)
= {1,2,4,5,6,7} = C

State                 a    b
A (0,1,2,4,7)         B    C
SUBSET CONSTRUCTION FOR (a/b)*ab

(B, a) = ({1,2,3,4,6,7,8}, a)
= {1,a} ⋃ {2,a} ⋃ {3,a} ⋃ {4,a} ⋃ {6,a} ⋃ {7,a} ⋃ {8,a}
= Φ ⋃ {3} ⋃ Φ ⋃ Φ ⋃ Φ ⋃ {8} ⋃ Φ
= ε-closure(3) ⋃ ε-closure(8)
= {1,2,3,4,6,7,8} = B (the same set as computed for (A, a))

State                 a    b
A (0,1,2,4,7)         B    C
B (1,2,3,4,6,7,8)     B
SUBSET CONSTRUCTION FOR (a/b)*ab

(B, b) = ({1,2,3,4,6,7,8}, b)
= {1,b} ⋃ {2,b} ⋃ {3,b} ⋃ {4,b} ⋃ {6,b} ⋃ {7,b} ⋃ {8,b}
= Φ ⋃ Φ ⋃ Φ ⋃ {5} ⋃ Φ ⋃ Φ ⋃ {9}
= ε-closure(5) ⋃ ε-closure(9)

State                 a    b
A (0,1,2,4,7)         B    C
B (1,2,3,4,6,7,8)     B
SUBSET CONSTRUCTION FOR (a/b)*ab

(B, b) = ε-closure(5) ⋃ ε-closure(9)
= {1,2,4,5,6,7,9} = D

State                 a    b
A (0,1,2,4,7)         B    C
B (1,2,3,4,6,7,8)     B    D
D (1,2,4,5,6,7,9)
SUBSET CONSTRUCTION FOR(a/b)*ab

(C, a) = ({1,2,4,5,6,7}, a)
= {1,a} ⋃ {2,a} ⋃ {4,a} ⋃ {5,a} ⋃ {6,a} ⋃ {7,a}
= Φ ⋃ {3} ⋃ Φ ⋃ Φ ⋃ Φ ⋃ {8}
= ε-closure(3) ⋃ ε-closure(8)
= {1,2,3,4,6,7,8} = B (the same set as computed for (A, a))

State                 a    b
A (0,1,2,4,7)         B    C
B (1,2,3,4,6,7,8)     B    D
C (1,2,4,5,6,7)       B
D (1,2,4,5,6,7,9)
SUBSET CONSTRUCTION FOR (a/b)*ab

(C, b) = ({1,2,4,5,6,7}, b)
= {1,b} ⋃ {2,b} ⋃ {4,b} ⋃ {5,b} ⋃ {6,b} ⋃ {7,b}
= Φ ⋃ Φ ⋃ {5} ⋃ Φ ⋃ Φ ⋃ Φ
= ε-closure(5) = {1,2,4,5,6,7} = C (the same set as computed for (A, b))

State                 a    b
A (0,1,2,4,7)         B    C
B (1,2,3,4,6,7,8)     B    D
C (1,2,4,5,6,7)       B    C
D (1,2,4,5,6,7,9)
SUBSET CONSTRUCTION FOR (a/b)*ab
(D, a) = ({1,2,4,5,6,7,9}, a)
= {1,a} ⋃ {2,a} ⋃ {4,a} ⋃ {5,a} ⋃ {6,a} ⋃ {7,a} ⋃ {9,a}
= Φ ⋃ {3} ⋃ Φ ⋃ Φ ⋃ Φ ⋃ {8} ⋃ Φ
= ε-closure(3) ⋃ ε-closure(8)
= {1,2,3,4,6,7,8} = B (the same set as computed for (A, a))

State                 a    b
A (0,1,2,4,7)         B    C
B (1,2,3,4,6,7,8)     B    D
C (1,2,4,5,6,7)       B    C
D (1,2,4,5,6,7,9)     B
SUBSET CONSTRUCTION FOR (a/b)*ab

(D, b) = ({1,2,4,5,6,7,9}, b)
= {1,b} ⋃ {2,b} ⋃ {4,b} ⋃ {5,b} ⋃ {6,b} ⋃ {7,b} ⋃ {9,b}
= Φ ⋃ Φ ⋃ {5} ⋃ Φ ⋃ Φ ⋃ Φ ⋃ Φ
= ε-closure(5) = {1,2,4,5,6,7} = C (the same set as computed for (A, b))

State                 a    b
A (0,1,2,4,7)         B    C
B (1,2,3,4,6,7,8)     B    D
C (1,2,4,5,6,7)       B    C
D (1,2,4,5,6,7,9)     B    C
SUBSET CONSTRUCTION FOR (a/b)*ab

State                 a    b
A (0,1,2,4,7)         B    C
B (1,2,3,4,6,7,8)     B    D
C (1,2,4,5,6,7)       B    C
D (1,2,4,5,6,7,9)     B    C

[Figure, Final Output: DFA with states A, B, C and D; A --a--> B, A --b--> C, B --a--> B, B --b--> D, C --a--> B, C --b--> C, D --a--> B, D --b--> C]

➢ Here state A is the start state, since set ‘A’ contains state ‘0’, which is the start state of the NFA built with Thompson’s construction.
➢ D is the final state, since set D contains state ‘9’, which is the final state of the NFA built with Thompson’s construction.
Ε-CLOSURE(T)

push all states of T onto stack
initialize ϵ-closure(T) to T
while (stack is not empty) do
begin
    pop t, the top element, off stack;
    for (each state u with an edge from t to u labelled ϵ) do
    begin
        if (u is not in ϵ-closure(T)) do
        begin
            add u to ϵ-closure(T)
            push u onto stack
        end
    end
end
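The stack-based pseudocode above translates almost line for line into Python. This is a sketch for illustration: the function name and the dictionary layout are assumptions, and `EPS_EDGES` encodes the ε-moves of the Thompson NFA for (a/b)*ab used in the subset construction example of this section.

```python
def epsilon_closure(eps_edges, T):
    """Return ε-closure(T).

    eps_edges maps a state to the states reachable from it by one ε-edge.
    """
    closure = set(T)          # initialize ε-closure(T) to T
    stack = list(T)           # push all states of T onto the stack
    while stack:
        t = stack.pop()
        for u in eps_edges.get(t, ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return closure


# ε-moves of the NFA for (a/b)*ab: 0→1, 0→7, 1→2, 1→4, 3→6, 5→6, 6→1, 6→7
EPS_EDGES = {0: [1, 7], 1: [2, 4], 3: [6], 5: [6], 6: [1, 7]}

print(sorted(epsilon_closure(EPS_EDGES, {0})))      # the set called A
print(sorted(epsilon_closure(EPS_EDGES, {3, 8})))   # the set called B
```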
CONVERTING A NFA INTO A DFA (SUBSET CONSTRUCTION)
put ε-closure({s0}) as an unmarked state into the set of DFA states (DS)
while (there is one unmarked S1 in DS) do
begin
    mark S1
    for each input symbol a do
    begin
        S2 ← ε-closure(move(S1,a))
        if (S2 is not in DS) then
            add S2 into DS as an unmarked state
        transfunc[S1,a] ← S2
    end
end

• ε-closure({s0}) is the set of all states accessible from s0 by ε-transitions.
• move(S1,a) is the set of states to which there is a transition on a from a state s in S1.
• a state S in DS is an accepting state of the DFA if a state s in S is an accepting state of the NFA
• the start state of the DFA is ε-closure({s0})
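To see the algorithm reproduce the worked table for (a/b)*ab, it can be run on the same NFA. The Python sketch below is one possible rendering, not a definitive implementation: the function names and data layout are assumptions, DFA states are named A, B, C, ... in order of discovery, and an empty target set would simply show up as one more (dead) state.

```python
def epsilon_closure(eps, states):
    closure, stack = set(states), list(states)
    while stack:
        t = stack.pop()
        for u in eps.get(t, ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return closure


def move(delta, states, a):
    """States reachable on symbol a from some state in `states`."""
    return {t for s in states for t in delta.get((s, a), ())}


def subset_construction(delta, eps, s0, alphabet):
    start = frozenset(epsilon_closure(eps, {s0}))
    dstates = [start]      # all DFA states, in order of discovery
    unmarked = [start]
    transfunc = {}
    while unmarked:
        S1 = unmarked.pop(0)                     # mark S1
        for a in alphabet:
            S2 = frozenset(epsilon_closure(eps, move(delta, S1, a)))
            if S2 not in dstates:
                dstates.append(S2)
                unmarked.append(S2)
            transfunc[(S1, a)] = S2
    return start, dstates, transfunc


# Thompson NFA for (a/b)*ab: start state 0, accepting state 9.
EPS = {0: [1, 7], 1: [2, 4], 3: [6], 5: [6], 6: [1, 7]}
DELTA = {(2, "a"): [3], (4, "b"): [5], (7, "a"): [8], (8, "b"): [9]}

start, dstates, transfunc = subset_construction(DELTA, EPS, 0, "ab")
name = {S: "ABCDEFGH"[i] for i, S in enumerate(dstates)}
table = {name[S]: (name[transfunc[(S, "a")]], name[transfunc[(S, "b")]])
         for S in dstates}
accepting = [name[S] for S in dstates if 9 in S]

for state, (on_a, on_b) in table.items():
    print(state, sorted(dstates["ABCDEFGH".index(state)]), on_a, on_b)
print("accepting:", accepting)
```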
SECTION 1.5
RE TO DFA THROUGH SYNTAX TREE
METHOD OR DIRECT METHOD
CONVERTING REGULAR EXPRESSIONS DIRECTLY TO
DFAS
 Important state
• We may convert a regular expression into a DFA (without creating an NFA first).
• First we augment the given regular expression by concatenating it with a special symbol #.
r ➔ (r)# augmented regular expression
• Then, we create a syntax tree for this augmented regular expression.
• In this syntax tree, all alphabet symbols (plus # and the empty string) in the augmented regular expression will be on the leaves, and all inner nodes will be the operators in that augmented regular expression.
• Then each alphabet symbol (plus #) will be numbered (position numbers).
FROM REGULAR EXPRESSION TO DFA DIRECTLY:
SYNTAX TREE OF (a/b)*abb#

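[Figure: syntax tree of (a/b)*abb#. The root is a concatenation node whose right child is the leaf # (position 6). Going down the left spine, each concatenation node has as its right child, in turn, the leaves b (position 5), b (position 4) and a (position 3). The left child of the lowest concatenation node is the closure node *, whose only child is the alternation node | with the leaves a (position 1) and b (position 2). The numbers are the position numbers of the leaves.]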
FROM REGULAR EXPRESSION TO DFA DIRECTLY:
ANNOTATING THE TREE

• nullable(n): the subtree at node n generates a language that includes the empty string
• firstpos(n): the set of positions that can match the first symbol of a string generated by the subtree at node n
• lastpos(n): the set of positions that can match the last symbol of a string generated by the subtree at node n
• followpos(i): the set of positions that can follow position i in the tree
FROM REGULAR EXPRESSION TO DFA
DIRECTLY: ANNOTATING THE TREE

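Node n: leaf labelled ε
    nullable(n) = true
    firstpos(n) = ∅
    lastpos(n) = ∅

Node n: leaf with position i
    nullable(n) = false
    firstpos(n) = {i}
    lastpos(n) = {i}

Node n: | with children c1 and c2
    nullable(n) = nullable(c1) or nullable(c2)
    firstpos(n) = firstpos(c1) ∪ firstpos(c2)
    lastpos(n) = lastpos(c1) ∪ lastpos(c2)

Node n: • (concatenation) with children c1 and c2
    nullable(n) = nullable(c1) and nullable(c2)
    firstpos(n) = if nullable(c1) then firstpos(c1) ∪ firstpos(c2) else firstpos(c1)
    lastpos(n) = if nullable(c2) then lastpos(c1) ∪ lastpos(c2) else lastpos(c2)

Node n: * with child c1
    nullable(n) = true
    firstpos(n) = firstpos(c1)
    lastpos(n) = lastpos(c1)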
FROM REGULAR EXPRESSION TO DFA DIRECTLY:
SYNTAX TREE OF (a/b)*abb#
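Node (subexpression)        firstpos     lastpos
root • : (a/b)*abb#         {1, 2, 3}    {6}
leaf # (position 6)         {6}          {6}
• : (a/b)*abb               {1, 2, 3}    {5}
leaf b (position 5)         {5}          {5}
• : (a/b)*ab                {1, 2, 3}    {4}
leaf b (position 4)         {4}          {4}
• : (a/b)*a                 {1, 2, 3}    {3}
leaf a (position 3)         {3}          {3}
* : (a/b)*  (nullable)      {1, 2}       {1, 2}
| : a/b                     {1, 2}       {1, 2}
leaf a (position 1)         {1}          {1}
leaf b (position 2)         {2}          {2}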
FROM REGULAR EXPRESSION TO DFA DIRECTLY: EXAMPLE

Positions in (a/b)*abb#:  a = 1, b = 2, a = 3, b = 4, b = 5, # = 6

Node   followpos
1      {1, 2, 3}
2      {1, 2, 3}
3      {4}
4      {5}
5      {6}
6      -
FROM RE TO DFA DIRECTLY
Positions in (a/b)*abb#:  a = 1, b = 2, a = 3, b = 4, b = 5, # = 6

Node   Symbol   followpos
1      a        {1, 2, 3}
2      b        {1, 2, 3}
3      a        {4}
4      b        {5}
5      b        {6}
6      #        -

Let {1,2,3} = A
A,a:  ({1,2,3},a)   = followpos(1) ∪ followpos(3) = {1,2,3,4} = B
A,b:  ({1,2,3},b)   = followpos(2)                = {1,2,3}   = A
B,a:  ({1,2,3,4},a) = followpos(1) ∪ followpos(3) = {1,2,3,4} = B
B,b:  ({1,2,3,4},b) = followpos(2) ∪ followpos(4) = {1,2,3,5} = C
C,a:  ({1,2,3,5},a) = followpos(1) ∪ followpos(3) = {1,2,3,4} = B
C,b:  ({1,2,3,5},b) = followpos(2) ∪ followpos(5) = {1,2,3,6} = D
D,a:  ({1,2,3,6},a) = followpos(1) ∪ followpos(3) = {1,2,3,4} = B
D,b:  ({1,2,3,6},b) = followpos(2)                = {1,2,3}   = A

State   a   b
A       B   A
B       B   C
C       B   D
D       B   A
FROM REGULAR EXPRESSION TO DFA DIRECTLY: EXAMPLE

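[Figure: resulting DFA with states {1,2,3} (start), {1,2,3,4}, {1,2,3,5} and {1,2,3,6}. On a, every state goes to {1,2,3,4}. On b: {1,2,3} --> {1,2,3}, {1,2,3,4} --> {1,2,3,5}, {1,2,3,5} --> {1,2,3,6}, {1,2,3,6} --> {1,2,3}.]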
DIFFERENT DFA’S FOR (a/b)*abb
DFA with five states (A is the start state, E the final state):

State   a   b
A       B   C
B       B   D
C       B   C
D       B   E
E       B   C

DFA with four states obtained by the direct method:

State           a   b
A {1,2,3}       B   A
B {1,2,3,4}     B   C
C {1,2,3,5}     B   D
D {1,2,3,6}     B   A
FROM REGULAR EXPRESSION TO DFA DIRECTLY:
FOLLOWPOS
for each node n in the tree do
    if n is a cat-node with left child c1 and right child c2 then
        for each i in lastpos(c1) do
            followpos(i) := followpos(i) ∪ firstpos(c2)
        end do
    else if n is a star-node
        for each i in lastpos(n) do
            followpos(i) := followpos(i) ∪ firstpos(n)
        end do
    end if
end do
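The annotation rules and the followpos loop can be combined into one recursive pass over the syntax tree. The sketch below is an illustration under assumptions: the tuple encoding of tree nodes and the names `analyse`, `leaf`, `cat`, `alt` and `star` are hypothetical, chosen only to keep the example short.

```python
def leaf(symbol, position):
    return ("leaf", symbol, position)


def cat(left, right):
    return ("cat", left, right)


def alt(left, right):
    return ("or", left, right)


def star(child):
    return ("star", child)


def analyse(node, followpos):
    """Return (nullable, firstpos, lastpos) of node and fill in followpos."""
    kind = node[0]
    if kind == "leaf":
        position = node[2]
        followpos.setdefault(position, set())
        return False, {position}, {position}
    if kind == "or":
        n1, f1, l1 = analyse(node[1], followpos)
        n2, f2, l2 = analyse(node[2], followpos)
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":
        n1, f1, l1 = analyse(node[1], followpos)
        n2, f2, l2 = analyse(node[2], followpos)
        for i in l1:                       # cat-node rule
            followpos[i] |= f2
        first = (f1 | f2) if n1 else f1
        last = (l1 | l2) if n2 else l2
        return n1 and n2, first, last
    if kind == "star":
        _, f, l = analyse(node[1], followpos)
        for i in l:                        # star-node rule
            followpos[i] |= f
        return True, f, l
    raise ValueError(f"unknown node kind {kind!r}")


# Syntax tree of (a/b)*abb# with positions 1..6
TREE = cat(cat(cat(cat(star(alt(leaf("a", 1), leaf("b", 2))),
                       leaf("a", 3)), leaf("b", 4)), leaf("b", 5)),
           leaf("#", 6))

FOLLOWPOS = {}
ROOT_NULLABLE, ROOT_FIRSTPOS, ROOT_LASTPOS = analyse(TREE, FOLLOWPOS)
for position in sorted(FOLLOWPOS):
    print(position, sorted(FOLLOWPOS[position]))
```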
FROM REGULAR EXPRESSION TO DFA DIRECTLY:
ALGORITHM

s0 := firstpos(root) where root is the root of the syntax tree
Dstates := {s0} and is unmarked
while there is an unmarked state T in Dstates do
    mark T
    for each input symbol a ∈ Σ do
        let U be the set of positions that are in followpos(p)
            for some position p in T,
            such that the symbol at position p is a
        if U is not empty and not in Dstates then
            add U as an unmarked state to Dstates
        end if
        Dtran[T,a] := U
    end do
end do
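Running this algorithm on the followpos table of (a/b)*abb# gives the four-state DFA derived by hand. The Python sketch below is illustrative: `build_dfa` and the table names are assumptions, the followpos values and position symbols are copied from the worked example, and states are named A, B, C, D in order of discovery.

```python
# followpos table and position symbols for (a/b)*abb#, from the worked example
FOLLOWPOS = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
SYMBOL_AT = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}
FIRSTPOS_ROOT = {1, 2, 3}


def build_dfa(firstpos_root, followpos, symbol_at, alphabet):
    s0 = frozenset(firstpos_root)
    dstates = [s0]           # all DFA states, in order of discovery
    unmarked = [s0]
    dtran = {}
    while unmarked:
        T = unmarked.pop(0)                       # mark T
        for a in alphabet:
            # Union of followpos(p) over positions p in T labelled a.
            U = frozenset(q for p in T if symbol_at[p] == a
                          for q in followpos[p])
            if U and U not in dstates:
                dstates.append(U)
                unmarked.append(U)
            dtran[(T, a)] = U
    return s0, dstates, dtran


s0, dstates, dtran = build_dfa(FIRSTPOS_ROOT, FOLLOWPOS, SYMBOL_AT, "ab")
name = {S: "ABCDEFGH"[i] for i, S in enumerate(dstates)}
table = {name[S]: (name[dtran[(S, "a")]], name[dtran[(S, "b")]])
         for S in dstates}
# A state is accepting when it contains the position of the end marker #.
accepting = [name[S] for S in dstates if 6 in S]

print(table)
print("accepting:", accepting)
```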
SECTION 1.6
MINIMIZATION OF DFA
Question 1
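MINIMIZE THE FOLLOWING DFA, IF POSSIBLE

[Figure: DFA with states A (start), B, C, D and E (final). A --a--> B, A --b--> C, B --a--> B, B --b--> D, C --a--> B, C --b--> C, D --a--> B, D --b--> E, E --a--> B, E --b--> C]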
USING FINAL AND NON FINAL STATE

Divide the entire set of states into two subsets: the set of final states and the set of non-final states.

Consider each subset as a separate entity and identify whether it needs to be split further or whether its states can be combined together.
Question 1

DFA MINIMIZATION USING PARTITIONING METHOD

Draw the transition table corresponding to the given DFA (→ marks the start state, * the final state):

State   a   b
→ A     B   C
  B     B   D
  C     B   C
  D     B   E
* E     B   C


Question 1

DFA MINIMIZATION USING PARTITIONING METHOD

Divide the states into two subsets- final and non-final

Set of non Final States (NF): {A,B,C,D}
Set of Final States (F): {E}
Question 1

DFA MINIMIZATION USING PARTITIONING METHOD

Check O/P of all clubbed states (A,B,C,D) with Σ=a

NF= {A,B,C,D}
F= {E}
NO SPLIT: on a, every state of {A,B,C,D} goes to B, which lies in the same subset.
Question 1
DFA MINIMIZATION USING PARTITIONING METHOD
Check O/P of all clubbed states (A,B,C,D) with Σ=b

NF= ({A,B,C}, {D})
F= {E}
Split into two since, on b, {A,B,C} go to states within the non-final set (A→C, B→D, C→C) while state D goes to state {E}.
Question 1

DFA MINIMIZATION USING PARTITIONING METHOD


Check O/P of all clubbed states (A,B,C) with Σ=a

NF= ({A,B,C}, {D})
NO SPLIT: on a, A, B and C all go to B.
Question 1
DFA MINIMIZATION USING PARTITIONING METHOD
Check O/P of all clubbed states (A,B,C) with Σ=b

NF= ({A,C}, {B}, {D})
Split into two since, on b, {A,C} go to state {C} while {B} goes to state {D}, which is already separated.
Question 1

DFA MINIMIZATION USING PARTITIONING METHOD


Check O/P of all clubbed states (A,C) with Σ=a

NO SPLIT
NF= ({A,C}, {B}, {D})
Both A and C go to state B, which is already separated.
Question 1

DFA MINIMIZATION USING PARTITIONING METHOD


Check O/P of all clubbed states (A,C) with Σ=b

NO SPLIT
NF= ({A,C}, {B}, {D})
Both A and C go to the same group {A,C} on Σ=b.
Since the subset {A,C} remains a single combined group till the end, both states are joined together as a single state.
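DFA MINIMIZATION USING PARTITIONING METHOD

Original DFA:
State   a   b
→ A     B   C
  B     B   D
  C     B   C
  D     B   E
* E     B   C

Minimized DFA:
State    a   b
→ A,C    B   A,C
  B      B   D
  D      B   E
* E      B   A,C

[Figure: the original five-state DFA next to the minimized four-state DFA, in which A and C are merged into the single state A,C]

Final Output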
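The partitioning method can be automated. The Python sketch below is one possible rendering under stated assumptions: `minimize` and `group_index` are hypothetical names, and unlike the slides, which test one input symbol at a time, it compares the target groups for all symbols at once in each round; the final partition is the same.

```python
def group_index(partition, state):
    """Index of the group that currently contains `state`."""
    for i, group in enumerate(partition):
        if state in group:
            return i
    raise KeyError(state)


def minimize(states, alphabet, delta, finals):
    """Split groups until all members of a group agree on their target groups."""
    finals = set(finals)
    partition = [g for g in (set(states) - finals, finals) if g]
    while True:
        refined = []
        for group in partition:
            buckets = {}
            for s in sorted(group):
                # Which group does each input symbol lead to?
                signature = tuple(group_index(partition, delta[(s, a)])
                                  for a in alphabet)
                buckets.setdefault(signature, set()).add(s)
            refined.extend(buckets.values())
        if len(refined) == len(partition):   # nothing was split
            return refined
        partition = refined


# Transition table of Question 1: state -> (target on a, target on b)
TABLE = {"A": "BC", "B": "BD", "C": "BC", "D": "BE", "E": "BC"}
DELTA = {(s, a): t for s, row in TABLE.items() for a, t in zip("ab", row)}

groups = minimize(TABLE.keys(), "ab", DELTA, finals={"E"})
print(sorted(sorted(g) for g in groups))
```

States that end up in the same group (here A and C) are merged into one state of the minimized DFA.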
Question 2
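MINIMIZE THE FOLLOWING DFA, IF POSSIBLE

[Figure: DFA with states A (start), B, C (final), D, E, F, G and H. A --a--> B, A --b--> F, B --a--> G, B --b--> C, C --a--> A, C --b--> C, D --a--> C, D --b--> G, E --a--> H, E --b--> F, F --a--> C, F --b--> G, G --a--> G, G --b--> E, H --a--> G, H --b--> C]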
Question 2
DFA MINIMIZATION USING PARTITIONING METHOD

Draw the transition table corresponding to the given DFA (→ marks the start state, * the final state):

State   a   b
→ A     B   F
  B     G   C
* C     A   C
  D     C   G
  E     H   F
  F     C   G
  G     G   E
  H     G   C


Question 2
DFA MINIMIZATION USING PARTITIONING METHOD
Divide the states into two subsets- final and non-final

Set of Non Final States (NF): {A,B,D,E,F,G,H}
Set of Final States (F): {C}
Question 2
DFA MINIMIZATION USING PARTITIONING METHOD
Check O/P of all clubbed states (A,B,D,E,F,G,H) with Σ=a

NF= {A,B,E,G,H}, {D,F}
F= {C}
Split into two since, on a, {A,B,E,G,H} go to states within their own set while {D,F} go to state {C}.
Question 2
DFA MINIMIZATION USING PARTITIONING METHOD
Check O/P of all clubbed states (A,B,E,G,H) with Σ=a

NO SPLIT: on a, A→B, B→G, E→H, G→G and H→G all stay within {A,B,E,G,H}.
NF= {A,B,E,G,H}, {D,F}
Question 2
DFA MINIMIZATION USING PARTITIONING METHOD
Check O/P of all clubbed states (A,B,E,G,H) with Σ=b

NF= {A,E},{G},{B,H},{D,F}
Split into three since, on b, {A,E} go to F (in {D,F}), {B,H} go to the final state C, and G goes to E (in {A,B,E,G,H}).
Question 2
DFA MINIMIZATION USING PARTITIONING METHOD
Check O/P of all clubbed states (A,E) with Σ=a

NO SPLIT: on a, A→B and E→H, both in {B,H}.
NF= {A,E},{G},{B,H},{D,F}
Question 2
DFA MINIMIZATION USING PARTITIONING METHOD
Check O/P of all clubbed states (A,E) with Σ=b

NO SPLIT: on b, both A and E go to F.
NF= {A,E},{G},{B,H},{D,F}
Question 2
DFA MINIMIZATION USING PARTITIONING METHOD
Check O/P of all clubbed states (B,H) with Σ=a

NO SPLIT: on a, both B and H go to G.
NF= {A,E},{G},{B,H},{D,F}
Question 2
DFA MINIMIZATION USING PARTITIONING METHOD
Check O/P of all clubbed states (B,H) with Σ=b

NO SPLIT: on b, both B and H go to C.
NF= {A,E},{G},{B,H},{D,F}
Question 2
DFA MINIMIZATION USING PARTITIONING METHOD
Check O/P of all clubbed states (D,F) with Σ=a

NO SPLIT: on a, both D and F go to C.
NF= {A,E},{G},{B,H},{D,F}
Question 2
DFA MINIMIZATION USING PARTITIONING METHOD
Check O/P of all clubbed states (D,F) with Σ=b

NO SPLIT: on b, both D and F go to G.
NF= {A,E},{G},{B,H},{D,F}
DFA MINIMIZATION USING PARTITIONING METHOD

Original DFA:
State   a   b
→ A     B   F
  B     G   C
* C     A   C
  D     C   G
  E     H   F
  F     C   G
  G     G   E
  H     G   C

Minimized DFA:
State    a     b
→ A,E    B,H   D,F
  B,H    G     C
* C      A,E   C
  D,F    C     G
  G      G     A,E

[Figure: the original eight-state DFA next to the minimized five-state DFA with states A,E (start), B,H, C (final), D,F and G]

Final Output
THANKS
