Unit I

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
AUTOMATA THEORY & COMPILER DESIGN

UNIT I
INTRODUCTION TO FORMAL LANGUAGES, DFA AND NFA
LANGUAGES:
An alphabet is a finite set of symbols. For example, {0, 1} is an alphabet with two symbols, {a, b}
is another alphabet with two symbols, and the English alphabet is also an alphabet.
A string is a finite sequence of symbols of an alphabet. b, a and aabab are examples of strings over
the alphabet {a, b}, and 0, 10 and 001 are examples of strings over the alphabet {0, 1}.
A language is a set of strings over an alphabet. Thus {a, ab, baa} is a language (over alphabet
{a,b}) and {0, 111} is a language (over alphabet {0,1}).
The number of symbols in a string is called the length of the string. For a string w its length is
represented by |w|. It can be defined recursively: the empty string has length 0, and appending one
symbol to a string increases its length by 1. The empty string (also called the null string),
written ε, is the string with length 0. That is, it has no symbols.
Let u and v be strings. Then uv denotes the string obtained by concatenating u with v, that is, uv is
the string obtained by appending the sequence of symbols of v to that of u. For example
if u = aab and v = bbab, then uv = aabbbab. Note that vu = bbabaab ≠ uv.
A string x is called a substring of another string y if there are strings u and v such that y = uxv.
Note that u and v may each be the empty string, so a string is a substring of itself. A string x is a
prefix of another string y if there is a string v such that y = xv; the string v is then a suffix of y.
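The definitions above map directly onto Python string operations. The following is a minimal sketch; the helper names is_prefix, is_suffix and is_substring are chosen here for illustration and do not come from the notes.

```python
def is_prefix(x: str, y: str) -> bool:
    """True if y = xv for some string v."""
    return y.startswith(x)


def is_suffix(v: str, y: str) -> bool:
    """True if y = xv for some string x."""
    return y.endswith(v)


def is_substring(x: str, y: str) -> bool:
    """True if y = uxv for some strings u and v (either may be empty)."""
    return x in y


u, v = "aab", "bbab"
print(u + v)            # aabbbab
print(v + u)            # bbabaab
print(len(u + v))       # 7, the length |uv|
print(is_substring(u, u))  # True: a string is a substring of itself
```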
The empty set ∅ is a language which has no strings. The set {ε} is a language which
has one string, namely ε. Though ε has no symbols, this set has an object in it, so it is
not empty. For any alphabet ∑, the set of all strings over ∑ (including the empty
string) is denoted by ∑*. Thus a language over alphabet ∑ is a subset of ∑*.
 {} The empty set/language, containing no string.

 {€} A language containing one string, the empty string .
 E = {0, 1}
L = {x | x is in E* and x contains an even number of 0‟s}
 E = {0, 1, 2,., 9, .}
L = {x | x is in E* and x forms a finite length real number}
= {0, 1.5, 9.326,.}
 E = {a, b, c,., z, A, B,., Z}
L = {x | x is in E* and x is a Pascal reserved word}
= {BEGIN, END, IF,...}
Department of Computer Science and Engineering
 E = {Pascal reserved words} U { (, ), ., :, ;,...} U {Legal Pascal identifiers}

L = {x | x is in E* and x is a syntactically correct Pascal program}
 E = {English words}
L = {x | x is in E* and x is a syntactically correct English sentence}
OPERATIONS ON LANGUAGES:
 Union
 Intersection
 Difference
 Concatenation
 Kleene closure (*)
Since languages are sets, all the set operations can be applied to languages. Thus the
union, intersection and difference of two languages over an alphabet ∑ are
languages over ∑. The complement of a language L over an alphabet ∑ is ∑* - L and
it is also a language.
Another operation on languages is concatenation. Let L1 and L2 be languages. Then
the concatenation of L1 with L2 is denoted by L1L2 and is defined as
L1L2 = { uv | u ∈ L1 and v ∈ L2 }. That is, L1L2 is the set of strings obtained by
concatenating strings of L1 with those of L2.
For example, if L1 = {ab, b} and L2 = {aaa, abb, aaba},
then L1L2 = {abaaa, ababb, abaaba, baaa, babb, baaba}.
For a symbol a ∈ ∑ and a natural number k, a^k represents the concatenation of k a's.
For a string u ∈ ∑* and a natural number k, u^k denotes the concatenation of k u's.
Similarly, for a language L, L^k means the concatenation of k L's. Hence L^k is the set of
strings that can be obtained by concatenating k strings of L. These powers can be
formally defined recursively: L^0 = {ε} and L^(k+1) = L^k L.
L* is the set of strings obtained by concatenating zero or more strings of L. This * is
called Kleene closure. For example, if L = { aba, bb }, then L* = { ε, aba, bb, ababb,
abaaba, bbbb, bbaba, abaababb, bbabaaba, bbababbaba, bbbbbbaba, ... }
L+ is the set of strings obtained by concatenating one or more strings of L.
For example, if L = { aba, bb }, then L+ = { aba, bb, ababb, abaaba, bbbb, bbaba, ... }
This + is called positive closure.
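Concatenation, powers and the two closures can be tried out on small finite languages. The sketch below represents a language as a Python set of strings; because L* is infinite whenever L contains a non-empty string, kleene_star and positive_closure only build the union of powers up to a bound max_k, which is an assumption of this illustration. All function names are made up for the example.

```python
def concat(l1, l2):
    """L1L2 = { uv | u in L1 and v in L2 }."""
    return {u + v for u in l1 for v in l2}


def power(lang, k):
    """L^k: concatenation of k copies of L, with L^0 = {""}."""
    result = {""}
    for _ in range(k):
        result = concat(result, lang)
    return result


def kleene_star(lang, max_k):
    """Finite approximation of L*: the union of L^0 .. L^max_k."""
    out = set()
    for k in range(max_k + 1):
        out |= power(lang, k)
    return out


def positive_closure(lang, max_k):
    """Finite approximation of L+: the union of L^1 .. L^max_k."""
    out = set()
    for k in range(1, max_k + 1):
        out |= power(lang, k)
    return out


L1, L2 = {"ab", "b"}, {"aaa", "abb", "aaba"}
print(sorted(concat(L1, L2)))

L = {"aba", "bb"}
print("" in kleene_star(L, 3))       # True: the empty string is in L*
print("" in positive_closure(L, 3))  # False: L+ needs at least one string of L
```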
Automata Theory & Compiler Design, Mr. P. Krishnamoorthy

Regular Expressions
 The languages accepted by finite automata can be described by simple
expressions called regular expressions. They are a compact way to
represent regular languages.
 The languages described by regular expressions are referred to as regular
languages.
 A regular expression can also be described as a pattern that defines
a set of strings.
 Regular expressions are used to match character combinations in strings. String
searching algorithms use these patterns to find matches in a string.
 In a regular expression, x* means zero or more occurrences of x. It can generate
{ε, x, xx, xxx, xxxx, .....}
 In a regular expression, x+ means one or more occurrences of x. It can generate
{x, xx, xxx, xxxx, .....}
Regular languages are languages that can be generated from one-element languages
by applying certain standard operations a finite number of times. They are the
languages that can be recognized by finite automata. These simple operations include
concatenation, union and Kleene closure. By the use of these operations, regular
languages can be represented by an explicit formula.
Regular expressions can be thought of as the algebraic description of a regular
language. Regular expression can be defined by the following rules:
1. Every letter of the alphabet ∑ is a regular expression.
2. Null string є and empty set Φ are regular expressions.
3. If r1 and r2 are regular expressions, then
(i) r1 + r2 ( union of r1 and r2 )
(ii) r1r2 ( concatenation of r1 and r2 )
(iii) r1*, r2* ( Kleene closure of r1 and r2 ) are also regular expressions
4. If a string can be derived from the rules 1, 2 and 3 then it is also a regular
expression.
Note that a* means zero or more occurrences of a in the string, while a+ means one
or more occurrences of a in the string. That is, a* denotes the language L = {є, a, aa,
aaa, ….} and a+ represents the language L = {a, aa, aaa, ….}. Note also that there can
be more than one regular expression for a given set of strings.

Operations on Regular Languages
The various operations on regular languages are:
Union: If L and M are two regular languages, then their union L U M is also a regular language.
1. L U M = {s | s is in L or s is in M}
Intersection: If L and M are two regular languages, then their intersection L ⋂ M is also a
regular language.
1. L ⋂ M = {s | s is in L and s is in M}
Kleene closure: If L is a regular language, then its Kleene closure L* is also a
regular language.
1. L* = zero or more occurrences of strings of L, concatenated.
Example 1:
Write the regular expression for the language accepting all combinations of a's, over
the set ∑ = {a}
Solution:
All combinations of a's means a may be zero, single, double and so on. If a is
appearing zero times, that means a null string. That is we expect the set of {ε, a, aa,
aaa, ....}. So we give a regular expression for this as:
R = a*
That is, the Kleene closure of a.
Example 2:
Write the regular expression for the language accepting all combinations of a's except
the null string, over the set ∑ = {a}
Solution:
The regular expression has to be built for the language
L = {a, aa, aaa, ....}
This set indicates that there is no null string. So we can denote regular expression as:
R = a+
Example 3:
Write the regular expression for the language accepting all the string containing any
number of a's and b's.
Solution:
The regular expression will be:
R = (a + b)*
This will give the set L = {ε, a, aa, b, bb, ab, ba, aba, bab, .....}, any combination of
a and b.
The (a + b)* denotes any combination of a and b, including the null string.

Example 4:
Write the regular expression for the language accepting all the string which are
starting with 1 and ending with 0, over ∑ = {0, 1}.
Solution:
In a regular expression, the first symbol should be 1, and the last symbol should be 0.
The r.e. is as follows:
R = 1 (0+1)* 0
Example 5:
Write the regular expression for the language of strings starting and ending with a and having
any number of b's in between.
Solution:
R = a b* a
Example 6:
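Write the regular expression for the language of strings starting with a but not having
consecutive b's.
Solution: The regular expression has to be built for the language:
L = {a, aa, ab, aaa, aab, aba, abab, .....}

The regular expression for the above language is:
R = (a + ab)+
Every b must be preceded by an a, so each string is a sequence of one or more blocks a or ab.
The form (a + ab)* would also admit ε, which does not start with a.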
Example 7:
Write the regular expression for the language accepting all the strings in which any
number of a's is followed by any number of b's, followed by any number of c's.
Solution: Any number of a's means a*, any number of b's means b*, and any
number of c's means c*. As given in the problem statement, b's appear after a's and
c's appear after b's. So the regular expression is:
R = a* b* c*
Example 8:
Write the regular expression for the language over ∑ = {0} having even length of the
string.
Solution:
The regular expression has to be built for the language:
L = {ε, 00, 0000, 000000, ......}
The regular expression for the above language is:
R = (00)*

Example 9:
Write the regular expression for the language of strings which have
at least one 0 and at least one 1.
Solution:
R = [(0 + 1)* 0 (0 + 1)* 1 (0 + 1)*] + [(0 + 1)* 1 (0 + 1)* 0 (0 + 1)*]
Example 10:
Describe the language denoted by following regular expression
R = (b* (aaa)* b*)*

Solution:
The language can be predicted from the regular expression by finding the meaning of
it. We will first split the regular expression as:
R.E. = ((any number of b's) (aaa)* (any number of b's))*
L = {The language consists of the strings in which a's appear in runs whose lengths are
multiples of three; there is no restriction on the number of b's}
Example 11:
Write the regular expression for the language L over ∑ = {0, 1} such that the strings
do not contain the substring 01.
Solution:
The Language is as follows:
L = {ε, 0, 1, 00, 11, 10, 100, .....}
The regular expression for the above language is as follows:
R = (1* 0*)
Example 12:
Write the regular expression for the language containing the strings over {0, 1} in
which there are at least two occurrences of 1's between any two occurrences of 0's.
Solution: At least two 1's between two occurrences of 0's can be denoted by
(0111*0)*.
Similarly, if there is no occurrence of 0's, then any number of 1's are also allowed.
Hence the r.e. for the required language is:
R = (1 + (0111*0))*

Example 13:
Write the regular expression for the language containing the string in which every 0 is
immediately followed by 11.
Solution:
The regular expression will be:
R = (011 + 1)*
The order of precedence for the operations is: Kleene closure > concatenation > union. This
rule allows us to reduce the use of parentheses when writing a regular expression.
For example, a + b*c is the simplified form of (a + ((b)*c)). Note that (a + b)* is not
the same as a + b*; a + b* means (a + (b)*).
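Several of the examples above can be checked by brute force with Python's re module. This is only a sketch: Python writes union as | rather than +, the helper names all_strings and agrees are invented for the illustration, and the check covers strings up to a fixed length (8 here), so it is evidence for an expression rather than a proof.

```python
import re
from itertools import product


def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            yield "".join(symbols)


def agrees(pattern, predicate, alphabet="01", max_len=8):
    """True if `pattern` matches exactly the strings that satisfy `predicate`."""
    rx = re.compile(pattern)
    return all(
        (rx.fullmatch(w) is not None) == predicate(w)
        for w in all_strings(alphabet, max_len)
    )


def every_zero_followed_by_11(w):
    return all(w[i + 1:i + 3] == "11" for i, c in enumerate(w) if c == "0")


checks = {
    "Example 4": (r"1(0|1)*0", lambda w: w.startswith("1") and w.endswith("0")),
    "Example 9": (r"(0|1)*0(0|1)*1(0|1)*|(0|1)*1(0|1)*0(0|1)*",
                  lambda w: "0" in w and "1" in w),
    "Example 11": (r"1*0*", lambda w: "01" not in w),
    "Example 13": (r"(011|1)*", every_zero_followed_by_11),
}

for name, (pattern, predicate) in checks.items():
    print(name, agrees(pattern, predicate))

# Precedence: (a + b)* is not the same as a + b*.
print(re.fullmatch(r"(a|b)*", "ab") is not None)  # True
print(re.fullmatch(r"a|b*", "ab") is not None)    # False
```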
Finite Automata
 Finite automata are used to recognize patterns.
 A finite automaton takes a string of symbols as input and changes its state accordingly. Each
input symbol that is read triggers a transition.
 On a transition, the automaton can either move to another state or stay in the
same state.
 A finite automaton gives one of two outcomes for an input string: accept or reject. When the
input string has been read completely and the automaton is in a final state, it accepts.
A finite automaton (FA) is the simplest machine to recognize patterns. It is used to
characterize a regular language. The finite automaton, or finite state machine, is an
abstract machine defined by a 5-tuple.
It has a set of states and rules for moving from one state to another depending
on the applied input symbol. Based on the states and the set of rules, the input string
is either accepted or rejected.
Basically, it is an abstract model of a digital computer which reads an input string and
changes its internal state depending on the current input symbol. Every automaton
defines a language, i.e. the set of strings it accepts. The following figure shows some
essential features of a general automaton.
Formal Definition of FA
A finite automaton is a 5-tuple (Q, ∑, δ, q0, F), where:
Q : finite set of states
∑ : finite set of input symbols
q0: initial state
F : set of final states
δ : transition function

Finite Automata Model:
Finite automata can be represented by input tape and finite control.

Input tape: It is a linear tape having some number of cells. Each cell holds one input
symbol.
Finite control: The finite control decides the next state on receiving particular input from input
tape. The tape reader reads the cells one by one from left to right, and at a time only one input
symbol is read.
Graphical Representation
An FA can be represented by a digraph called a state diagram, in which:
1. The states are represented by vertices.
2. The arcs labeled with an input character show the transitions.
3. The initial state is marked with an incoming arrow.
4. A final state is denoted by a double circle.
FA is classified into two types:
1) Deterministic Finite Automata (DFA):
A DFA consists of a 5-tuple (Q, ∑, δ, q0, F)
Q : set of all states.
∑ : set of input symbols. (Symbols which the machine takes as input)
q0 : initial state. (Starting state of the machine)
F : set of final states.
δ : transition function, defined as δ : Q X ∑ --> Q.

In a DFA, for a particular input character, the machine goes to one state only.
A transition function is defined on every state for every input symbol. Also, in a DFA a
null (or ε) move is not allowed, i.e., a DFA cannot change state without reading an input
character.
For example, construct a DFA which accepts the language of all strings ending with 'a'.
Given: ∑ = {a, b}, initial state q0, F = {q1}, Q = {q0, q1}
First, consider some of the acceptable strings in order to
construct an accurate state transition diagram.
L = {a, aa, aaa, aaaa, aaaaa, ba, bba, bbbaa, aba, abba, aaba, abaa, ...}
The above is only a sample of the acceptable strings; there are many other strings
which end with 'a' and contain symbols from {a, b}.
Strings not accepted are ab, bb, aab, abbb, etc.

State transition table for the above automaton:
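A DFA of this kind can be simulated with a dictionary that plays the role of δ. In the sketch below the four transitions are an assumption inferred from the language (strings ending with 'a') and from Q = {q0, q1}, F = {q1}; the function name run_dfa is also made up for the illustration.

```python
def run_dfa(delta, start, finals, word):
    """Feed `word` to the DFA one symbol at a time; accept if it ends in a final state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals


# Assumed transition function: reading 'a' leads to q1, reading 'b' leads to q0.
delta = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q1", ("q1", "b"): "q0",
}

for w in ["a", "ba", "bbbaa", "abba", "ab", "bb", "abbb", ""]:
    print(repr(w), run_dfa(delta, "q0", {"q1"}, w))
```

The first four strings are accepted and the last four are rejected, in line with the lists given above.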
Regular expression to ε-NFA

An ε-NFA is similar to an NFA, with one difference: the ε-move. Its transition function
also allows the empty string ε as a possible input.
Transitions that do not consume an input symbol are called ε-transitions. In state
diagrams, they are usually labeled with the Greek letter ε.

ε-transitions provide a convenient way of modeling systems whose current states are not
precisely known: i.e., if we are modeling a system and it is not clear whether the current state
(after processing some input string) should be q or q', then we can add an ε-transition between
these two states, thus putting the automaton in both states simultaneously.
One way to implement regular expressions is to convert them into a finite automaton of this
kind. Because ε-transitions do not consume any input, the automaton can move from one
state to another without reading any characters from the input string.
The process of converting a regular expression into an ε-NFA is as follows:
1. Create a single start state for the automaton, and mark it as the initial state.
2. For each character in the regular expression, create a new state and add an edge between
the previous state and the new state, with the character as the label.
3. For each operator in the regular expression (such as “*” for zero or more, “+” for one or
more, and “?” for zero or one), create new states and add the appropriate edges to
represent the operator.
4. Mark the final state as the accepting state, which is the state that is reached when the
regular expression is fully matched.
Rules for construction of ε-NFA:
ε-NFA for a+ :
This structure is for a+, which means there must be at least one 'a' in the expression. It is
preceded by an ε-move and also succeeded by one. There is an ε feedback edge from state q2 to q1 so
that there can be more than one 'a' in the expression.

ε-NFA for a* :
This structure is for a*, which means there can be any number of 'a' in the expression, even 0.
The previous structure is modified slightly so that the expression is also valid when there is no
input symbol, i.e. when the input is the null string.
ε-NFA for a+b :
This structure accepts either a or b as input. So there are two paths, both of which lead to the
final state.
ε-NFA for ab :
For concatenation, a must be followed by b. Only then can it reach the final state. Both structures
are allowed here, but since it is an ε-NFA, the second structure is recommended.

Common regular expressions used in making an ε-NFA:
Example: Create an ε-NFA for the regular expression (a/b)*a

ε-NFA of Regular Language L = (0+1)*(00 + 11) :
L = (0+1)*(00 + 11) can be divided into two parts: (0+1)* and (00 + 11). Since they are
concatenated, the two parts will be linearly connected to each other.
The first part can be drawn using the third rule and the second rule. (0+1) is easy to draw
following the third rule and, considering (0+1) as one unit, (0+1)* can also be drawn by applying the
second rule. Here's the first part as follows.

The second part can be drawn with the help of the fourth rule. In the fourth rule, a and b both are 0.
That is how we construct 00. Similarly, we can construct 11. Now, since they are connected by the '+'
sign, there will be two paths connecting both these structures. Here's the second part as follows.
The final ε-NFA will be: Connecting the two structures linearly gives us our final ε-NFA.
ε-NFA of Regular Language L = b + ba* :
L = b + ba* has two terms. The first term is fairly easy to construct. Since both the terms are
connected by the '+' sign, there will be two paths coming out of the first node. The second term is to
be drawn following the second rule of construction: a*, which is simply preceded by b. The final
ε-NFA will be:

2) Nondeterministic Finite Automata (NFA): An NFA is similar to a DFA except for the following

additional features:
1. A null (or ε) move is allowed, i.e., it can move forward without reading symbols.
2. It can move to any number of states for a particular input.
However, these features do not add any power to the NFA. If we compare both in
terms of power, both are equivalent.
Due to the above additional features, an NFA has a different transition function; the rest
is the same as for a DFA.
Transition Function
δ : Q X (∑ U {ε}) --> 2^Q.
As the transition function shows, for any input including null (or ε), an NFA
can go to any number of states. For example, below is an NFA for the above
problem.

State Transition Table for above Automaton,
One important thing to note is, in NFA, if any path for an input string leads to a
final state, then the input string is accepted. For example, in the above NFA, there
are multiple paths for the input string “00”. Since one of the paths leads to a final
state, “00” is accepted by the above NFA.
Every DFA is an NFA, but not vice versa. Yet there is a way to convert an NFA to a
DFA, so there exists an equivalent DFA for every NFA.
1. Both NFA and DFA have the same power and each NFA can be
translated into a DFA.
2. There can be multiple final states in both DFA and NFA.
3. NFA is more of a theoretical concept.
4. DFA is used in Lexical Analysis in Compiler.
5. If the number of states in the NFA is N, then its DFA can have a maximum of
2^N states.
Design an NFA with ∑ = {0, 1} that accepts all strings ending with 01.
Design an NFA with ∑ = {0, 1} in which double '1' is followed by double '0'.

Design an NFA in which all the strings contain the substring 1110.
Design an NFA with ∑ = {0, 1} that accepts all strings in which the third symbol from the right
end is always 0.
Design an NFA with ∑ = {0, 1} that accepts all strings containing 01.
Design an NFA with ∑ = {0, 1} that accepts all strings of length at least 2.
Design a DFA with ∑ = {0, 1} that accepts those strings which start with 1 and end with 0.
Design a DFA with ∑ = {0, 1} that accepts only the input 101.

Design a DFA with ∑ = {0, 1} that accepts strings with an even number of 0's and an even number of 1's.
This FA uses four different states, reached on inputs 0 and 1.
Here q0 is the start state and also the final state. Note carefully that a symmetry of 0's and
1's is maintained. We can associate a meaning with each state:
q0: state of even number of 0's and even number of 1's.

q1: state of odd number of 0's and even number of 1's.
q2: state of odd number of 0's and odd number of 1's.
q3: state of even number of 0's and odd number of 1's.
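The four state meanings above determine the transitions: reading a 0 flips the parity of 0's, and reading a 1 flips the parity of 1's. The following sketch encodes that table and checks it by brute force; the dictionary layout and the name accepts are choices made for this illustration.

```python
from itertools import product

# (state, symbol) -> next state, derived from the parity meaning of q0..q3.
delta = {
    ("q0", "0"): "q1", ("q0", "1"): "q3",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q3", ("q2", "1"): "q1",
    ("q3", "0"): "q2", ("q3", "1"): "q0",
}


def accepts(word):
    """Run the DFA from q0; q0 is also the only final state."""
    state = "q0"
    for symbol in word:
        state = delta[(state, symbol)]
    return state == "q0"


# Compare the DFA with a direct count for every string up to length 8.
ok = all(
    accepts(w) == (w.count("0") % 2 == 0 and w.count("1") % 2 == 0)
    for n in range(9)
    for w in ("".join(t) for t in product("01", repeat=n))
)
print(ok)  # True
```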
Design a DFA with ∑ = {0, 1} that accepts the set of all strings with three consecutive 0's.

Design a DFA M with L(M) = {w | w ∈ {0, 1}* and w does not contain consecutive 1's}.
Design an FA with ∑ = {0, 1} that accepts the strings with an even number of 0's followed by a single 1.
Converting NFA to DFA:
Assume an NFA for a language L with <Q, Σ, q0, δ, F> where
 Q is the finite set of states
 Σ is the set of input symbols
 q0 is the start state
 δ is the transition function
 F is the set of final states
It is converted to a DFA <Q′, Σ, q0, δ′, F′> where
 Q′ is the new finite set of states
 Σ is the set of input symbols
 q0 is the start state
 δ′ is the new transition function
 F′ is the new set of final states
While converting an NFA with n states to a DFA, up to 2^n sets of states are possible, but
not all of them are necessarily reachable in the DFA.
The following steps are followed to convert a given NFA to a DFA-

1. At the beginning Q′ = ∅.
2. Add the start state [q0] to Q′.
3. For every state in Q′, find the possible set of states for each input symbol using the
transition function of the NFA. If the acquired set of states is not present in Q′, add it.
4. The final states F′ are all those states of Q′ that contain a state of F.
Convert the given NFA to DFA.
For the given transition diagram we will first construct the transition table.
State      0           1
→q0        q0          q1
q1         {q1, q2}    q1
*q2        q2          {q1, q2}
Now we will obtain δ' transition for state q0.

δ'([q0], 0) = [q0]
δ'([q0], 1) = [q1]

The δ' transition for state q1 is obtained as:
δ'([q1], 0) = [q1, q2] (new state generated)
δ'([q1], 1) = [q1]
The δ' transition for state q2 is obtained as:
δ'([q2], 0) = [q2]
δ'([q2], 1) = [q1, q2]
Now we will obtain δ' transition on [q1, q2].
δ'([q1, q2], 0) = δ(q1, 0) ∪ δ(q2, 0)
= {q1, q2} ∪ {q2}
= [q1, q2]
δ'([q1, q2], 1) = δ(q1, 1) ∪ δ(q2, 1)
= {q1} ∪ {q1, q2}
= {q1, q2}
= [q1, q2]
The state [q1, q2] is also a final state because it contains the final state q2. The transition table
for the constructed DFA will be:
State        0           1
→[q0]        [q0]        [q1]
[q1]         [q1, q2]    [q1]
*[q2]        [q2]        [q1, q2]
*[q1, q2]    [q1, q2]    [q1, q2]
The Transition diagram will be:
The state [q2] can be eliminated because it is unreachable from the start state [q0].
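The conversion steps can be sketched as a worklist algorithm that only ever creates reachable subsets, which is why [q2] never appears. The NFA table below is the one from the worked example above; the function name nfa_to_dfa and the use of frozenset for DFA states are choices made for this illustration.

```python
from collections import deque


def nfa_to_dfa(delta, start, finals, alphabet):
    """Subset construction for an NFA without ε-moves.

    delta maps (state, symbol) to a set of states; missing keys mean no move.
    Returns (dfa_states, dfa_delta, dfa_start, dfa_finals).
    """
    dfa_start = frozenset([start])
    dfa_delta = {}
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # Union of the NFA moves of every state in the current subset.
            target = frozenset(
                r for q in current for r in delta.get((q, symbol), ())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_finals = {s for s in seen if s & finals}
    return seen, dfa_delta, dfa_start, dfa_finals


nfa = {
    ("q0", "0"): {"q0"}, ("q0", "1"): {"q1"},
    ("q1", "0"): {"q1", "q2"}, ("q1", "1"): {"q1"},
    ("q2", "0"): {"q2"}, ("q2", "1"): {"q1", "q2"},
}
states, dfa_delta, dfa_start, dfa_finals = nfa_to_dfa(nfa, "q0", {"q2"}, "01")
for s in sorted(states, key=sorted):
    print(sorted(s), "final" if s in dfa_finals else "")
```

Only [q0], [q1] and [q1, q2] are produced, and [q1, q2] is the only final state.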

Convert the given NFA to DFA.
For the given transition diagram we will first construct the transition table.
State      0           1
→q0        {q0, q1}    {q1}
*q1        ϕ           {q0, q1}

Now we will obtain δ' transition for state q0.
δ'([q0], 0) = {q0, q1}
= [q0, q1] (new state generated)
δ'([q0], 1) = {q1} = [q1]
The δ' transition for state q1 is obtained as:
δ'([q1], 0) = ϕ
δ'([q1], 1) = [q0, q1]
Now we will obtain δ' transition on [q0, q1].

δ'([q0, q1], 0) = δ(q0, 0) ∪ δ(q1, 0)
= {q0, q1} ∪ ϕ
= {q0, q1}
= [q0, q1]
δ'([q0, q1], 1) = δ(q0, 1) ∪ δ(q1, 1)

= {q1} ∪ {q0, q1}
= {q0, q1}
= [q0, q1]

Since q1 is a final state in the given NFA, every DFA state that contains q1 becomes a final state.
Hence in the DFA, the final states are [q1] and [q0, q1]. Therefore the set of final states is F = {[q1], [q0, q1]}.
The transition table for the constructed DFA will be:
State        0           1
→[q0]        [q0, q1]    [q1]
*[q1]        ϕ           [q0, q1]
*[q0, q1]    [q0, q1]    [q0, q1]

The Transition diagram will be:
Conversion of NFA with ε moves to NFA without ε moves
Steps:
1. Find all the ε transitions from each state in Q. The set of states reached this way is
called ε-closure(qi), where qi ∈ Q.
2. Then the δ' transitions can be obtained. A δ' transition is the ε-closure of the δ
moves taken from the ε-closure of the state.
3. Repeat Step 2 for each input symbol and each state of the given NFA.
4. Using the resultant states, the transition table for the equivalent NFA without ε can
be built.

Convert the following NFA with ε to NFA without ε.
We will first obtain ε-closures of q0, q1 and q2 as follows:

ε-closure(q0) = {q0}
ε-closure(q1) = {q1, q2}
ε-closure(q2) = {q2}
Now the δ' transition on each input symbol is obtained as:
δ'(q0, a) = ε-closure(δ(δ^(q0, ε), a))
= ε-closure(δ(ε-closure(q0), a))
= ε-closure(δ(q0, a))
= ε-closure(q1)
= {q1, q2}
δ'(q0, b) = ε-closure(δ(δ^(q0, ε), b))
= ε-closure(δ(ε-closure(q0), b))
= ε-closure(δ(q0, b))
= Ф
Now the δ' transition on q1 is obtained as:
δ'(q1, a) = ε-closure(δ(ε-closure(q1), a))
= ε-closure(δ(q1, a) ∪ δ(q2, a))
= ε-closure(Ф ∪ Ф)
= Ф
δ'(q1, b) = ε-closure(δ(ε-closure(q1), b))
= ε-closure(δ(q1, b) ∪ δ(q2, b))
= ε-closure(Ф ∪ {q2})
= {q2}

The δ' transition on q2 is obtained as:
δ'(q2, a) = ε-closure(δ(ε-closure(q2), a))
= ε-closure(δ(q2, a))
= ε-closure(Ф)
= Ф
δ'(q2, b) = ε-closure(δ(ε-closure(q2), b))
= ε-closure(δ(q2, b))
= ε-closure(q2)
= {q2}
Now we will summarize all the computed δ' transitions:
δ'(q0, a) = {q1, q2}
δ'(q0, b) = Ф
δ'(q1, a) = Ф
δ'(q1, b) = {q2}
δ'(q2, a) = Ф
δ'(q2, b) = {q2}
The transition table can be:
States     a           b
→q0        {q1, q2}    Ф
*q1        Ф           {q2}
*q2        Ф           {q2}
States q1 and q2 become final states because the ε-closures of q1 and q2 contain the final state q2.
The NFA can be shown by the following transition diagram:

ε-closure: The ε-closure of a state A is the set of states that can be reached from A
using only ε (null) moves, including the state A itself.
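The ε-closure and the removal of ε moves can be sketched in Python as follows. The ε-NFA used here (q0 goes to q1 on a, q1 goes to q2 on ε, q2 loops on b, q2 final) is inferred from the δ values used in the worked example above, since the diagram itself is not reproduced. The function names and the use of the empty string as the ε label are assumptions of this sketch.

```python
EPS = ""  # label used for ε-moves in this sketch


def eps_closure(delta, states):
    """All states reachable from `states` using only ε-moves (including themselves)."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def remove_eps(delta, states, alphabet, finals):
    """δ'(q, x) = ε-closure(δ(ε-closure(q), x)); a state is final if its closure meets F."""
    new_delta = {}
    for q in states:
        for symbol in alphabet:
            moved = {
                r
                for p in eps_closure(delta, {q})
                for r in delta.get((p, symbol), ())
            }
            new_delta[(q, symbol)] = eps_closure(delta, moved)
    new_finals = {q for q in states if eps_closure(delta, {q}) & finals}
    return new_delta, new_finals


delta = {("q0", "a"): {"q1"}, ("q1", EPS): {"q2"}, ("q2", "b"): {"q2"}}
new_delta, new_finals = remove_eps(delta, ["q0", "q1", "q2"], "ab", {"q2"})
for (q, symbol), target in new_delta.items():
    print(q, symbol, sorted(target))
print(sorted(new_finals))  # ['q1', 'q2']
```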
Steps for converting NFA with ε to DFA:

Step 1: We will take the ε-closure for the starting state of NFA as a starting state of DFA.
Step 2: For each input symbol, find the states that can be reached from the present DFA state. That
is, take the union of the transitions of each NFA state in the current DFA state, and then the
ε-closure of that union.
Step 3: If a new state is found, take it as the current state and repeat Step 2.
Step 4: Repeat Step 2 and Step 3 until no new state appears in the transition table of the
DFA.
Step 5: Mark as final every DFA state that contains a final state of the NFA.
Convert the NFA with ε into its equivalent DFA.
Solution:
Let us obtain ε-closure of each state.
ε-closure {q0} = {q0, q1, q2}

ε-closure {q1} = {q1}

Now, let ε-closure {q0} = {q0, q1, q2} be state A.

δ'(A, 0) = ε-closure {δ((q0, q1, q2), 0) }
= ε-closure {δ(q0, 0) ∪ δ(q1, 0) ∪ δ(q2, 0) }
= ε-closure {q3}
= {q3} call it as state B.
δ'(A, 1) = ε-closure {δ((q0, q1, q2), 1) }

= ε-closure {δ(q0, 1) ∪ δ(q1, 1) ∪ δ(q2, 1) }
= ε-closure {q3}
= {q3} = B.
For state B:
δ'(B, 0) = ε-closure {δ(q3, 0) }
=ϕ
δ'(B, 1) = ε-closure {δ(q3, 1) }
= ε-closure {q4}
= {q4} i.e. state C
For state C:
δ'(C, 0) = ε-closure {δ(q4, 0) }

=ϕ
δ'(C, 1) = ε-closure {δ(q4, 1) }
=ϕ
The DFA will be,

Convert the given NFA into its equivalent DFA.
ε-closure(q0) = {q0, q1, q2}
ε-closure(q1) = {q1, q2}
ε-closure(q2) = {q2}
Now we will obtain the δ' transitions. Let ε-closure(q0) = {q0, q1, q2}; call it state A.
δ'(A, 0) = ε-closure{δ((q0, q1, q2), 0)}

= ε-closure{δ(q0, 0) ∪ δ(q1, 0) ∪ δ(q2, 0)}
= ε-closure{q0}
= {q0, q1, q2}
δ'(A, 1) = ε-closure{δ((q0, q1, q2), 1)}

= ε-closure{δ(q0, 1) ∪ δ(q1, 1) ∪ δ(q2, 1)}
= ε-closure{q1}
= {q1, q2} call it as state B
δ'(A, 2) = ε-closure{δ((q0, q1, q2), 2)}

= ε-closure{δ(q0, 2) ∪ δ(q1, 2) ∪ δ(q2, 2)}
= ε-closure{q2}
= {q2} call it state C
Thus we have obtained
δ'(A, 0) = A
δ'(A, 1) = B
δ'(A, 2) = C
Now we will find the transitions on states B and C for each input.

δ'(B, 0) = ε-closure{δ((q1, q2), 0)}
= ε-closure{δ(q1, 0) ∪ δ(q2, 0)}
= ε-closure{ϕ}
=ϕ
δ'(B, 1) = ε-closure{δ((q1, q2), 1)}

= ε-closure{δ(q1, 1) ∪ δ(q2, 1)}
= ε-closure{q1}
= {q1, q2} i.e. state B itself
δ'(B, 2) = ε-closure{δ((q1, q2), 2)}

= ε-closure{δ(q1, 2) ∪ δ(q2, 2)}
= ε-closure{q2}
= {q2} i.e. state C itself
δ'(B, 0) = ϕ
δ'(B, 1) = B
δ'(B, 2) = C
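Now we will obtain transitions for C:
δ'(C, 0) = ε-closure{δ(q2, 0)}
= ε-closure{ϕ}
= ϕ
δ'(C, 1) = ε-closure{δ(q2, 1)}
= ε-closure{ϕ}
= ϕ
δ'(C, 2) = ε-closure{δ(q2, 2)}
= ε-closure{q2}
= {q2} i.e. state C itself
δ'(C, 0) = ϕ
δ'(C, 1) = ϕ
δ'(C, 2) = C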
Hence the DFA is
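The computation of A, B and C can be reproduced with a short sketch of the five steps. The ε-NFA below (q0 loops on 0, q1 loops on 1, q2 loops on 2, with ε-moves q0 to q1 and q1 to q2) is inferred from the δ values used in the derivation; taking q2 as the final state is an assumption, as are the function names and the empty-string label for ε.

```python
EPS = ""  # label used for ε-moves in this sketch


def eps_closure(delta, states):
    """All states reachable from `states` using only ε-moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def eps_nfa_to_dfa(delta, start, finals, alphabet):
    """Steps 1-5: start from ε-closure(start) and explore new subsets until none appear."""
    dfa_start = frozenset(eps_closure(delta, {start}))
    dfa_delta = {}
    seen = {dfa_start}
    todo = [dfa_start]
    while todo:
        current = todo.pop()
        for symbol in alphabet:
            moved = {r for q in current for r in delta.get((q, symbol), ())}
            target = frozenset(eps_closure(delta, moved))
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    dfa_finals = {s for s in seen if s & finals}
    return dfa_start, dfa_delta, dfa_finals


delta = {
    ("q0", "0"): {"q0"}, ("q0", EPS): {"q1"},
    ("q1", "1"): {"q1"}, ("q1", EPS): {"q2"},
    ("q2", "2"): {"q2"},
}
A, dfa_delta, dfa_finals = eps_nfa_to_dfa(delta, "q0", {"q2"}, "012")
B, C = frozenset({"q1", "q2"}), frozenset({"q2"})
print(dfa_delta[(A, "0")] == A, dfa_delta[(A, "1")] == B, dfa_delta[(A, "2")] == C)
```

Under the assumption that q2 is final, all three of A, B and C are final states, and the empty set acts as the dead state for the ϕ entries.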

Arden’s Theorem
In order to find a regular expression for a finite automaton, we use Arden's Theorem along
with the properties of regular expressions.
Let P and Q be two regular expressions. If P does not contain the null string, then R = Q + RP has
a unique solution, R = QP*.
Proof −
R = Q + (Q + RP)P [After putting the value R = Q + RP]
= Q + QP + RPP
When we put the value of R recursively again and again, we get the following equation −
R = Q + QP + QP^2 + QP^3 + …..
R = Q (ε + P + P^2 + P^3 + …. )
R = QP* [As P* represents (ε + P + P^2 + P^3 + ….) ]
Hence, proved.
Assumptions for Applying Arden’s Theorem

 The transition diagram must not have NULL transitions
 It must have only one initial state
Construct a regular expression corresponding to the automata given below

Here the initial state and the final state are both q1.
The equations for the three states q1, q2, and q3 are as follows −
q1 = q1a + q3a + ε (the ε term is there because q1 is the initial state)
q2 = q1b + q2b + q3b
q3 = q2a
Now, we will solve these three equations −
q2 = q1b + q2b + q3b
= q1b + q2b + (q2a)b (Substituting value of q3)
= q1b + q2(b + ab)
= q1b(b + ab)* (Applying Arden's Theorem)
q1 = q1a + q3a + ε
= q1a + q2aa + ε (Substituting value of q3)
= q1a + q1b(b + ab)*aa + ε (Substituting value of q2)
= q1(a + b(b + ab)*aa) + ε
= ε(a + b(b + ab)*aa)* (Applying Arden's Theorem)
= (a + b(b + ab)*aa)*
Hence, the regular expression is (a + b(b + ab)*aa)*.
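The result can be cross-checked by brute force. In the sketch below the automaton is rebuilt from the state equations (a term such as q3a in the equation for q1 is read as an edge from q3 to q1 on input a), which is an inference since the diagram is not reproduced; the regular expression is written in Python syntax with | for union. The check covers all strings up to length 8 only.

```python
import re
from itertools import product

# Edges read off the equations for q1, q2 and q3.
DELTA = {
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q3", ("q2", "b"): "q2",
    ("q3", "a"): "q1", ("q3", "b"): "q2",
}


def dfa_accepts(word, start="q1", finals=("q1",)):
    """Run the automaton; q1 is both the initial and the final state."""
    state = start
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in finals


ARDEN_RE = re.compile(r"(a|b(b|ab)*aa)*")


def same_language(max_len=8):
    """True if the DFA and the regular expression agree on every string up to max_len."""
    for n in range(max_len + 1):
        for symbols in product("ab", repeat=n):
            w = "".join(symbols)
            if dfa_accepts(w) != (ARDEN_RE.fullmatch(w) is not None):
                return False
    return True


print(same_language())  # True
```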
Construct a regular expression corresponding to the automata given below

Here the initial state is q1 and the final state is q2.
Now we write down the equations −
q1 = q1·0 + ε
q2 = q1·1 + q2·0
q3 = q2·1 + q3·0 + q3·1
Now, we will solve these equations −
q1 = ε0* [By Arden's theorem]
So, q1 = 0* [As εR = R]
q2 = 0*1 + q2·0
So, q2 = 0*1(0)* [By Arden's theorem]
Since q2 is the only final state, the regular expression is 0*10*.
Design a DFA in which every 'a' is followed by 'b'.
Design a DFA such that: L = {a^n b^m | n, m ≥ 1}

Given: Input alphabet, Σ = {a, b}
Language L = {ab, aab, aaab, abbb, aabb, aaaabbbb, ...}

Design a DFA that accepts the set of all strings ending with ab.
Given: Input alphabet, Σ = {a, b}
Language L = {ab, abab, abaabbab, abbab, bbabaabab, ...}
Conversion of Finite Automata to Regular Grammar

1. Begin the process from the start state.
2. Write each production as the input symbol followed by the state to which the transition goes.
3. Repeat the process for every state.
4. Finally, add ε to the productions of each final state, because it is required to end the derivation.
From the start state A, on symbol 'a' we go to state B.
So we write:
A -> aB
Then we pick state B and write a production for each of its transitions, plus ε because B is a
final state. So we get the production below.
B -> aB/bB/ε
So finally we get the regular grammar:

A -> aB
B -> aB/bB/ε
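The procedure can be sketched as a small function that turns each transition into a production and adds ε for every final state. The automaton used here (A goes to B on a, B loops on a and b, B final) is inferred from the productions above, and the name fa_to_grammar is made up for the illustration.

```python
def fa_to_grammar(delta, finals):
    """Each transition p --x--> q gives p -> xq; each final state also gets an ε production."""
    productions = {}
    for (state, symbol), target in delta.items():
        productions.setdefault(state, []).append(symbol + target)
    for state in sorted(finals):
        productions.setdefault(state, []).append("ε")
    return productions


delta = {("A", "a"): "B", ("B", "a"): "B", ("B", "b"): "B"}
grammar = fa_to_grammar(delta, {"B"})
for head, bodies in grammar.items():
    print(f"{head} -> {'/'.join(bodies)}")
# A -> aB
# B -> aB/bB/ε
```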

Conversion of Right Linear Grammar to Finite Automata
Converting a regular grammar to a finite automaton is simple.
Follow the steps:
1. Start from the first production.
2. For every terminal on the right-hand side, draw a transition on that terminal to the state named
by the nonterminal that follows it.
3. Start state: the nonterminal of the first production.
4. Final state: take those states that have a production ending in a terminal alone, e.g. states A
and C in the grammar below.
A -> aB/bA/b
B -> aC/bB
C -> aA/bC/a
Now see the output
Draw the transition diagram for an identifier
[Figure: transition diagram for an identifier; edge labels: letter; letter, digit]
Enumerate the differences between DFA and NFA
DFA: Each symbol causes a move.
NFA: The machine can move without consuming any symbol; sometimes there is no possible move, and
sometimes there is more than one possible move.
DFA: The next state is completely determined by the current state and current symbol.
NFA: The state is only partially determined by the current state and current input symbol.
DFA: The transition function returns only one state, i.e. δ : Q X ∑ --> Q.
NFA: The transition function returns zero, one or more states, i.e. δ : Q X ∑ --> 2^Q.

Construct a DFA for the following:
i. All strings that contain exactly 4 zeros
ii. All strings that don’t contain the substring 110. DEC 2011
Give an English description of the following language: (0+1)*1*.

The language L has words w whose letters are from '0' and '1': any number of 0's
and 1's followed by any number of 1's. Because (0+1)* already generates every string over {0, 1}, L is the set of all strings of 0's and 1's, including the empty string.
Draw an NFA to accept the strings containing the substring 0101.
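One way to realise this NFA is sketched below; the state names q0 to q4 are assumptions. q0 loops on both symbols and guesses where 0101 begins, and q4 loops on both symbols once the substring has been seen.

```python
NFA_0101 = {
    ("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
    ("q2", "0"): {"q3"},
    ("q3", "1"): {"q4"},
    ("q4", "0"): {"q4"}, ("q4", "1"): {"q4"},
}

def contains_0101(w):
    """Simulate the NFA by tracking every state it could be in."""
    current = {"q0"}
    for ch in w:
        nxt = set()
        for q in current:
            nxt |= NFA_0101.get((q, ch), set())
        current = nxt
    return "q4" in current
```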
Construct a DFA for the language L = {a^n b | n > 0}.
(Transition diagram: states s0, s1, s2 and s3; edge labels a, b and a, b.)

Construct an NFA equivalent to the regular expression (0+1)*01.
Transition diagram: S0 loops on 1, 0; S0 goes to S1 on 0; S1 goes to S2 on 1. S2 is the final state.
Represent a language over ∑ = {1} having (i) strings of even length (ii) strings of odd length.
(i) Even length of string: R = (11)*
(ii) Odd length of string: R = 1(11)*
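These two expressions can be checked with Python's re module, whose syntax for grouping and closure happens to coincide with the notation used here. The helper names are hypothetical.

```python
import re

EVEN = re.compile(r"(11)*")    # R = (11)*
ODD = re.compile(r"1(11)*")    # R = 1(11)*

def is_even_length(w):
    """True when the whole string of 1's matches (11)*."""
    return EVEN.fullmatch(w) is not None

def is_odd_length(w):
    """True when the whole string of 1's matches 1(11)*."""
    return ODD.fullmatch(w) is not None
```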
Give regular expressions for the following.

L1 = set of all strings of 0 and 1 ending in 00
Regular expression RE = (0+1)*00
L2 = set of all strings of 0 and 1 beginning with 0 and ending with 1.
Regular expression RE = 0(0+1)*1
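A quick way to test these answers is to translate them into Python's re syntax, where the union written + in these notes becomes |. The names in_l1 and in_l2 are made up for this sketch.

```python
import re

L1 = re.compile(r"(0|1)*00")     # (0+1)*00
L2 = re.compile(r"0(0|1)*1")     # 0(0+1)*1

def in_l1(w):
    """Strings of 0 and 1 ending in 00."""
    return L1.fullmatch(w) is not None

def in_l2(w):
    """Strings of 0 and 1 beginning with 0 and ending with 1."""
    return L2.fullmatch(w) is not None
```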
Give applications of Finite Automata.

• In compiler construction.
• In switching theory and design of digital circuits.
• To verify the correctness of a program.
• Design and analysis of complex software and hardware systems.
• To design finite state machines such as Moore and Mealy machines.
List the operators used in regular expressions and their precedence.
Operator            Precedence
Kleene Closure      1
Positive Closure    2
Concatenation       3
Union               4

Write an R.E. to denote a language L which accepts all the strings which begin or end with
either 00 or 11.
The R.E. consists of two parts:
L1 = (00+11) (any number of 0's and 1's) = (00+11)(0+1)*
L2 = (any number of 0's and 1's) (00+11) = (0+1)*(00+11)
Hence R.E. R = L1 + L2 = [(00+11)(0+1)*] + [(0+1)*(00+11)]
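The combined expression can be checked mechanically. In this sketch the union + is written | as Python's re module requires, and the function name is hypothetical.

```python
import re

# [(00+11)(0+1)*] + [(0+1)*(00+11)]
R = re.compile(r"(00|11)(0|1)*|(0|1)*(00|11)")

def begins_or_ends_with_pair(w):
    """True when w begins or ends with 00 or 11."""
    return R.fullmatch(w) is not None
```

For example, "0010" is accepted because it begins with 00, and "0111" because it ends with 11, while "0101" is rejected.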
Construct an NFA for the regular expression a*b*, or construct a finite automaton for the
regular expression 0*1*.
Construct a DFA over {0, 1} that accepts the strings containing "000" as a substring.
(Transition diagram: states Q0, Q1, Q2, Q3.)
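A sketch of the "000" DFA using the four state names from the diagram is given below. The transitions themselves are reconstructed from the problem statement and are an assumption: Qi records that the last i symbols were 0, and Q3 is the final state.

```python
DFA_000 = {
    ("Q0", "0"): "Q1", ("Q0", "1"): "Q0",
    ("Q1", "0"): "Q2", ("Q1", "1"): "Q0",
    ("Q2", "0"): "Q3", ("Q2", "1"): "Q0",
    ("Q3", "0"): "Q3", ("Q3", "1"): "Q3",   # once 000 is seen, stay accepting
}

def contains_000(w):
    """Run the DFA on a string over {0, 1}."""
    state = "Q0"
    for ch in w:
        state = DFA_000[(state, ch)]
    return state == "Q3"
```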
Give the language of the regular expression a?(a/b)*.

A language L which accepts all the strings that begin with zero or one a, followed
by any number of a's and b's.
Generate an ε-NFA to represent a*b | c.
(Transition diagram: edges labelled ε, a, b and c.)
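An ε-NFA for a*b | c can be sketched and simulated with ε-closures. The state numbering below is an assumption and need not match the original diagram; the expression is read as (a*b) | c, since union has the lowest precedence.

```python
EPS = "ε"
# Hypothetical numbering: 0 is the start state, 9 the final state.
DELTA = {
    (0, EPS): {1, 7},      # branch to a*b or to c
    (1, EPS): {2, 4},      # enter the a-loop or skip it
    (2, "a"): {3},
    (3, EPS): {2, 4},      # repeat a or leave the loop
    (4, "b"): {5},
    (5, EPS): {9},
    (7, "c"): {8},
    (8, EPS): {9},
}
START, FINAL = 0, 9

def eps_closure(states):
    """All states reachable from the given states using ε-moves only."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def accepts(w):
    """Simulate the ε-NFA on w, a string over {a, b, c}."""
    current = eps_closure({START})
    for ch in w:
        moved = set()
        for q in current:
            moved |= DELTA.get((q, ch), set())
        current = eps_closure(moved)
    return FINAL in current
```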

Automata Theory & Compiler Design Mr. P.Krishnamoorthy Page 2

