Chapter 2: Regular Expressions

Topics
1) Regular expressions (RE)
2) FA to RE conversion and vice versa
3) How to prove whether a given language is regular or not
4) Closure properties of regular languages
RE's: Introduction
• Regular expressions are an algebraic way to describe languages.
• They describe exactly the regular languages.
• If E is a regular expression, then L(E) is the language it defines.
• In computer science and formal language theory, a regular expression is sometimes called a rational expression.
• In practice, a regular expression is a sequence of characters that defines a search pattern. Such a pattern is typically used by string-searching algorithms for "find" or "find and replace" operations on strings.
RE's: Introduction (cont.)
• Regular expressions are a compact way to represent regular languages; they cannot describe languages that are not regular.
• A regular expression denotes a language, and every such language is accepted by some finite automaton.
• Basis 1: If a is any symbol, then a is a RE, and L(a) = {a}.
• Note: {a} is the language containing one string, and that string is of length 1.
• Basis 2: ε is a RE, and L(ε) = {ε}.
• Basis 3: ∅ is a RE, and L(∅) = ∅.
RE's: Introduction (cont.)
The set of regular expressions over an alphabet ∑ is defined by the following rules:
(i) Every letter of ∑ is a regular expression, and the null string ε is itself a regular expression.
(ii) If r1 and r2 are regular expressions, then
(a) (r1) (b) r1r2
(c) r1+r2 (d) r1*
(e) r1⁺ (positive closure)
are also regular expressions.
(iii) Nothing else is a regular expression.
Regular Expressions vs. Finite Automata
• Regular expressions offer a declarative way to express the pattern of any string we want to accept
  • E.g., 01* + 10*
• Automata => more machine-like
  < input: string, output: [accept/reject] >
• Regular expressions => more program syntax-like
• Unix environments heavily use regular expressions
  • E.g., bash shell, grep, vi & other editors, sed
• Perl scripting – good for string processing
• Lexical analyzers such as Lex or Flex
Regular Expressions
[Figure: regular expressions (syntactical expressions) = finite automata (DFA, NFA, ε-NFA; automata/machines). Both define the regular languages, one of the formal language classes.]
Language Operators
• Union of two languages:
  • L U M = all strings that are either in L or M
  • Note: A union of two languages produces a third language
• Concatenation of two languages:
  • L . M = all strings that are of the form xy such that x ∈ L and y ∈ M
  • The dot operator is usually omitted
  • i.e., LM is the same as L.M
Kleene Closure (the * operator)
• Kleene closure of a given language L:
  • L^0 = {ε}
  • L^1 = {w | w ∈ L}
  • L^2 = {w1w2 | w1 ∈ L, w2 ∈ L (duplicates allowed)}
  • L^i = {w1w2…wi | all w's chosen are ∈ L (duplicates allowed)}
  • (Note: the choice of each wi is independent)
  • L* = U(i≥0) L^i (arbitrary number of concatenations)
• Here "i" refers to how many strings from the parent language L are concatenated to produce the strings in the language L^i.
Example:
• Let L = {1, 00}
  • L^0 = {ε}
  • L^1 = {1, 00}
  • L^2 = {11, 100, 001, 0000}
  • L^3 = {111, 1100, 1001, 10000, 000000, 00001, 00100, 0011}
  • L* = L^0 U L^1 U L^2 U …
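The powers of a language can be computed mechanically for small cases. The following is a minimal sketch, assuming languages are represented as Python sets of strings and ε as the empty string `""`; the helper names `power` and `star_up_to` are illustrative and not part of the slides.

```python
from itertools import product

def power(L, i):
    """Return L^i: every concatenation of i strings drawn (with repetition) from L."""
    # product(..., repeat=0) yields one empty tuple, so L^0 is {""} (that is, {ε}).
    return {"".join(ws) for ws in product(sorted(L), repeat=i)}

def star_up_to(L, n):
    """Finite approximation of L*: the union of L^0, L^1, ..., L^n."""
    result = set()
    for i in range(n + 1):
        result |= power(L, i)
    return result

L = {"1", "00"}
print(sorted(power(L, 2)))   # the four strings of L^2
print(len(power(L, 3)))      # number of distinct strings in L^3
```

Because L* is infinite whenever L contains a non-empty string, only a bounded union such as `star_up_to(L, n)` can be enumerated.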
Kleene Closure (special notes)
• L* is an infinite set iff |L| ≥ 1 and L ≠ {ε}. Why?
• If L = {ε}, then L* = {ε}. Why?
• If L = Φ, then L* = {ε}. Why?
• Σ* denotes the set of all words over an alphabet Σ
• Therefore, an abbreviated way of saying there is an arbitrary language L over an alphabet Σ is:
  • L ⊆ Σ*
Building Regular Expressions
(i) The constants ϵ (null string) and ɸ (empty set) are regular expressions, denoting the languages {ϵ} and ɸ, respectively. That is, L(ϵ) = {ϵ} and L(ɸ) = ɸ.
(ii) If a is any symbol, then a is a regular expression. This expression denotes the language {a}. That is, L(a) = {a}.
(iii) A variable, usually capitalized (such as L), represents any language.
Building Regular Expressions (cont.)
• Let E be a regular expression and let L(E) be the language represented by E
• Then:
  • (E) = E, i.e., L((E)) = L(E)
  • L(E + F) = L(E) U L(F)
  • L(E F) = L(E) L(F)
  • L(E*) = (L(E))*
Identity Rules for RE
Two regular expressions P and Q are equivalent (denoted P = Q) if and only if P represents the same set of strings as Q does.
To show the equivalence of two regular expressions, we use the following identities.
Let P, Q and R be regular expressions, ε the null string, and Φ the empty set. The identity rules are as follows −
• εR = Rε = R
• ε* = ε
• Φ* = ε
• ΦR = RΦ = Φ
• Φ + R = R
• R + R = R
• RR* = R*R = R⁺
• (R*)* = R*
• ε + RR* = R*
• (P+Q)R = PR + QR
• (P+Q)* = (P*Q*)* = (P* + Q*)*
• R*(ε+R) = (ε+R)R* = R*
• (R+ε)* = R*
• ε + R* = R*
• (PQ)*P = P(QP)*
• R*R + R = R*R
Example: how to use these regular expression properties and language operators?
• L = {w | w is a binary string which does not contain two consecutive 0s or two consecutive 1s anywhere}
• E.g., w = 01010101 is in L, while w = 10010 is not in L
• Goal: Build a regular expression for L
• Four cases for w, with a regular expression for each:
  • Case A: w starts with 0 and |w| is even: (01)*
  • Case B: w starts with 1 and |w| is even: (10)*
  • Case C: w starts with 0 and |w| is odd: 0(10)*
  • Case D: w starts with 1 and |w| is odd: 1(01)*
Since L is the union of all 4 cases:
Reg Exp for L = (01)* + (10)* + 0(10)* + 1(01)*
If we introduce ε, then the regular expression can be simplified to:
Reg Exp for L = (ε+1)(01)*(ε+0)
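The claim that the four-case expression and the simplified expression describe the same language can be spot-checked by brute force. This is a sketch under the assumption that Python's `re` syntax is an acceptable stand-in for the textbook notation (`|` for +, `1?` for (ε+1)); the function names are illustrative. Agreement up to a length bound is evidence, not a proof.

```python
import re
from itertools import product

# Textbook (01)* + (10)* + 0(10)* + 1(01)* written in Python's re syntax.
FOUR_CASES = re.compile(r"(01)*|(10)*|0(10)*|1(01)*")
# Textbook (ε+1)(01)*(ε+0): an optional leading 1 and an optional trailing 0.
SIMPLIFIED = re.compile(r"1?(01)*0?")

def alternates(w):
    """True when w has no two consecutive equal symbols (the definition of L)."""
    return all(a != b for a, b in zip(w, w[1:]))

def check(max_len=10):
    """Compare both expressions with the definition on every binary string up to max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            expected = alternates(w)
            if bool(FOUR_CASES.fullmatch(w)) != expected:
                return False
            if bool(SIMPLIFIED.fullmatch(w)) != expected:
                return False
    return True

print(check())
```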
Examples
Write the regular expression for each of the following:
• The language accepting all strings containing any number of a's and b's: RE = (a + b)*
• The language accepting all combinations of a's except the null string, over the set ∑ = {a}: RE = a⁺
• The language accepting all combinations of a's, over the set ∑ = {a}: RE = a*
• The language accepting all strings that start with a and end with b, over the set ∑ = {a, b}: RE = a(a+b)*b
Examples (cont.)
Write the regular expression for the finite language accepting all the strings of length exactly two over ∑ = {a, b}.
Solution: The language is
L = {aa, ab, ba, bb} // only 4 strings are possible for the given condition
A regular expression for this language is
R = aa + ab + ba + bb
Write the regular expression for the language accepting all the strings whose first symbol is "b" and whose last symbol is "a", over ∑ = {a, b}.
Solution: The language is
L = {ba, baa, baba, bbaa, baaaa, babbbba, …}
A regular expression for this language is R = b(a+b)*a
Precedence of Operators
• Highest to lowest:
  • * operator (star)
  • . (concatenation)
  • + operator
• Example:
  • 01* + 1 = (0 . ((1)*)) + 1
Equivalence between regular expressions and finite automata
Strategy:
• Convert a regular expression to an ε-NFA
• Convert a DFA to a regular expression

Finite Automata (FA) & Regular Expressions (Reg Ex)
• To show that they are interchangeable, consider the following theorems (proofs in the book):
  • Theorem 1: For every DFA A there exists a regular expression R such that L(R) = L(A)
  • Theorem 2: For every regular expression R there exists an ε-NFA E such that L(E) = L(R)
[Figure (Kleene's theorem): Reg Ex → ε-NFA by Theorem 2; ε-NFA → NFA → DFA; DFA → Reg Ex by Theorem 1.]
DFA to RE construction
• The two popular methods for converting a given DFA to its regular expression are:
  • Arden's theorem
  • State elimination method
DFA to RE construction
[Figure: DFA → Reg Ex (Theorem 1)]
Informally, trace all distinct paths (traversing cycles only once) from the start state to each of the final states and enumerate all the expressions along the way.
Example: a DFA with states q0, q1, q2, where q0 loops on 1 and goes to q1 on 0, q1 loops on 0 and goes to q2 on 1, and q2 loops on 0 and 1.
(1*) 0 (0*) 1 (0 + 1)*
= 1*00*1(0+1)*
Q) What is the language?
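The path-tracing result can be compared against a direct simulation of the automaton. The sketch below assumes the transition table reconstructed from the example (q0 start, q2 the only final state, which is an inference from the expression and not stated explicitly on the slide) and uses Python's `re` syntax for the regular expression.

```python
import re
from itertools import product

# Transition table reconstructed from the example; treating q2 as the only
# final state is an assumption.
DELTA = {
    ("q0", "1"): "q0", ("q0", "0"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}

def dfa_accepts(w, start="q0", finals=("q2",)):
    """Run the DFA on w and report whether it halts in a final state."""
    state = start
    for ch in w:
        state = DELTA[(state, ch)]
    return state in finals

# 1*00*1(0+1)* in Python's re syntax.
PATTERN = re.compile(r"1*00*1[01]*")

def agree(max_len=10):
    """True when the DFA and the expression agree on all binary strings up to max_len."""
    return all(
        dfa_accepts("".join(t)) == bool(PATTERN.fullmatch("".join(t)))
        for n in range(max_len + 1)
        for t in product("01", repeat=n)
    )

print(agree())
```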
Arden's Theorem
To find a regular expression for a finite automaton, we use Arden's theorem along with the properties of regular expressions.
Statement −
• Let P and Q be two regular expressions.
• If P does not contain the null string, then
  (I) R = Q + RP has a unique solution,
  (II) namely R = QP*
Conditions − To use Arden's theorem, the following conditions must be satisfied:
• The transition diagram must not have any ∈ transitions.
• There must be only a single initial state.
Arden's Theorem (cont.)
Proof −
R = Q + RP
= Q + (Q + RP)P [after substituting R = Q + RP]
= Q + QP + RPP
Substituting the value of R recursively again and again, we get the following equation −
R = Q + QP + QP^2 + QP^3 + …
R = Q(ϵ + P + P^2 + P^3 + …)
R = QP* [as P* represents (ϵ + P + P^2 + P^3 + …)]
Hence proved.
Assumptions for Applying Arden's Theorem
• The transition diagram must not have NULL transitions.
• It must have only one initial state.
Method
Step 1 − Create equations of the following form for all the states of a DFA having n states with initial state q1.
q1 = q1R11 + q2R21 + … + qnRn1 + ϵ
q2 = q1R12 + q2R22 + … + qnRn2
…
qn = q1R1n + q2R2n + … + qnRnn
Rij represents the set of labels of edges from qi to qj; if no such edge exists, then Rij = ɸ.
Step 2 − Solve these equations to get the equation for the final state in terms of Rij.
Example 1
Construct a regular expression corresponding to the automaton given below −
Solution −
Here the initial state and the final state are both q1.
The equations for the three states q1, q2, and q3 are as follows −
q1 = q1a + q3a + ε (the ε is because q1 is the initial state)
q2 = q1b + q2b + q3b
q3 = q2a
Now, we will solve these three equations −
q2 = q1b + q2b + q3b
= q1b + q2b + (q2a)b (substituting the value of q3)
= q1b + q2(b + ab)
= q1b(b + ab)* (applying Arden's theorem)
q1 = q1a + q3a + ε
= q1a + q2aa + ε (substituting the value of q3)
= q1a + q1b(b + ab)*aa + ε (substituting the value of q2)
= q1(a + b(b + ab)*aa) + ε
= ε(a + b(b + ab)*aa)* (applying Arden's theorem)
= (a + b(b + ab)*aa)*
Hence, the regular expression is (a + b(b + ab)*aa)*.
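The result of Example 1 can be sanity-checked by rebuilding the automaton from its state equations and comparing it with the derived expression. In this sketch the transition table is read off the equations (a term qi x in the equation for qj means an edge from qi to qj labeled x); the function names are illustrative, and Python's `re` syntax stands in for the textbook notation.

```python
import re
from itertools import product

# Edges read off the equations q1 = q1a + q3a + ε, q2 = q1b + q2b + q3b, q3 = q2a.
DELTA = {
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q3", ("q2", "b"): "q2",
    ("q3", "a"): "q1", ("q3", "b"): "q2",
}

def dfa_accepts(w, start="q1", finals=("q1",)):
    """Run the automaton of Example 1; q1 is both initial and final."""
    state = start
    for ch in w:
        state = DELTA[(state, ch)]
    return state in finals

# (a + b(b + ab)*aa)* in Python's re syntax, where | plays the role of +.
ARDEN_RESULT = re.compile(r"(a|b(b|ab)*aa)*")

def agree(max_len=8):
    """True when automaton and expression agree on all strings over {a, b} up to max_len."""
    return all(
        dfa_accepts("".join(t)) == bool(ARDEN_RESULT.fullmatch("".join(t)))
        for n in range(max_len + 1)
        for t in product("ab", repeat=n)
    )

print(agree())
```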
Example 2
Find a regular expression for the following DFA using Arden's theorem-
Solution-
Step-01:
Form an equation for each state-
• A = ∈ + B.1 ……(1)
• B = A.0 ……(2)
Step-02:
Bring the final state into the form R = Q + RP.
Using (1) in (2), we get-
B = (∈ + B.1).0
B = ∈.0 + B.1.0
B = 0 + B.(1.0) ……(3)
Using Arden's theorem in (3), we get-
B = 0.(1.0)*
Thus, the regular expression for the given DFA = 0(10)*
Example 3
Find a regular expression for the following DFA using Arden's theorem-
Solution-
Step-01:
Form an equation for each state-
• q1 = ∈ ……(1)
• q2 = q1.a ……(2)
• q3 = q1.b + q2.a + q3.a ……(3)
Step-02:
Bring the final state into the form R = Q + RP.
Using (1) in (2), we get-
q2 = ∈.a
q2 = a ……(4)
Using (1) and (4) in (3), we get-
q3 = q1.b + q2.a + q3.a
q3 = ∈.b + a.a + q3.a
q3 = (b + a.a) + q3.a ……(5)
Using Arden's theorem in (5), we get-
q3 = (b + a.a)a*
Thus, the regular expression for the given DFA = (b + aa)a*
Exercise
Construct the regular expression for the following FA
State Elimination Method-
• This method involves the following steps in finding the regular expression for any given DFA-
Step-01 (Thumb Rule):
The initial state of the DFA must not have any incoming edge.
• If there exists any incoming edge to the initial state, then create a new initial state having no incoming edge to it.
Example-
State Elimination Method (cont.)
Step-02 (Thumb Rule):
There must exist only one final state in the DFA.
• If there exist multiple final states in the DFA, then convert all the final states into non-final states and create a new single final state.
Example-
Step-03 (Thumb Rule):
The final state of the DFA must not have any outgoing edge.
• If there exists any outgoing edge from the final state, then create a new final state having no outgoing edge from it.
Example-
Step-04:
• Eliminate all the intermediate states one by one.
• These states may be eliminated in any order.
In the end,
• Only an initial state going to the final state will be left.
• The cost of this transition is the required regular expression.
NOTE: The state elimination method can be applied to any finite automaton (NFA, ∈-NFA, DFA, etc.).
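The four steps can be sketched as a short routine. This is an illustrative implementation under stated assumptions, not the slides' own algorithm text: edges are given as a dictionary from (source, target) to a label already written in Python `re` syntax (parallel edges must be combined by the caller), the fresh states are named `qi` and `qf` as in the examples, and a fresh initial and final state are always added, which satisfies Steps 01 to 03 even when they are not strictly required. The function name `eliminate_states` is hypothetical.

```python
import re
from itertools import product

def eliminate_states(edges, states, start, finals):
    """Return a Python-regex string for the automaton, or None if nothing is accepted."""
    R = dict(edges)
    qi, qf = "qi", "qf"              # fresh initial and final states
    R[(qi, start)] = ""              # empty pattern plays the role of ε
    for f in finals:
        R[(f, qf)] = ""

    def union(a, b):
        if a is None:
            return b
        if b is None:
            return a
        return f"(?:{a}|{b})"

    for k in states:                 # eliminate intermediate states one by one
        loop = R.get((k, k))
        star = f"(?:{loop})*" if loop is not None else ""
        incoming = [(p, r) for (p, d), r in R.items() if d == k and p != k]
        outgoing = [(q, r) for (s, q), r in R.items() if s == k and q != k]
        for p, r_in in incoming:
            for q, r_out in outgoing:
                # Cost of the bypass path p -> k -> q, including loops on k.
                via_k = f"(?:{r_in}){star}(?:{r_out})"
                R[(p, q)] = union(R.get((p, q)), via_k)
        R = {e: r for e, r in R.items() if k not in e}
    return R.get((qi, qf))

# Automaton with A --0--> B and B --1--> A, start A, final B.
pattern = eliminate_states({("A", "B"): "0", ("B", "A"): "1"}, ["A", "B"], "A", {"B"})
print(all(
    bool(re.fullmatch(pattern, "".join(t))) == bool(re.fullmatch(r"0(10)*", "".join(t)))
    for n in range(8) for t in product("01", repeat=n)
))
```

The generated pattern is heavily parenthesized and is not simplified; it is only meant to be equivalent to the hand-derived expression.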
Example 1
• Find a regular expression for the following FA-
Solution-
Step-01:
• Initial state A has an incoming edge.
• So, we create a new initial state qi.
The resulting FA is-
Step-02:
• Final state B has an outgoing edge.
• So, we create a new final state qf.
The resulting FA is-
Example 1 (cont.)
Step-03:
Now, we start eliminating the intermediate states.
First, let us eliminate state A.
• There is a path going from state qi to state B via state A.
• So, after eliminating state A, we put a direct path from state qi to state B having cost ∈.0 = 0
• There is a loop on state B using state A.
• So, after eliminating state A, we put a direct loop on state B having cost 1.0 = 10.
Eliminating state A, we get-
Example 1 (cont.)
Step-04:
Now, let us eliminate state B.
• There is a path going from state qi to state qf via state B.
• So, after eliminating state B, we put a direct path from state qi to state qf having cost 0.(10)*.∈ = 0(10)*
Eliminating state B, we get-
From here, Regular Expression = 0(10)*
Example 2
DFA-
Solution-
Step-01:
• There exist multiple final states.
• So, we convert them into a single final state.
The resulting FA is-
Example 2 (cont.)
Step-02:
First, let us eliminate state q4.
• There is a path going from state q2 to state qf via state q4.
• So, after eliminating state q4, we put a direct path from state q2 to state qf having cost b.∈ = b.
Step-03:
Now, let us eliminate state q3.
• There is a path going from state q2 to state qf via state q3.
• So, after eliminating state q3, we put a direct path from state q2 to state qf having cost c.∈ = c.
Step-04:
Now, let us eliminate state q5.
• So, after eliminating state q5, we put a direct path from state q2 to state qf having cost d.∈ = d.
Step-05:
Finally, let us eliminate state q2.
• So, after eliminating state q2, we put a direct path from state q1 to state qf having cost a.(b+c+d).
From here, Regular Expression = a(b+c+d)
Example 3
DFA-
Solution-
Step-01:
Initial state q1 has an incoming edge.
• So, we create a new initial state qi.
The resulting DFA is-
Step-02:
Final state q2 has an outgoing edge.
• So, we create a new final state qf.
The resulting DFA is-
Example 3 (cont.)
Step-03:
First, let us eliminate state q1.
• There is a path going from state qi to state q2 via state q1.
• So, after eliminating state q1, we put a direct path from state qi to state q2 having cost ∈.c*.a = c*a
• There is a loop on state q2 using state q1.
• So, after eliminating state q1, we put a direct loop on state q2 having cost b.c*.a = bc*a
Eliminating state q1, we get-
Step-04:
Now, let us eliminate state q2.
• There is a path going from state qi to state qf via state q2.
• So, after eliminating state q2, we put a direct path from state qi to state qf having cost c*a(d+bc*a)*∈ = c*a(d+bc*a)*
Eliminating state q2, we get-
From here, Regular Expression = c*a(d+bc*a)*
Exercises
Find regular expression for the following DFA-
RE to ε-NFA construction (Thompson Construction)
[Figure: Reg Ex → ε-NFA (Theorem 2)]
Example: (0+1)*01(0+1)*
[Figure: ε-NFA assembled from the three parts (0+1)*, 01, and (0+1)*, joined by ε-transitions]
Thompson Construction Method
The algorithm works recursively by splitting an expression into its constituent subexpressions, from which the NFA is constructed using a set of rules.
The rules are as follows:
• If the operand is epsilon, then our FA has two states, q (the start state) and F (the final, accepting state), and an epsilon transition from q to F.
• If the operand is a character a, then our FA has two states, q (the start state) and F (the final, accepting state), and a transition from q to F with label a.
Thompson Construction Method (cont.)
1. The union expression s+t, for some smaller expressions s and t: a new start state q goes via ε to the initial state of either N(s) or N(t), the automata for s and t. Their final states become intermediate states of the whole NFA and merge via two ε-transitions into the final state of the NFA.
2. The concatenation expression st, for some smaller expressions s and t: the initial state of N(s) is the initial state of the whole NFA. The final state of N(s) becomes the initial state of N(t). The final state of N(t) is the final state of the whole NFA.
3. The Kleene star expression s*, for some smaller expression s: an ε-transition connects the initial and final state of the NFA, with the sub-NFA N(s) in between. Another ε-transition from the inner final state to the inner initial state of N(s) allows for repetition of expression s according to the star operator.
Construction of an FA from an RE (General and simplified)
We can use Thompson's construction to obtain a finite automaton from a regular expression. We break the regular expression into its smallest subexpressions, convert these to an NFA, and finally convert the NFA to a DFA.
Case 1 − For a regular expression 'a', we can construct the following FA −
[Figure: Start → q1 −a→ qf]
Case 2 − For a regular expression 'ab', we can construct the following FA −
[Figure: Start → q1 −a→ q2 −b→ qf]
Case 3 − For a regular expression (a+b), we can construct the following FA −
[Figure: Start → q1 → qf, with one edge labeled a and one edge labeled b]
Case 4 − For a regular expression (a+b)*, we can construct the following FA −
[Figure: Start → qf, with a self-loop on qf labeled a,b]
Example:-
Convert the following RE into its equivalent DFA − 1 (0 + 1)* 0
[Figure: start → q0 −1→ q1 −ϵ→ q2 −ϵ→ q3 −0→ qf, with a loop labeled 0,1 between the two ϵ-transitions]
Example 1: Find the automaton for the regular expression a.(a+b)*.b.b
Solution:
Step 1: The basic regular expressions involved are a and b, so we start with the automaton for a and the automaton for b. Since brackets are evaluated first, we construct the automaton for (a+b).
[Figure: ε-NFA for (a+b)]
Step 2: Since closure is applied next, we construct the automaton for (a+b)* using the automaton for (a+b).
[Figure: ε-NFA for (a+b)*]
Step 3: Next we construct the automaton for a.(a+b)*.
[Figure: ε-NFA for a.(a+b)*]
Step 4: Next we construct the automaton for a.(a+b)*.b.
[Figure: ε-NFA for a.(a+b)*.b]
Step 5: Finally, we construct the automaton for a.(a+b)*.b.b
[Figure: ε-NFA for a.(a+b)*.b.b]
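The rules of Thompson's construction can be sketched in code. The sketch below makes several assumptions that are not in the slides: expressions are given as nested tuples such as `("cat", s, t)` instead of being parsed from text, states are integers, `None` labels an ε-transition, and concatenation links the final state of N(s) to the initial state of N(t) with an ε-transition (a common variant of merging the two states).

```python
import re
from itertools import count, product

def thompson(ast):
    """Build an ε-NFA (transitions, start, final) from a tuple-based expression tree."""
    trans = {}                       # state -> list of (label, target); None means ε
    fresh = count()

    def new():
        s = next(fresh)
        trans[s] = []
        return s

    def build(node):
        kind = node[0]
        if kind in ("eps", "sym"):   # two states joined by one ε- or symbol-edge
            q, f = new(), new()
            trans[q].append((None if kind == "eps" else node[1], f))
            return q, f
        if kind == "alt":            # union: fork by ε, then merge by ε
            (s1, f1), (s2, f2) = build(node[1]), build(node[2])
            q, f = new(), new()
            trans[q] += [(None, s1), (None, s2)]
            trans[f1].append((None, f))
            trans[f2].append((None, f))
            return q, f
        if kind == "cat":            # concatenation: ε-link instead of merging states
            (s1, f1), (s2, f2) = build(node[1]), build(node[2])
            trans[f1].append((None, s2))
            return s1, f2
        if kind == "star":           # star: skip edge plus repeat edge
            s1, f1 = build(node[1])
            q, f = new(), new()
            trans[q] += [(None, s1), (None, f)]
            trans[f1] += [(None, s1), (None, f)]
            return q, f
        raise ValueError(f"unknown node kind: {kind}")

    start, final = build(ast)
    return trans, start, final

def accepts(nfa, w):
    """Simulate the ε-NFA on w by tracking the ε-closure of the reachable states."""
    trans, start, final = nfa

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            s = stack.pop()
            for label, t in trans[s]:
                if label is None and t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    current = closure({start})
    for ch in w:
        current = closure({t for s in current for label, t in trans[s] if label == ch})
    return final in current

a, b = ("sym", "a"), ("sym", "b")
# a.(a+b)*.b.b as an expression tree
EXPR = ("cat", a, ("cat", ("star", ("alt", a, b)), ("cat", b, b)))
NFA = thompson(EXPR)
print(accepts(NFA, "aabb"), accepts(NFA, "ab"))
```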
Algebraic Laws of Regular Expressions
• Commutative:
  • E+F = F+E
• Associative:
  • (E+F)+G = E+(F+G)
  • (EF)G = E(FG)
• Identity:
  • E+Φ = E
  • εE = Eε = E
• Annihilator:
  • ΦE = EΦ = Φ
• Distributive:
  • E(F+G) = EF + EG
  • (F+G)E = FE+GE
• Idempotent: E + E = E
• Involving Kleene closures:
  • (E*)* = E*
  • Φ* = ε
  • ε* = ε
  • E⁺ = EE*
  • E? = ε + E
True or False?
Let R and S be two regular expressions. Then:
1. ((R*)*)* = R* ?
2. (R+S)* = R* + S* ?
3. (RS + R)* RS = (RR*S)* ?
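Claims of this kind can be explored by instantiating R and S with concrete expressions and comparing the two sides on all short strings. The sketch below assumes the instantiation R = a and S = b and uses Python's `re` syntax; the helper name `first_difference` is illustrative. A disagreement on one instantiation refutes a claimed identity, whereas agreement up to a length bound only suggests that it may hold.

```python
import re
from itertools import product

def first_difference(p1, p2, alphabet="ab", max_len=6):
    """Return a shortest string matched by exactly one pattern, or None if none is found."""
    r1, r2 = re.compile(p1), re.compile(p2)
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            w = "".join(t)
            if bool(r1.fullmatch(w)) != bool(r2.fullmatch(w)):
                return w
    return None

# Each claim instantiated with R = a and S = b.
print(repr(first_difference(r"((a*)*)*", r"a*")))
print(repr(first_difference(r"(a|b)*", r"a*|b*")))
print(repr(first_difference(r"(ab|a)*ab", r"(aa*b)*")))
```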
The Pumping Lemma for Regular Languages
What is it?
The pumping lemma states a property shared by all regular languages.
How is it used?
It gives a technique for showing that a given language is not regular.
Pumping Lemma for Regular Languages
Let L be a regular language.
• Then there exists some constant N such that for every string w ∈ L with |w| ≥ N, there exists a way to break w into three parts, w = xyz, such that:
1. y ≠ ε
2. |xy| ≤ N
3. For all k ≥ 0, all strings of the form xy^kz ∈ L
This property holds for all regular languages.
Definition: N is called the "Pumping Lemma Constant"
Method to prove that a language L is not regular
• First, assume that L is regular.
• So, the pumping lemma should hold for L, with some constant c.
• Use the pumping lemma to obtain a contradiction −
  • Select w ∈ L such that |w| ≥ c
  • Consider every split w = xyz with |y| ≥ 1 and |xy| ≤ c (we do not get to choose the split)
  • For each such split, select k such that the resulting string xy^kz is not in L.
Pumping Lemma: Proof
• L is regular => it should have a DFA.
• Set N := number of states in the DFA
• Any string w ∈ L with |w| ≥ N should have the form w = a1a2…am, where m ≥ N
• Let the states traversed while reading the first N symbols be p0, p1, …, pN
• ==> There are N+1 p-states, while there are only N DFA states
• ==> at least one state has to repeat, i.e., pi = pj where 0 ≤ i < j ≤ N (by the pigeonhole principle, PHP)
Pumping Lemma: Proof (cont.)
• => We should be able to break w = xyz as follows:
  • x = a1a2…ai; y = ai+1ai+2…aj; z = aj+1aj+2…am
  • x's path will be p0…pi
  • y's path will be pi pi+1…pj (but pi = pj, implying a loop)
  • z's path will be pj pj+1…pm
[Figure: path p0 −x→ pi (= pj) −z→ pm, with a loop at pi labeled y^k (for k loops)]
• Now consider another string wk = xy^kz, where k ≥ 0
  • Case k = 0: the DFA will reach the accept state pm
  • Case k > 0: the DFA will loop for y^k, and finally reach the accept state pm for z
• In either case, wk ∈ L
This proves part (3) of the lemma.
Pumping Lemma: Proof (cont.)
• For part (1):
  • Since i < j, y ≠ ε
• For part (2):
  • By PHP, the repetition of states has to occur within the first N symbols in w
  • ==> |xy| ≤ N
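The proof is constructive: running the DFA on w and stopping at the first repeated state yields the split w = xyz. The sketch below illustrates this under assumptions of its own: the DFA is a dictionary from (state, symbol) to state, and the example automaton (three states, accepting binary strings that contain 01) is chosen for illustration only.

```python
def pumping_split(delta, start, w):
    """Split w into (x, y, z) at the first state that repeats on the DFA's path."""
    seen = {start: 0}                # state -> number of symbols read at first visit
    state = start
    for pos, ch in enumerate(w, start=1):
        state = delta[(state, ch)]
        if state in seen:            # p_i = p_j with i < j
            i, j = seen[state], pos
            return w[:i], w[i:j], w[j:]
        seen[state] = pos
    return None                      # no repeat: w has fewer symbols than there are states

def accepts(delta, start, finals, w):
    state = start
    for ch in w:
        state = delta[(state, ch)]
    return state in finals

# Illustrative DFA with N = 3 states accepting binary strings that contain 01.
DELTA = {
    ("q0", "1"): "q0", ("q0", "0"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}

x, y, z = pumping_split(DELTA, "q0", "1101")
print((x, y, z))
print(all(accepts(DELTA, "q0", {"q2"}, x + y * k + z) for k in range(5)))
```

Because a repeat must occur within the first N symbols, the returned split always satisfies |xy| ≤ N and y ≠ ε when |w| ≥ N.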
Applications of Pumping Lemma
• The pumping lemma is applied to show that certain languages are not regular. It should never be used to show that a language is regular.
  • If L is regular, it satisfies the pumping lemma.
  • If L does not satisfy the pumping lemma, it is non-regular.
Using the Pumping Lemma
Note: We don't have any control over N, except that it is positive. We also don't have any control over how to split w = xyz, but the split must respect the P/L conditions (1) and (2).
What the adversary does:
1. Claims L is regular
2. Provides N
What we do:
3. Using N, we construct our template string w
4. Demonstrate to the adversary, either through pumping up or down on w, that some string wk ∉ L (this should happen regardless of how w = xyz is split)
Applications of Pumping Lemma (cont.)
• B = {0^n1^n : n ≥ 0} is not regular
Proof:
• Suppose B is regular
• Let P be the pumping length
• Choose s = 0^P1^P = 0…01…1 ∈ B
• Let s = xyz. By the pumping lemma, xy^iz ∈ B for any i ≥ 0. But:
  • y has 0s only ⇒ xyyz has more 0s than 1s ⇒ contradiction
  • y has 1s only ⇒ xyyz has more 1s than 0s ⇒ contradiction
  • y has both 0s and 1s ⇒ xyyz has a 1 before a 0, so xyyz ∉ B
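For a fixed pumping length the argument can be checked exhaustively. The sketch below differs slightly from the three-case proof: it uses the condition |xy| ≤ P to restrict the splits (so y always consists of 0s only) and then searches for a pumping count k that leaves B. The function names and the bound `max_k` are illustrative choices, and checking a few values of P illustrates the proof without replacing it.

```python
def in_B(w):
    """Membership test for B = {0^n 1^n : n >= 0}."""
    n = len(w) // 2
    return w == "0" * n + "1" * n

def splits(w, P):
    """Yield every (x, y, z) with w = xyz, |y| >= 1 and |xy| <= P."""
    for i in range(P):                      # i = |x|
        for j in range(i + 1, P + 1):       # j = |xy|
            if j <= len(w):
                yield w[:i], w[i:j], w[j:]

def every_split_fails(P, max_k=3):
    """True when each allowed split of 0^P 1^P has some k with x y^k z outside B."""
    w = "0" * P + "1" * P
    return all(
        any(not in_B(x + y * k + z) for k in range(max_k + 1))
        for x, y, z in splits(w, P)
    )

print(all(every_split_fails(P) for P in range(1, 8)))
```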
Applications of Pumping Lemma (cont.)
• C = {w | w has an equal number of 0s and 1s} is not regular
Proof (by the pumping lemma):
• Suppose C is regular, and let P be the pumping length
• Let s = 0^P1^P ∈ C. By the pumping lemma, s can be split into 3 pieces, s = xyz, with xy^iz ∈ C for any i ≥ 0
• By condition 2 in the lemma: |xy| ≤ P. Thus y must have only 0s.
• Then xyyz has more 0s than 1s, so xyyz ∉ C, a contradiction
Applications of Pumping Lemma (cont.)
• F = {ww | w ∈ {0,1}*} is non-regular
Proof:
• Suppose F is regular
• Let P be the pumping length given by the pumping lemma
• Let s = 0^P10^P1 ∈ F
• Split s into 3 pieces, s = xyz
• By condition 2 in the lemma: |xy| ≤ P. Thus y must have 0s only.
• ⇒ xyyz ∉ F, because the first block of 0s in xyyz is longer than the second block; a contradiction
Applications of Pumping Lemma (cont.)
• E = {0^i1^j : i > j} is non-regular
Proof:
• Assume E is regular
• Let P be the pumping length
• Let s = 0^(P+1)1^P ∈ E
• Split s into 3 pieces, s = xyz
• By the pumping lemma: xy^iz ∈ E for any i ≥ 0
• Since |xy| ≤ P and |y| > 0, y has 0s only. Taking i = 0 gives xz ∈ E.
• But xz has #(0) ≤ #(1), a contradiction
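Applications of Pumping Lemma (cont.)
• D = {1^(n^2) : n ≥ 0} is not regular
Proof:
• Assume D is regular
• Let P be the pumping length
• Let s = 1^(P^2) ∈ D
• Split s into 3 pieces, s = xyz ⇒ xy^iz ∈ D for all i ≥ 0
• Consider xy^iz ∈ D and xy^(i+1)z ∈ D ⇒ |xy^iz| and |xy^(i+1)z| are perfect squares for any i ≥ 0
• If m = n^2, the gap to the next perfect square is (n+1)^2 − n^2 = 2n + 1 = 2√m + 1
• Let m = |xy^iz|
• |y| ≤ |s| = P^2
• Let i = P^4. Then
  |y| = |xy^(i+1)z| − |xy^iz|
  ≤ P^2 = (P^4)^(1/2)
  < 2(P^4)^(1/2) + 1
  ≤ 2(|xy^iz|)^(1/2) + 1
  = 2√m + 1
• So |xy^(i+1)z| lies strictly between m and the next perfect square after m, and cannot itself be a perfect square: a contradiction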
Example of using the Pumping Lemma to prove that a language is not regular
Let Leq = {w | w is a binary string with an equal number of 1s and 0s}
• Your claim: Leq is not regular
• Proof:
  • By contradiction, let Leq be regular (adversary)
  • A P/L constant should exist; let N = that P/L constant (adversary)
  • Consider input w = 0^N1^N, your choice for the template string (you)
  • By the pumping lemma, we should be able to break w = xyz, such that:
    1) y ≠ ε
    2) |xy| ≤ N
    3) For all k ≥ 0, the string xy^kz is also in Leq
Proof (cont.)
Template string w = 0^N1^N = 00…011…1 (N 0s followed by N 1s)
• Because |xy| ≤ N, xy should contain only 0s (you)
  • (This, and because y ≠ ε, implies y = 0⁺)
• Therefore x can contain at most N−1 0s
• Also, all the N 1s must be inside z
• By (3), any string of the form xy^kz ∈ Leq for all k ≥ 0
• Case k = 0: xz has at most N−1 0s but has N 1s
  • Therefore, xy^0z ∉ Leq
  • This violates the P/L (a contradiction)
• Setting k = 0 is referred to as "pumping down"; setting k > 1 is referred to as "pumping up".
• Another way of proving this is to show that if the number of 0s is pumped up (e.g., k = 2), then the number of 0s will exceed the number of 1s.
Example 3: Pumping Lemma
Claim: L = {0^i | i is a perfect square} is not regular
• Proof:
  • By contradiction, let L be regular.
  • P/L should apply
  • Let N = P/L constant
  • Choose w = 0^(N^2)
  • By the pumping lemma, w = xyz satisfying all three rules
  • By rules (1) & (2), y has between 1 and N 0s
  • By rule (3), any string of the form xy^kz is also in L for all k ≥ 0
  • Case k = 0:
    • #zeros(xy^0z) = #zeros(xyz) − #zeros(y)
    • N^2 − N ≤ #zeros(xy^0z) ≤ N^2 − 1
    • (N−1)^2 < N^2 − N ≤ #zeros(xy^0z) ≤ N^2 − 1 < N^2
    • xy^0z ∉ L
  • But the above completes the proof ONLY IF N > 1.
  • (Proof continued on the next slide.)
Example 3: Pumping Lemma (proof cont.)
• If the adversary picks N = 1, then (N−1)^2 = N^2 − N = 0, and therefore #zeros(xy^0z) could end up being a perfect square!
• This means that pumping down (i.e., setting k = 0) does not give us the proof!
• So let's try pumping up next…
• Case k = 2:
  • #zeros(xy^2z) = #zeros(xyz) + #zeros(y)
  • N^2 + 1 ≤ #zeros(xy^2z) ≤ N^2 + N
  • N^2 < N^2 + 1 ≤ #zeros(xy^2z) ≤ N^2 + N < (N+1)^2
  • xy^2z ∉ L
• (Notice that the above holds for all possible values of N > 0. Therefore, this completes the proof.)
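The two inequalities can be checked numerically for small N. The sketch below is only an illustration of the arithmetic, with helper names chosen here: it confirms that pumping up never lands on a perfect square, while pumping down can when N = 1. It relies on `math.isqrt`, available in Python 3.8 and later.

```python
import math

def is_square(n):
    """True when n is a perfect square (n >= 0)."""
    r = math.isqrt(n)
    return r * r == n

def pumping_up_escapes(N):
    """For w = 0^(N^2) and every allowed |y| in 1..N, |xy^2z| = N^2 + |y| is not a square."""
    return all(not is_square(N * N + m) for m in range(1, N + 1))

def pumping_down_escapes(N):
    """Same check for k = 0, where |xy^0z| = N^2 - |y|."""
    return all(not is_square(N * N - m) for m in range(1, N + 1))

print(all(pumping_up_escapes(N) for N in range(1, 200)))
print(pumping_down_escapes(1), pumping_down_escapes(2))
```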
Closure properties for Regular Languages (RL)
(Note: this is different from Kleene closure.)
• Closure property:
  • If a set of regular languages are combined using an operator, then the resulting language is also regular
• Regular languages are closed under:
  • Union, intersection, complement, difference
  • Reversal
  • Kleene closure
  • Concatenation
  • Homomorphism
  • Inverse homomorphism
Now, let's prove all of this!
RLs are closed under union
• If L and M are two RLs, then:
  • they have corresponding regular expressions, R and S respectively
  • (L U M) can be represented using the regular expression R+S
  • Therefore, (L U M) is also regular
How can this be proved using FAs?
RLs are closed under complementation
• If L is an RL over ∑, then its complement is L̄ = ∑* − L
• To show that L̄ is also regular, make the following construction: starting from a DFA for L, convert every final state into a non-final state, and every non-final state into a final state.
[Figure: DFA for L with states q0, qi and final states qF1, qF2, …, qFk, next to the DFA for L̄ with final and non-final states swapped. The figure assumes q0 is a non-final state; if not, do the opposite.]
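The complement construction amounts to a single set difference. The sketch below assumes a DFA is represented as a 5-tuple of Python values (states, alphabet, transition dictionary, start state, final states) and that the transition function is total, meaning every state has a transition on every symbol; the example automaton for an even number of 0s is illustrative.

```python
from itertools import product

def accepts(dfa, w):
    """Run the DFA on w and report whether it ends in a final state."""
    states, alphabet, delta, start, finals = dfa
    state = start
    for ch in w:
        state = delta[(state, ch)]
    return state in finals

def complement(dfa):
    """Swap final and non-final states; requires a total transition function."""
    states, alphabet, delta, start, finals = dfa
    return states, alphabet, delta, start, states - finals

# Illustrative DFA: binary strings with an even number of 0s.
EVEN_ZEROS = (
    {"e", "o"},
    {"0", "1"},
    {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"},
    "e",
    {"e"},
)

ODD_ZEROS = complement(EVEN_ZEROS)
print(accepts(EVEN_ZEROS, "1001"), accepts(ODD_ZEROS, "1001"))
```

If the automaton is partial, or is an NFA, swapping final states does not in general give the complement; it would first have to be converted to a complete DFA.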
RLs are closed under intersection
• A quick, indirect way to prove:
  • By De Morgan's law: L ∩ M = complement of (L̄ U M̄)
  • Since we know RLs are closed under union and complementation, they are also closed under intersection
• A more direct way would be to construct a finite automaton for L ∩ M
DFA construction for L ∩ M
• AL = DFA for L = {QL, ∑, qL, FL, δL}
• AM = DFA for M = {QM, ∑, qM, FM, δM}
• Build AL∩M = {QL x QM, ∑, (qL, qM), FL x FM, δ} such that:
  • δ((p,q), a) = (δL(p,a), δM(q,a)), where p is in QL and q is in QM
• This construction ensures that a string w will be accepted if and only if w reaches an accepting state in both input DFAs.
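The product construction translates almost line by line into code. The sketch below assumes both DFAs share the same alphabet and are given as 5-tuples (states, alphabet, transition dictionary, start state, final states); the two example automata are illustrative and not taken from the slides.

```python
from itertools import product

def accepts(dfa, w):
    states, alphabet, delta, start, finals = dfa
    state = start
    for ch in w:
        state = delta[(state, ch)]
    return state in finals

def intersect(dfa_l, dfa_m):
    """Product DFA: states are pairs, and both components must end in a final state."""
    q_l, sigma, d_l, s_l, f_l = dfa_l
    q_m, _, d_m, s_m, f_m = dfa_m
    states = set(product(q_l, q_m))
    delta = {
        ((p, q), a): (d_l[(p, a)], d_m[(q, a)])
        for (p, q) in states
        for a in sigma
    }
    return states, sigma, delta, (s_l, s_m), set(product(f_l, f_m))

# Illustrative inputs: an even number of 0s, and strings ending in 1.
EVEN_ZEROS = ({"e", "o"}, {"0", "1"},
              {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"},
              "e", {"e"})
ENDS_WITH_1 = ({"n", "y"}, {"0", "1"},
               {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"},
               "n", {"y"})

BOTH = intersect(EVEN_ZEROS, ENDS_WITH_1)
print(accepts(BOTH, "0101"), accepts(BOTH, "011"))
```

Replacing the final-state set with the pairs in which at least one component is final would give a DFA for the union instead.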
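DFA construction for L ∩ M
[Figure: DFA for L (states q0, qi, qj, final states qF1, qF2, …) and DFA for M (states p0, pi, pj, final states pF1, pF2, …), each with a transition on a; the DFA for L ∩ M has paired states (q0,p0), (qi,pi), (qj,pj), …, (qF1,pF1), with the corresponding transition on a.]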
RLs are closed under set difference
• We observe:
  • L − M = L ∩ M̄
  • RLs are closed under intersection and under complementation
• Therefore, L − M is also regular
RLs are closed under reversal
Reversal of a string w is denoted by w^R
• E.g., w = 00111, w^R = 11100
Reversal of a language:
• L^R = the language generated by reversing all strings in L
Theorem: If L is regular then L^R is also regular
ε-NFA Construction for L^R
Starting from a DFA for L, build a new ε-NFA for L^R:
• Reverse all transitions
• Convert the old set of final states into non-final states
• Add a new start state q'0 with an ε-transition to each of the old final states qF1, qF2, …, qFk
• Make the old start state q0 the only new final state
What to do if q0 was one of the final states in the input DFA?
If L is regular, L^R is regular (proof using regular expressions)
• Let E be a regular expression for L
• Given E, how to build E^R?
• Basis: If E = ε, Ø, or a, then E^R = E
• Induction: Every part of E (refer to the part as "F") can be in only one of the three following forms:
1. F = F1+F2
   F^R = F1^R + F2^R
2. F = F1F2
   F^R = F2^R F1^R
3. F = (F1)*
   F^R = (F1^R)*
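The inductive definition of E^R maps directly onto a recursive function over expression trees. The sketch below assumes expressions are nested tuples (`("sym", c)`, `("eps",)`, `("empty",)`, `("alt", F1, F2)`, `("cat", F1, F2)`, `("star", F1)`), which is a representation chosen here, and converts them to Python `re` patterns only in order to test the result.

```python
import re
from itertools import product

def reverse(e):
    """Return the expression tree for E^R, following the basis and the three inductive cases."""
    kind = e[0]
    if kind in ("eps", "empty", "sym"):      # basis: E^R = E
        return e
    if kind == "alt":                        # (F1 + F2)^R = F1^R + F2^R
        return ("alt", reverse(e[1]), reverse(e[2]))
    if kind == "cat":                        # (F1 F2)^R = F2^R F1^R
        return ("cat", reverse(e[2]), reverse(e[1]))
    if kind == "star":                       # (F1*)^R = (F1^R)*
        return ("star", reverse(e[1]))
    raise ValueError(f"unknown node kind: {kind}")

def to_pattern(e):
    """Translate an expression tree into Python's re syntax."""
    kind = e[0]
    if kind == "eps":
        return ""
    if kind == "empty":
        return "(?!)"                        # a pattern that never matches
    if kind == "sym":
        return re.escape(e[1])
    if kind == "alt":
        return f"(?:{to_pattern(e[1])}|{to_pattern(e[2])})"
    if kind == "cat":
        return f"(?:{to_pattern(e[1])}{to_pattern(e[2])})"
    if kind == "star":
        return f"(?:{to_pattern(e[1])})*"
    raise ValueError(f"unknown node kind: {kind}")

zero, one = ("sym", "0"), ("sym", "1")
# E = 0 1* (0 + 11)
E = ("cat", zero, ("cat", ("star", one), ("alt", zero, ("cat", one, one))))
print(to_pattern(reverse(E)))
```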
AUTOMATA WITH OUTPUT
Two kinds of finite automata with output are considered here:
(i) Moore Machine
(ii) Mealy Machine

Moore Machine
A Moore machine is an FSM whose outputs depend only on the present state.
A Moore machine can be described by a 6-tuple (Q, ∑, ∆, δ, λ, q0) where −
• Q is a finite set of states.
• ∑ is a finite set of symbols called the input alphabet.
• ∆ is a finite set of symbols called the output alphabet.
• δ is the transition function, δ: Q × ∑ → Q
• λ is the output function, λ: Q → ∆
• q0 is the initial state from where any input is processed (q0 ∈ Q).
Representation of Moore Machine (Transition Table)
Present State | Next State (a=0) | Next State (a=1) | Output
q0            | q3               | q1               | 0
q1            | q1               | q2               | 1
q2            | q2               | q3               | 0
q3            | q3               | q0               | 0
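The transition table can be run directly as a small simulator. The sketch below encodes the table as a Python dictionary and assumes the usual convention that a Moore machine emits the output of its initial state before reading any input, so an input of length n produces n + 1 output symbols; the function name `run_moore` is illustrative.

```python
# state -> (next state for each input symbol, output attached to the state)
MOORE = {
    "q0": ({"0": "q3", "1": "q1"}, "0"),
    "q1": ({"0": "q1", "1": "q2"}, "1"),
    "q2": ({"0": "q2", "1": "q3"}, "0"),
    "q3": ({"0": "q3", "1": "q0"}, "0"),
}

def run_moore(inp, start="q0"):
    """Return the output string produced while reading inp, one symbol per visited state."""
    state = start
    out = [MOORE[state][1]]          # output of the initial state
    for ch in inp:
        state = MOORE[state][0][ch]
        out.append(MOORE[state][1])
    return "".join(out)

print(run_moore("110"))
```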
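Representation of Moore Machine (Transition Diagram)
[Figure: transition diagram of the Moore machine with states q0 (start), q1, q2, q3, edges labeled with inputs 0 and 1, and each state annotated with its output]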
Mealy Machine
A Mealy machine is an FSM whose output depends on the present state as well as the present input.
It can be described by a 6-tuple (Q, ∑, ∆, δ, λ, q0) where −
• Q is a finite set of states.
• ∑ is a finite set of symbols called the input alphabet.
• ∆ is a finite set of symbols called the output alphabet.
• δ is the transition function, δ: Q × ∑ → Q
• λ is the output function, λ: Q × ∑ → ∆
• q0 is the initial state from where any input is processed (q0 ∈ Q).
Representation of Mealy Machine (Transition Table)
Present State | Input a=0: State | Input a=0: Output | Input a=1: State | Input a=1: Output
q1            | q3               | 0                 | q2               | 0
q2            | q1               | 1                 | q4               | 0
q3            | q2               | 1                 | q1               | 1
q4            | q4               | 1                 | q3               | 0
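The Mealy table can be simulated in the same way, with the output attached to each transition instead of each state. The sketch below encodes the table as a dictionary keyed by (state, input) and assumes q1 is the initial state, as the Start marker in the transition diagram suggests; the function name `run_mealy` is illustrative.

```python
# (state, input) -> (next state, output emitted on that transition)
MEALY = {
    ("q1", "0"): ("q3", "0"), ("q1", "1"): ("q2", "0"),
    ("q2", "0"): ("q1", "1"), ("q2", "1"): ("q4", "0"),
    ("q3", "0"): ("q2", "1"), ("q3", "1"): ("q1", "1"),
    ("q4", "0"): ("q4", "1"), ("q4", "1"): ("q3", "0"),
}

def run_mealy(inp, start="q1"):
    """Return the output string: exactly one output symbol per input symbol."""
    state = start
    out = []
    for ch in inp:
        state, symbol = MEALY[(state, ch)]
        out.append(symbol)
    return "".join(out)

print(run_mealy("0011"))
```

Unlike the Moore machine, an input of length n yields exactly n output symbols, and the empty input yields no output.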
Representation of Mealy Machine (Transition Diagram)
[Figure: transition diagram of the Mealy machine with states q1 (start), q2, q3, q4 and edges labeled input/output, e.g. 0/1, 1/0]
Mealy Machine vs. Moore Machine
Mealy Machine | Moore Machine
Output depends on both the present state and the present input. | Output depends only on the present state.
Generally, it has fewer states than a Moore machine. | Generally, it has more states than a Mealy machine.
An input change can cause an output change as soon as the logic is done. | Output changes at the clock edges.
Mealy machines react faster to inputs. | In Moore machines, more logic is needed to decode the outputs, which adds circuit delay.
Summary
• Regular expressions
• Equivalence to finite automata
• DFA to regular expression conversion
• Regular expression to ε-NFA conversion
• Algebraic laws of regular expressions
• How to prove languages are not regular?
• Pumping lemma & its applications
• Closure properties of regular languages.
