
TOC-MOD1-1


Dr Jasmine Selvakumari Jeya I

Senior Associate Professor


School of Computing Science and Engineering
VIT Bhopal University
jasmineselvakumarijeya@vitbhopal.ac.in
CSE2004 - THEORY OF COMPUTATION AND
COMPILER DESIGN
UNIT I (16 Hrs) Credits: 04
Basic concepts – Theorem proving – Finite automata: NFA, DFA, ε-NFA,
Regular expressions - Equivalence between FA and RE – Minimization –
Decision properties – Pumping lemma for Regular Languages.
Specification of tokens – FA and RE to represent token formats – LEX.
Problems: Design of FA – Inter-conversion between RE and FA – Proving
languages to be not regular, Design approach of Lexical Analyzer for a
given token – LEX program to recognize tokens.
UNIT II (14 Hrs)
Context Free Grammar – Derivations – Parse trees – Ambiguity –
Chomsky Normal Form – Greibach Normal Form – Pushdown Automata
– DPDA & NPDA – Decision properties – Pumping lemma for CFL.
Problems: Design of CFG – Conversion from CFG to CNF, GNF – Design
of PDA – Inter-conversion between PDA & CFG – Proving languages to
be not context-free.
UNIT III (12 Hrs)
Parsing – Top-down Parsing – Predictive Parsing - Bottom up parsing –
SLR, CLR and LALR Parsing – YACC.
Problems: Design of Top-down parser and bottom-up parser to
illustrate syntax validation of an input string.
UNIT IV (10 Hrs)
Turing machines – TM as a computation model – TM as a recognizer – TM
with multiple tapes – Other models of TM – Linear Bounded Automata –
Chomsky Hierarchy of languages – Undecidability – Recursive and non –
recursive languages – Examples.
Problems: Design of TM – Design of LBA – Identification of Undecidability.
UNIT V (8 Hrs)
Three Address Codes – Code optimization techniques – Code generation.
Problems: Conversion from parse tree to TAC – optimization techniques –
Code generation.
Total Lectures: 60 Hrs
Course Outcomes (CO)
Students will be able to
CO1:Design finite automaton for different regular expressions and languages
and its applications in lexical analysis [KL3].
CO2:Build a simplified context-free grammar for a context-free language to
be recognized by a pushdown automaton [KL3].
CO3:Demonstrate the syntax analysis process using a top-down and bottom-
up parser [KL3].
CO4:Develop a Computational model using Turing machine to test
decidability of a problem [KL3].
CO5:Develop the intermediate code representations and optimize them for
code generation [KL3].
Text Books
1. John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman, “Introduction to
Automata Theory, Languages and Computation”, 3rd Edition, Pearson
Education, 2014.
2. Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman, “Compilers:
Principles, Techniques, and Tools”, 2nd Edition, Pearson Education,
2015.
Reference Book
1. Michael Sipser, “Introduction to the Theory of Computation”, 2nd
Edition, Wadsworth Publishing Co Inc, 3rd Edition, 2012.
Assessment and Evaluation Mark Components

Type of Course (LT)                 Marks/Weightage
Mid-Term Examination (50 Marks)     30
TEE (100 Marks)                     30
Attendance                          5
Class Assessment                    25
Assignments                         10
Total Marks                         100
Introduction to TOC
✓TOC is a branch of computer science that deals with how efficiently
problems can be solved on a model of computation using an algorithm.
✓Theory of Computation is divided into three branches:
1. Automata Theory & Languages
2. Computability Theory
3. Complexity Theory
Automata Theory and Languages
✓It deals with the definitions and properties of various
mathematical models of computation.
✓Examples: Finite Automata, Context Free Grammar, Turing Machine
Computability Theory
✓It develops formal mathematical models of computation that reflect
real-world computers.
Complexity Theory
✓It groups computable problems based on their hardness.

Purpose of TOC
✓It deals with what can and can not be computed by the model.
What is Computation?
✓Computation is a sequence of steps that can be performed by a
computer.
Input x → Function f → Output y = f(x)
✓Computation is executing an algorithm, i.e., it involves taking some
inputs and performing the required operations on them to produce an output.
• TOC suggests various abstract models of computation, represented
mathematically.
• The computers that perform these computations are not actual computers;
they are abstract machines.
• Our focus is on abstract machines that can be defined mathematically.
Examples of Abstract Machines
1. Turing machines (as powerful as real computers) – Universal Model
2. Finite Automata (Simple) – Restricted model
Applications of TOC
1. Compiler Design
2. Robotics
3. Artificial Intelligence
4. Knowledge Engineering
Basic Mathematical Notation
1. Symbol
Symbol is a character.
Eg: a,b,c,….z
0,1,2,….,9
+,-,*,%,…..(Special Characters)
2. Alphabet
✓An alphabet is a finite, non-empty set of symbols.
✓It is denoted by sigma “∑”.
Eg: ∑= { 0,1} – the binary alphabet
∑= { a,b,c,….z} – set of all lower case letters
∑= { +,-,/,*,%,….} – set of all special characters
3. String or Word
• A string is a finite sequence of symbols chosen from some alphabet.
Eg: 010001000 is a string or word from binary alphabet ∑= { 0,1}
aabbaabac is a string or word from alphabet ∑= { a,b,c}
Symbol: b, 0, 1
Alphabet: { 0,1}, { a,b,…}
String or word: 00010111 over { 0,1}, baacbbd over { a,b,c,d}
Operations of Strings
1. Empty or Null String
• The empty string is the string with zero occurrences of symbols
(no symbols).
• It is denoted by epsilon “ϵ”.
• ϵ is a string with no symbols; it is not the same as the empty set { }.

2. Length of a String
• It is the number of symbols in the string or word.
• It is denoted by |w| (read “mod w”).
Eg: w=11010011 taken from binary alphabet represented by ∑= { 0,1}
Length of the string |w|= 8
• If w=abccd then |w|= 5
• |ϵ|= 0
3. Concatenation of String
• Concatenation means joining two or more strings.
• Let w = a₁a₂a₃…aₙ
v = b₁b₂b₃…bₘ
then wv = a₁a₂a₃…aₙb₁b₂b₃…bₘ
Example:
(i) x=010 y=1
Concatenation of xy=0101 yx=1010
(ii) x=AL y=GOL
Concatenation of xy = ALGOL
(iii)Empty string is the identity element for concatenation operator
ie. wϵ = ϵw =w
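The string operations above can be tried out directly, since Python strings behave like words over an alphabet. This is only a small sketch: the names `EPSILON` and `concat` are chosen here for illustration and do not come from the notes.

```python
# Words over an alphabet modelled as ordinary Python strings.
EPSILON = ""  # the empty string ϵ (illustrative name)


def concat(w: str, v: str) -> str:
    """Return the concatenation wv."""
    return w + v


x, y = "010", "1"
print(len("11010011"))               # length |w| of w = 11010011
print(concat(x, y), concat(y, x))    # xy and yx are different strings
print(concat("AL", "GOL"))
print(concat(x, EPSILON) == x == concat(EPSILON, x))  # ϵ is the identity
```

Note that concatenation is not commutative in general (xy and yx differ above), while ϵ leaves every string unchanged.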
4. Power of Alphabet
• If “∑” is an alphabet, we can express the set of all strings of a certain
length over that alphabet by using exponential notation.
• ∑ᵏ denotes the set of strings of length k over ∑.
Example:
If ∑= { 0,1} then
∑⁰ = {ϵ} the set containing only the empty string
∑¹ = {0,1} set of all strings of length one over ∑ = {0,1}
(2¹ = 2) k=1
∑² = {00,01,10,11} set of all strings of length two over ∑ = {0,1}
(2² = 4) k=2
∑³ = {000,001,010,011,100,101,110,111} set of all strings of length
three over ∑ = {0,1}
(2³ = 8) k=3
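As a quick check of the counts above, the set ∑ᵏ can be generated mechanically. The sketch below assumes the alphabet is given as a Python collection of one-character strings; the function name `power` is illustrative.

```python
from itertools import product


def power(sigma, k):
    """Return Σ^k: all strings of length k over sigma, as a sorted list."""
    return sorted("".join(t) for t in product(sorted(sigma), repeat=k))


sigma = {"0", "1"}
for k in range(4):
    words = power(sigma, k)
    print(k, len(words), words)   # the number of words is |Σ|^k
```

For k = 0 the only word is the empty string, which matches ∑⁰ = {ϵ}.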
5. Kleene Closure
• The set of all strings over an alphabet ∑ is usually denoted by ∑*
• For instance ∑*= {0,1}*
= {ϵ ,0,1,00,01,10,11,…} Including ϵ
Kleene Closure ∑* = ∑⁰ ∪ ∑¹ ∪ ∑² ∪ ∑³ ∪ …
6. Kleene Plus
• The set of strings over an alphabet ∑ excluding ϵ is usually denoted by
∑⁺
• For instance ∑⁺= {0,1}⁺
= {0,1,00,01,10,11,…} Excluding ϵ
∴ ∑⁺ = ∑* − {ϵ}
(or)
∑⁺ = ∑¹ ∪ ∑² ∪ ∑³ ∪ … (without ϵ)
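Because ∑* and ∑⁺ are infinite, a program can only enumerate them lazily. The following sketch lists their words in order of increasing length; the generator names `kleene_star` and `kleene_plus` are illustrative.

```python
from itertools import count, islice, product


def kleene_star(sigma):
    """Yield the strings of Σ* in order of increasing length (never ends)."""
    symbols = sorted(sigma)
    for k in count(0):                      # k = 0, 1, 2, ... gives Σ^k
        for t in product(symbols, repeat=k):
            yield "".join(t)


def kleene_plus(sigma):
    """Yield the strings of Σ+ = Σ* − {ϵ}."""
    return (w for w in kleene_star(sigma) if w != "")


print(list(islice(kleene_star("01"), 7)))   # starts with the empty string
print(list(islice(kleene_plus("01"), 6)))   # same listing without ϵ
```

`islice` is used to take a finite prefix, since iterating over the whole generator would never terminate.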
Power of Alphabet: Kleene Closure ∑* and Kleene Plus ∑⁺
LANGUAGES
• A language is a set of strings over an alphabet.
• If ∑ is an alphabet and L ⊆ ∑* , then L is a language.
• ⊆ - “is a subset of”
Example:
1. English: It is a language over ∑= {a,b,c,…z}
2. Binary String: {0,1,01,10,0101,…} is a language over ∑ = {0,1}
Operations on Language
(i) Complementation
• Let L be a language over an alphabet ∑.
• The complement of L is denoted by
L̄ = ∑* − L
(ii) Union
• Let L1 and L2 be languages over an alphabet ∑.
• The union of L1 and L2, denoted by L1 ∪ L2, is {x | x is in L1 or L2}
(iii) Intersection
• Let L1 and L2 be languages over an alphabet ∑.
• The intersection of L1 and L2, denoted by
L1 ∩ L2, is {x | x is in L1 and L2}
(iv) Concatenation
• Let L1 and L2 be languages over an alphabet ∑.
• The concatenation of L1 and L2, denoted by
L1.L2, is {w1.w2 | w1 is in L1 & w2 is in L2}
(v) Reversal
• Let L be a language over an alphabet ∑.
• The reversal of L, denoted by
Lʳ, is {wʳ | w is in L}
(vi) Kleene’s Closure
• Let L be a language over an alphabet ∑.
• The Kleene’s Closure of L, denoted by
L*, is {x | x = x₁x₂…xₙ for an integer n ≥ 0, where x₁, x₂, …, xₙ are in L}
Eg: a* = {ϵ ,a,aa,aaa,…}
(vii) Positive Closure
• Let L be a language over an alphabet ∑.
• The Positive Closure of L, denoted by
L⁺, is {x | x = x₁x₂…xₙ for an integer n ≥ 1, where x₁, x₂, …, xₙ are in L}
Eg: a⁺ = {a,aa,aaa,…}
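For finite languages, most of these operations map directly onto Python sets. The sketch below is an illustration only: the sample languages `L1` and `L2` and all function names are made up here, and the closures are cut off at a bounded number of factors because L* and L⁺ are infinite in general. Complementation is left out for the same reason (∑* is infinite).

```python
def concat_lang(l1, l2):
    """L1.L2 = {w1 w2 | w1 in L1 and w2 in L2}."""
    return {w1 + w2 for w1 in l1 for w2 in l2}


def reverse_lang(l):
    """Reversal: every word of L written backwards."""
    return {w[::-1] for w in l}


def star_upto(l, n_max):
    """Words of L* that use at most n_max factors from L."""
    result, layer = {""}, {""}       # n = 0 contributes ϵ
    for _ in range(n_max):
        layer = concat_lang(layer, l)
        result |= layer
    return result


def plus_upto(l, n_max):
    """Words of L+ that use between 1 and n_max factors from L."""
    result, layer = set(), {""}
    for _ in range(n_max):
        layer = concat_lang(layer, l)
        result |= layer
    return result


L1, L2 = {"a", "ab"}, {"b", "ab"}    # illustrative finite languages
print(L1 | L2)                       # union
print(L1 & L2)                       # intersection
print(concat_lang(L1, L2))
print(reverse_lang({"ab", "abb"}))
print(sorted(star_upto({"a"}, 3), key=len))
print(sorted(plus_upto({"a"}, 3), key=len))
```

The only difference between the two closures is whether the zero-factor case (ϵ) is included.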
Examples
1. Language of all strings consisting of n 0’s followed by n 1’s for some
n>=0.
L = {ϵ ,01,0011,000111,…}
2. Set of strings of 0’s and 1’s with equal number of each
L = {ϵ ,01,10,0011,1100,0101,1010,000111,…}
3. Set of binary numbers whose value is a prime
L = {10,11,101,111,1011,…} (binary for 2, 3, 5, 7, 11, …)
4. Empty Language
The language has no strings. It is denoted by Ø.
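Each of the first three example languages can be described by a membership test: a function that answers whether a given string belongs to the language. The sketch below is illustrative; the function names are made up, and it assumes binary numbers are written without leading zeros, as in the listing above.

```python
from math import isqrt


def in_0n1n(w):
    """True if w is n 0's followed by n 1's for some n >= 0."""
    n = len(w) // 2
    return w == "0" * n + "1" * n


def equal_zeros_ones(w):
    """True if w is a binary string with as many 0's as 1's."""
    return set(w) <= {"0", "1"} and w.count("0") == w.count("1")


def is_prime_binary(w):
    """True if w is a binary numeral (no leading zero) whose value is prime."""
    if not w or not w.startswith("1") or set(w) - {"0", "1"}:
        return False
    value = int(w, 2)
    return value >= 2 and all(value % d for d in range(2, isqrt(value) + 1))


print([w for w in ["", "01", "0011", "0101", "001"] if in_0n1n(w)])
print([w for w in ["", "01", "10", "0101", "001"] if equal_zeros_ones(w)])
print([w for w in ["10", "11", "100", "101", "110", "111", "1011"]
       if is_prime_binary(w)])
```

These tests use counting and arithmetic, which is more than a finite automaton can do; they only illustrate what "being in the language" means.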
Finite Automata (FA)
• Finite Automata recognise regular languages only.
• It was developed by Dana Scott and Michael Rabin in the 1950s as a
model of a computer with limited memory.
• It receives its input as a string, usually from an input
tape. It delivers no output at all except an indication of
whether the input is acceptable (or) not.
• Hence it is used for decision-making problems.
Applications of FA
1. It is a useful tool in the design of a Lexical Analyser – the
part of a compiler that groups characters into tokens,
indivisible units such as variable names and keywords.
2. Text Editor
3. Pattern Matching
4. File Searching Program
5. Text Processing (Searching an occurrence of one string
in a file).
Limitations
• It can recognise only simple languages (Regular).
• FA can be designed only for decision making problems.
Examples
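1. Finite automaton for an On/Off switch – Digital Systems
[Transition diagram: Start → Off; Push moves Off to On; Push moves On back to Off]
2. Lexical analysis – Recognising the string “then”
[Transition diagram: Start → (initial state) –t→ t –h→ th –e→ the –n→ then]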
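Both examples can be written as tiny state machines. In this sketch (function names are illustrative), the switch has two states toggled by a Push input, and the keyword recogniser uses the prefix matched so far as its state, exactly as in the state labels t, th, the, then.

```python
def push(state):
    """On/Off switch: a Push input toggles between the two states."""
    return "On" if state == "Off" else "Off"


KEYWORD = "then"


def recognises_then(text):
    """Accept exactly the string 'then'; the state is the prefix matched so far."""
    state = ""                                   # start state: nothing read yet
    for ch in text:
        if len(state) < len(KEYWORD) and KEYWORD[len(state)] == ch:
            state += ch                          # follow the edge labelled ch
        else:
            return False                         # no such edge: reject
    return state == KEYWORD                      # accept only in state "then"


print(push("Off"), push(push("Off")))
print([w for w in ["then", "the", "than", "thens"] if recognises_then(w)])
```

A real lexical analyser would combine many such recognisers, one per token format.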
Finite Automata Systems
The Finite Automaton (FA) is a mathematical model of a system with
discrete inputs and outputs, a finite number of memory configurations
called states, and a set of transitions from state to state that occur on input
symbols from the alphabet ∑.
Examples of FA Models are:
❖ Software for designing digital circuits-silicon compilers.
❖Lexical analyser of a compiler.
❖Searching for keywords in a file (or) on the web – Text editors.
Representing a Finite Automata
❖Graphical (Transition Diagram)
❖Tabular (Transition Table)
❖Mathematical (Transition Function or Mapping Function)
Types of Finite Automata
Two types – both describe what are called regular languages
1. Deterministic Finite Automata (DFA) – There is a fixed number of
states and the machine can be in only one state at a time.
[Transition diagram: start state q0 and state q1; edge labels a, a, b]
2. Nondeterministic Finite Automata (NFA) – There is a fixed number
of states but the machine can be in multiple states at one time.
[Transition diagram: start state q0 and state q1; edge labels a, a, “a, b”, a]
Deterministic Finite Automata (DFA)
Basic DFA:
[Figure: an input file holding a b a a b b a b a b, scanned by a read head
attached to a finite control]
A finite state automaton is the simplest and most restricted model of a
computer.
DFA is a language recognizer that has:
1. An input file (storage) – containing an input string.
2. A finite control – a device that can be in a finite number of states.
3. A reader – a sequential reading device.
4. A program
How DFA works?
1. Initialisation: The reader (read head) is over the leftmost
symbol. The finite control is in the start state.
2. Single Step: The reader reads the current symbol, then moves to the
next symbol to the right.
The control enters a new state that depends on the current state and the
current symbol. There may be no defined next state, in which case the
machine stops. The machine repeats this action.
3. No Current Symbol: When all symbols have been read, the input string
is accepted if the control is in a final state. Otherwise, the input string is
not accepted.
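The three steps above can be sketched as a short loop. This is a minimal illustration, assuming the transition function is stored as a Python dictionary keyed by (state, symbol) pairs; the function name `run_dfa` is made up here. The sample machine is the three-state one tabulated under Transition Table in these notes, where q2 has no move on input 0.

```python
def run_dfa(delta, start, finals, string):
    """Simulate a DFA whose transition function may be partial."""
    state = start                          # 1. initialisation
    for symbol in string:                  # 2. single step, left to right
        if (state, symbol) not in delta:
            return False                   # no next state: the machine stops
        state = delta[(state, symbol)]
    return state in finals                 # 3. no symbols left to read


# Three-state machine from the Transition Table slide (q2 is final).
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q2", ("q1", "1"): "q1",
    ("q2", "1"): "q2",
}

for w in ["10", "0110", "1011", "011", "100", ""]:
    print(repr(w), run_dfa(DELTA, "q0", {"q2"}, w))
```

Treating a missing table entry as rejection corresponds to the case where the machine stops before the input is exhausted.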
DFA Specification
A DFA is defined by the 5-tuple:
M =(Q, Σ, S, F, δ ), where
Q ==> a finite, non-empty set of states.
Σ ==> a finite, non-empty set of input symbols (alphabet).
S ∈ Q ==> a start state.
F ⊆ Q ==> a set of final or accepting states, a subset of Q.
δ ==> a transition function, defined as δ: Q × Σ → Q

Current State    Current Symbol    Next State
P                σ                 δ(P, σ)
Notations for DFA
• The transitions of a DFA are represented using the following notations:
• Transition Table
which is a tabular listing of the δ function
• Transition Diagram
which is a directed graph
Transition Diagram
“Transition Diagram” associated with a DFA is a directed graph whose
vertices correspond to the states of the DFA. The edges are the transitions
from one state to another.
[Transition diagram: Start → q0; q0 loops on 0 and moves to q1 on 1;
q1 loops on 1 and moves to q2 on 0; q2 loops on 1]
{0,1} are the inputs
q0 is the initial (starting) state
q1 is an intermediate state
q2 is the final (accepting) state
Transition Table
▪It is basically a tabular representation of the transition
function that takes two arguments (a state and a symbol)
and returns a value (the “next state”).
             Inputs
States       0      1
→ q0         q0     q1
  q1         q2     q1
* q2         Ø      q2
Example
DFA is specified by M =(Q, Σ, S, F, δ ), where
Q ={q0, q1},
S=q0,
F={q1},
Σ ={a, b} and δ is given by the transition table.
Transition Table of DFA
             Inputs
States       a      b
→ q0         q0     q1
* q1         q1     q1
Note:
• Start state ‘S’ is represented by →
• Final states are represented by * or a double circle
Transition Diagram of DFA
[Transition diagram: Start → q0; q0 loops on a and moves to q1 on b;
q1 loops on a and b]
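One way to make the 5-tuple concrete is to store its five components in a small record and check that they fit together, in particular that δ is defined for every (state, symbol) pair. The class and method names below are illustrative and not part of the notes; the instance `M` is the two-state example above.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    states: frozenset      # Q
    alphabet: frozenset    # Σ
    start: str             # S
    finals: frozenset      # F
    delta: dict            # δ as {(state, symbol): next_state}

    def is_well_formed(self):
        """Check S ∈ Q, F ⊆ Q and that δ is a total function Q × Σ → Q."""
        domain = {(q, a) for q in self.states for a in self.alphabet}
        return (self.start in self.states
                and self.finals <= self.states
                and set(self.delta) == domain
                and set(self.delta.values()) <= self.states)

    def accepts(self, w):
        state = self.start
        for symbol in w:
            if (state, symbol) not in self.delta:
                return False           # symbol outside Σ
            state = self.delta[(state, symbol)]
        return state in self.finals


M = DFA(
    states=frozenset({"q0", "q1"}),
    alphabet=frozenset({"a", "b"}),
    start="q0",
    finals=frozenset({"q1"}),
    delta={("q0", "a"): "q0", ("q0", "b"): "q1",
           ("q1", "a"): "q1", ("q1", "b"): "q1"},
)
print(M.is_well_formed())
print([w for w in ["", "aaa", "aab", "ba", "abab"] if M.accepts(w)])
```

Reading the table, this machine reaches q1 on the first b and never leaves it, so it accepts exactly the strings that contain at least one b.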
Transition Function
✓The mapping function or transition function is denoted by delta “δ”.
✓Two parameters are passed to this transition function:
✓Current State
✓Input Symbol
✓The transition function always returns a state, which is called the
next state.
δ(Current_State, Current_Input_Symbol) = Next_State

Example: δ: Q × Σ → Q
δ(q0, a) = q0
δ(q0, b) = q1
δ(q1, a) = q1
[Transition diagram: Start → q0; q0 loops on a and moves to q1 on b;
q1 loops on a]
Properties of Transition Function (δ)
a. δ̂(q,ϵ) = q
This means the state of the system can be changed only by an input
symbol; otherwise it remains in its original state.
b. For all strings w and input symbols a
δ̂(q,aw) = δ̂(δ(q,a),w)
Similarly
δ̂(q,wa) = δ(δ̂(q,w),a)
c. The transition function δ can be extended to δ̂, which operates on states
and strings.
Basis: δ̂(q,ϵ) = q
Induction: δ̂(q,xa) = δ(δ̂(q,x),a)
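The basis and induction steps translate almost word for word into a recursive function. The sketch below assumes δ is stored as a dictionary; `delta_hat` is an illustrative name, and the table `D` is the two-state example DFA from these notes (q0 loops on a and moves to q1 on b; q1 loops on a and b). The final loop checks property (b) on a few sample strings.

```python
def delta_hat(delta, q, w):
    """Extended transition function on a state q and a string w."""
    if w == "":
        return q                                   # basis: no input, no move
    x, a = w[:-1], w[-1]                           # write w as xa
    return delta[(delta_hat(delta, q, x), a)]      # induction step


D = {("q0", "a"): "q0", ("q0", "b"): "q1",
     ("q1", "a"): "q1", ("q1", "b"): "q1"}

print(delta_hat(D, "q0", ""))
print(delta_hat(D, "q0", "aab"))

# Property (b): reading aw from q is the same as reading w from δ(q, a).
for q in ["q0", "q1"]:
    for a in "ab":
        for w in ["", "a", "b", "ab", "ba"]:
            assert delta_hat(D, q, a + w) == delta_hat(D, D[(q, a)], w)
```

The recursion peels off the last symbol, which mirrors the definition; an iterative left-to-right loop computes the same state.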
Language of a DFA
A string x is said to be accepted by DFA M =(Q, Σ, S, F, δ ) with start
state S = q0 if δ̂(q0, x) = P for some P in F.
Method: A finite automaton accepts a string w = a₁a₂a₃…aₙ if
there is a path in the transition diagram which begins at the start
state and ends at an accepting state with the sequence of labels
a₁, a₂, a₃, …, aₙ.
❖The language accepted by a finite automaton (A) is
L(A) = { w : δ̂(q0,w) ∈ F }, where F is the set of final states.
❖The languages accepted by finite automata are called “regular
languages”.
Definition of NFA
The NFA is defined by the 5-tuple:
M =(Q, Σ, S, F, δ ), where
Q ==> a finite, non-empty set of states.
Σ ==> a finite set of input symbols (alphabet).
S ∈ Q ==> the start state, which belongs to Q.
F ⊆ Q ==> the set of final or accepting states, a subset of Q.
δ ==> a transition function, defined as δ: Q × Σ → 2^Q (the set of all
subsets of Q)
Extended Transition Function (δ̂)
Basis: δ̂(q,ϵ) = {q}
Induction: δ̂(q,wa) = ⋃ δ(P,a), the union taken over all P ∈ δ̂(q,w),
for each w ∈ ∑* and a ∈ ∑
Language of a NFA

The language accepted by an NFA is
L(A) = { w : δ̂(q0,w) ∩ F ≠ ∅ }
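The NFA definitions can be sketched by tracking the set of states the machine may be in after each symbol. Everything named below is an assumption made for illustration: the function names, and in particular the sample NFA `N`, which is a standard textbook machine for binary strings ending in 01 and does not appear in these notes.

```python
def nfa_delta_hat(delta, q, w):
    """Extended transition function of an NFA: the set of reachable states."""
    current = {q}                                      # basis: {q}
    for a in w:                                        # induction on w
        # union of δ(p, a) over every state p currently possible
        current = set().union(*(delta.get((p, a), set()) for p in current))
    return current


def nfa_accepts(delta, start, finals, w):
    """w is accepted when the reachable set meets F."""
    return bool(nfa_delta_hat(delta, start, w) & finals)


# Hypothetical NFA accepting binary strings that end in 01.
N = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}

print(nfa_delta_hat(N, "q0", "00"))
for w in ["01", "1101", "00101", "0110", ""]:
    print(repr(w), nfa_accepts(N, "q0", {"q2"}, w))
```

A missing entry in the dictionary stands for the empty set, so a branch with no move on the current symbol simply dies out.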
Example Problems
Solution
