TOC Notes
What is TOC?
In theoretical computer science, the theory of computation is the branch that deals
with whether and how efficiently problems can be solved on a model of computation,
using an algorithm. The field is divided into three major branches: automata theory,
computability theory and computational complexity theory. In order to perform a
rigorous study of computation, computer scientists work with a mathematical
abstraction of computers called a model of computation. There are several models
in use, but the most commonly examined is the Turing machine.
Automata theory:
In theoretical computer science, automata theory is the study of abstract machines
(or more appropriately, abstract 'mathematical' machines or systems) and the
computational problems that can be solved using these machines. These abstract
machines are called automata. An automaton consists of
states (drawn as circles in a state diagram)
and transitions (drawn as arrows).
When the automaton reads a symbol of input, it makes a transition (or jump) to another
state according to its transition function, which takes the current state and the
symbol just read as its inputs.
Uses of Automata: compiler design and parsing
Alphabets, Strings and Languages
Languages :
A general definition of language must cover a variety of distinct categories: natural languages,
programming languages, mathematical languages, etc. The notion of natural languages like
English, Hindi, etc. is familiar to us. Informally, language can be defined as a system suitable for
expression of certain ideas, facts, or concepts, which includes a set of symbols and rules to
manipulate these. The languages we consider for our discussion are an abstraction of natural
languages. That is, our focus here is on formal languages that need precise and formal
definitions. Programming languages belong to this category. We start with some basic concepts
and definitions required in this regard.
Symbols :
Symbols are indivisible objects or entities that are not defined further. That is, symbols are the atoms of
the world of languages. A symbol is any single object such as a, 0, 1, #, begin, or do.
Usually, only characters from a typical keyboard are used as symbols.
Alphabets :
An alphabet is a finite, nonempty set of symbols. The alphabet of a language is normally denoted
by $\Sigma$. When more than one alphabet is considered for discussion, subscripts may be
used (e.g. $\Sigma_1$, $\Sigma_2$).
Example : $\Sigma = \{ 0, 1 \}$ is the binary alphabet.
Strings or Words over Alphabet :
A string or word over an alphabet $\Sigma$ is a finite sequence of symbols from $\Sigma$.
Example : 0110, 11, 001 are three strings over the binary alphabet { 0, 1 } .
Length of a string :
The number of symbols in a string w is called its length, denoted by |w|.
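The empty string, denoted by $\epsilon$, contains no symbols, so $|\epsilon| = 0$.
In what follows, a, b, c denote symbols and u, v, w, x, y, z are strings.
A string x is a prefix of y if y = xz for some string z, a suffix of y if y = zx for some
string z, and a substring of y if y = uxv for some strings u and v.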
Example : Consider the string 011 over the binary alphabet. All the prefixes, suffixes and
substrings of this string are listed below, where $\epsilon$ denotes the empty string.
Prefixes: $\epsilon$, 0, 01, 011.
Suffixes: $\epsilon$, 1, 11, 011.
Substrings: $\epsilon$, 0, 1, 01, 11, 011.
Note that x is a prefix (suffix or substring) of x, for any string x, and $\epsilon$ is a prefix (suffix or
substring) of any string.
A string x is a proper prefix (suffix) of string y if x is a prefix (suffix) of y and $x \neq y$.
In the above example, all prefixes except 011 are proper prefixes.
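The prefix, suffix and substring lists in the example can be checked mechanically. The following is a minimal Python sketch; the function names are illustrative choices, not part of the notes, and the empty string is represented by "".

```python
def prefixes(w):
    # Every w[:i] for i = 0..len(w); i = 0 gives the empty string.
    return [w[:i] for i in range(len(w) + 1)]

def suffixes(w):
    # Every w[i:] from the empty suffix up to w itself.
    return [w[i:] for i in range(len(w), -1, -1)]

def substrings(w):
    # Every contiguous slice w[i:j]; a set removes duplicates.
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}

def proper_prefixes(w):
    # A proper prefix is any prefix other than w itself.
    return [p for p in prefixes(w) if p != w]

w = "011"
print(prefixes(w))
print(suffixes(w))
print(sorted(substrings(w), key=lambda s: (len(s), s)))
print(proper_prefixes(w))
```

Slicing keeps the sketch short; for long strings the number of substrings grows quadratically, so this is meant only for small examples.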
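Powers of strings :
For any string x and integer $n \geq 0$, we use $x^n$ to denote the string formed by concatenating
n copies of x. We can also give an inductive definition of $x^n$ as
$x^n = \epsilon$, if n = 0 ; otherwise $x^n = x x^{n-1}$.
Example : If x = 011, then $x^3$ = 011011011, $x^1$ = 011 and $x^0 = \epsilon$.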
Powers of Alphabets :
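We write $\Sigma^k$ (for some integer k) to denote the set of strings of length k with symbols from $\Sigma$. In
other words,
$\Sigma^k = \{ w \mid w \text{ is a string over } \Sigma \text{ and } |w| = k \}$.
The set of all strings over an alphabet $\Sigma$ is denoted by $\Sigma^*$. That is,
$\Sigma^* = \Sigma^0 \cup \Sigma^1 \cup \Sigma^2 \cup \cdots$
The set $\Sigma^*$ contains all the strings that can be generated by iteratively concatenating symbols
from $\Sigma$ any number of times.
Example : If $\Sigma$ = { a, b }, then $\Sigma^*$ = { $\epsilon$, a, b, aa, ab, ba, bb, aaa, ... }.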
Note that the set of all nonempty strings over $\Sigma$ is denoted by $\Sigma^+$. That is,
$\Sigma^+ = \Sigma^* - \{ \epsilon \}$.
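For small alphabets, the set of strings of length k can be enumerated directly. The sketch below uses Python's itertools.product; the helper names sigma_power and sigma_star_up_to are assumptions made for illustration. Since the set of all strings over an alphabet is infinite, the second helper only collects strings up to a chosen length.

```python
from itertools import product

def sigma_power(sigma, k):
    # All strings of length exactly k over the alphabet sigma.
    # For k = 0, product yields one empty tuple, giving {""}.
    return {"".join(symbols) for symbols in product(sorted(sigma), repeat=k)}

def sigma_star_up_to(sigma, max_len):
    # Finite approximation: union of the powers 0..max_len.
    result = set()
    for k in range(max_len + 1):
        result |= sigma_power(sigma, k)
    return result

sigma = {"a", "b"}
print(sorted(sigma_power(sigma, 2)))
print(sorted(sigma_star_up_to(sigma, 2), key=lambda s: (len(s), s)))
```

An alphabet with m symbols has m to the power k strings of length k, so the output grows quickly with k.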
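Reversal :
For any string $w = a_1 a_2 \cdots a_n$, the reversal of the string is $w^R = a_n a_{n-1} \cdots a_1$.
A language L over an alphabet $\Sigma$ is a set of strings over $\Sigma$; that is, any $L \subseteq \Sigma^*$
is a language.
Example :
1. $\emptyset$ is the empty language.
2. { $\epsilon$ } is a language. Note that $\emptyset \neq \{ \epsilon \}$, because the language $\emptyset$ does not
contain any string but { $\epsilon$ } contains one string of length zero.
3. The set of all strings over { 0, 1 } containing equal number of 0's and 1's.
4. The set of all strings over {a, b, c} that start with a.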
Convention : Capital letters A, B, C, L, etc. with or without subscripts are normally used to
denote languages.
Set operations on languages : Since languages are sets of strings, we can apply set
operations to languages. Here are some simple examples (though there is nothing new in it).
Union : A string $x \in L_1 \cup L_2$ iff $x \in L_1$ or $x \in L_2$.
Reversal of a language :
The reversal of a language L, denoted as $L^R$, is defined as: $L^R = \{ w^R \mid w \in L \}$.
Example :
Let $L = \{ 0^n 1^n \mid n \text{ is an integer} \}$. Then $L^R = \{ 1^n 0^n \mid n \text{ is an integer} \}$.
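Language concatenation :
The concatenation of languages $L_1$ and $L_2$ is defined as
$L_1 L_2 = \{ xy \mid x \in L_1 \text{ and } y \in L_2 \}$.
Note that $L_1 L_2 \neq L_2 L_1$ in general.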
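Reversal and concatenation of finite languages map directly onto set comprehensions. The following Python sketch is an illustration under that assumption; the function names and the sample languages are made up for the example and do not come from the notes.

```python
def reverse_language(language):
    # Reverse every string in the language.
    return {w[::-1] for w in language}

def concat(l1, l2):
    # Every string xy with x taken from l1 and y taken from l2.
    return {x + y for x in l1 for y in l2}

L1 = {"a", "ab"}
L2 = {"b", "ba"}

print(sorted(concat(L1, L2)))
print(sorted(concat(L2, L1)))
print(concat(L1, L2) == concat(L2, L1))  # concatenation is not commutative in general
print(sorted(reverse_language({"0", "01", "011"})))
```

Two boundary cases are worth trying as well: concatenating with the empty language gives the empty language, while concatenating with the language containing only the empty string leaves a language unchanged.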
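Iterated concatenation of languages : Since we can concatenate two languages, we can also
repeat this to concatenate any number of languages, or concatenate a language with
itself any number of times. The operation $L^n$ denotes the concatenation of L with itself n times.
This is defined formally as follows:
$L^0 = \{ \epsilon \}$
$L^n = L L^{n-1}$ for $n \geq 1$
Kleene star : The Kleene star of a language L, denoted by $L^*$, is
defined as follows :
$L^* = \bigcup_{n \in \mathbb{N}} L^n = L^0 \cup L^1 \cup L^2 \cup \cdots$
= { x | x is the concatenation of zero or more strings from L }
Thus $L^*$ is the set of all strings derivable by any number of concatenations of strings in L. It is also
useful to define
$L^+ = L L^* = \bigcup_{n \geq 1} L^n$,
the set of all strings derivable by one or more concatenations of strings in L.
Example : Let L = {a, ab}. Then $L^0 = \{ \epsilon \}$ and $L^1$ = {a, ab}.
Note : $\epsilon$ is in $L^*$ for every language L, since $L^0 = \{ \epsilon \}$.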
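The inductive definition of language powers translates into a short loop. The sketch below is a hedged illustration in Python; power and star_up_to are hypothetical helper names, and because the star of a language is usually infinite, only the union of the first few powers is computed.

```python
def concat(l1, l2):
    # Every string xy with x in l1 and y in l2.
    return {x + y for x in l1 for y in l2}

def power(language, n):
    # Zeroth power is the language containing only the empty string;
    # each further step concatenates one more copy of the language.
    result = {""}
    for _ in range(n):
        result = concat(language, result)
    return result

def star_up_to(language, max_n):
    # Finite approximation of the Kleene star: powers 0..max_n.
    result = set()
    for n in range(max_n + 1):
        result |= power(language, n)
    return result

L = {"a", "ab"}
print(power(L, 0))
print(sorted(power(L, 1)))
print(sorted(power(L, 2)))
print(sorted(star_up_to(L, 2), key=lambda s: (len(s), s)))
```

Note that power(set(), 0) still returns the language containing the empty string, which matches the remark that the empty string belongs to the star of every language.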
The most important feature of the automaton is its control unit, which can be in any one of
a finite number of internal states at any point. It can change state in some defined manner
determined by a transition function.
At any point of time the automaton is in some internal state and is reading a particular
symbol from the input tape by using the mechanism for reading input. In the next time step
the automaton then moves to some other internal state (or remains in the same state) as defined
by the transition function. The transition function is based on the current state, the input symbol
read, and the content of the temporary storage. At the same time the content of the storage
may be changed and the input read may be modified. The automaton may also produce
some output during this transition. The internal state, input and the content of storage at
any point define the configuration of the automaton at that point. The transition from one
configuration to the next (as defined by the transition function) is called a move. The finite
state machine, or finite automaton, is the simplest type of abstract machine we consider.
Any system that is at any point of time in one of a finite number of internal states and moves
among these states in a defined manner in response to some input can be modeled by a
finite automaton. It does not have any temporary storage and is hence a restricted model of
computation.
Grammar
A grammar is a mechanism used for describing languages. It is one of the simplest
yet most powerful such mechanisms. There are other notions to do the same, of course.
In everyday language, like English, we have a set of symbols (alphabet), a set of words
constructed from these symbols, and a set of rules using which we can group the words to
construct meaningful sentences. The grammar for English tells us what the words in it are
and the rules to construct sentences. It also tells us whether a particular sentence is
well-formed (as per the grammar) or not. But even if one follows the rules of English
grammar, it may lead to some sentences which are not meaningful at all, because of
the imprecision and ambiguities involved in the language. In English grammar we use many
other higher level constructs like noun-phrase, verb-phrase, article, noun, predicate, verb
etc. A typical rule can be defined as
< sentence > $\rightarrow$ < noun-phrase > < predicate >
meaning that "a sentence can be constructed using a 'noun-phrase' followed by a
predicate".
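Formally, a grammar is a 4-tuple $G = (N, \Sigma, P, S)$, where
N is a finite set of non-terminal symbols (or variables),
$\Sigma$ is a finite set of terminal symbols (the alphabet), disjoint from N,
$S \in N$ is a special non-terminal (or variable) called the start symbol, and
P is a finite set of production rules.
The binary relation defined by the set of production rules is denoted by $\rightarrow$, i.e.
$\alpha \rightarrow \beta$ iff $(\alpha, \beta) \in P$. In other words, P is a finite set of production rules of the
form $\alpha \rightarrow \beta$, where $\alpha$ and $\beta$ are strings over $N \cup \Sigma$ and $\alpha$ contains at least one non-terminal.
Example : Consider the grammar $G = (N, \Sigma, P, S)$, where N = {S}, $\Sigma$ = {a, b} and
P = { S $\rightarrow$ ab, S $\rightarrow$ aSb }.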
Some terminal strings generated by this grammar together with their derivations are given below.
S $\Rightarrow$ ab
S $\Rightarrow$ aSb $\Rightarrow$ aabb
S $\Rightarrow$ aSb $\Rightarrow$ aaSbb $\Rightarrow$ aaabbb
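The derivations above follow one pattern: apply the recursive rule some number of times, then finish with the non-recursive rule. The Python sketch below replays that pattern for the two productions of the example, S to ab and S to aSb; the function name derive and the dictionary layout are assumptions made for illustration.

```python
# Productions of the example grammar: S -> ab | aSb
PRODUCTIONS = {"S": ["ab", "aSb"]}

def derive(n):
    # Return the sentential forms in the derivation of a^n b^n (n >= 1).
    if n < 1:
        raise ValueError("this grammar only generates a^n b^n for n >= 1")
    base, recursive = PRODUCTIONS["S"]
    form = "S"
    steps = [form]
    for _ in range(n - 1):
        # Apply S -> aSb to the single occurrence of S.
        form = form.replace("S", recursive, 1)
        steps.append(form)
    # Finish with S -> ab, which removes the last non-terminal.
    form = form.replace("S", base, 1)
    steps.append(form)
    return steps

for n in (1, 2, 3):
    print(" => ".join(derive(n)))
```

Each sentential form contains exactly one S, so replacing the first occurrence is enough; a grammar with several non-terminals per form would need an explicit choice of which one to rewrite.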
Finite Automata
Automata (singular : automaton) are a particularly simple, but useful, model of computation. They
were initially proposed as a simple model for the behavior of neurons. The concept of a finite
automaton appears to have arisen in the 1943 paper "A logical calculus of the ideas immanent in
nervous activity", by Warren McCulloch and Walter Pitts. In 1951 Kleene introduced regular
expressions to describe the behaviour of finite automata. He also proved the important theorem
saying that regular expressions exactly capture the behaviours of finite automata. In 1959, Dana
Scott and Michael Rabin introduced non-deterministic automata and showed the surprising
theorem that they are equivalent to deterministic automata. We will study these fundamental
results. Since those early years, the study of automata has continued to grow, showing that they
are indeed a fundamental idea in computing.
Finite Automata :
A finite automaton has a finite set of states, with start/initial and accepting/final states, and transitions
from one state to another on reading a symbol from the input.
A system containing only a finite number of states and transitions among them is called a finite-state transition system.
Finite-state transition systems can be modeled abstractly by a mathematical model called a finite
automaton.
We said that automata are a model of computation. That means that they are a simplified
abstraction of `the real thing'. So what gets abstracted away? One thing that disappears is any
notion of hardware or software. We merely deal with states and transitions between states. The
distinction between program and machine executing it disappears. One could say that an
automaton is the machine and the program. This makes automata relatively easy to implement in
either hardware or software. From the point of view of resource consumption, the essence of a
finite automaton is that it is a strictly finite model of computation. Everything in it is of a fixed, finite
size and cannot be modified in the course of the computation.
An automaton processes a string on the tape by repeating the following actions until the tape
head has traversed the entire string:
1. The tape head reads the current tape cell and sends the symbol s found there to the
control. Then the tape head moves to the next cell.
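2. The control takes s and the current state and consults the state transition function to get the
next state, which becomes the new current state.
Once the entire string has been processed, the state in which the automaton ends is examined.
If it is an accept state, the input string is accepted; otherwise, the string is rejected. Summarizing
all the above we can formulate the following formal definition:
A deterministic finite automaton (DFA) is a 5-tuple $M = (Q, \Sigma, \delta, q_0, F)$, where
Q is a finite set of states,
$\Sigma$ is a finite input alphabet,
$\delta : Q \times \Sigma \rightarrow Q$ is the transition function,
$q_0 \in Q$ is the start state, and
$F \subseteq Q$ is the set of accepting states.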
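Acceptance of Strings :
A DFA accepts a string $w = a_1 a_2 \cdots a_n$ if there is a sequence of states $q_0, q_1, \ldots, q_n$ in Q such
that
1. $q_0$ is the start state,
2. $\delta(q_i, a_{i+1}) = q_{i+1}$ for all $0 \leq i < n$, and
3. $q_n \in F$.
The extended transition function $\hat{\delta} : Q \times \Sigma^* \rightarrow Q$ describes the effect of a whole string. That is,
$\hat{\delta}(q, w)$ is the state the automaton reaches when it starts from the state q and finishes
processing the string w. Formally, we can give an inductive definition as follows:
$\hat{\delta}(q, \epsilon) = q$
$\hat{\delta}(q, wa) = \delta(\hat{\delta}(q, w), a)$ for $w \in \Sigma^*$ and $a \in \Sigma$
The language of the DFA M is the set of strings that can take the start state to one of the accepting
states i.e.
$L(M) = \{ w \in \Sigma^* \mid M \text{ accepts } w \} = \{ w \in \Sigma^* \mid \hat{\delta}(q_0, w) \in F \}$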
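The inductive view of acceptance, where the state reached on a string is computed from the state reached on the string without its last symbol, can be written as a short recursive function. The Python sketch below is only an illustration: the two-state machine, its state names "q0" and "q1", and the tuple layout are assumptions, chosen so that the machine accepts binary strings containing at least one 1.

```python
def delta_hat(delta, q, w):
    # Base case: processing the empty string leaves the state unchanged.
    if w == "":
        return q
    # Inductive case: process all but the last symbol, then take one step.
    return delta[(delta_hat(delta, q, w[:-1]), w[-1])]

def accepts(dfa, w):
    # A DFA is given as (states, alphabet, delta, start, accepting).
    _, _, delta, start, accepting = dfa
    return delta_hat(delta, start, w) in accepting

# Hypothetical DFA: q0 means "no 1 seen yet", q1 means "some 1 seen".
DELTA = {
    ("q0", "0"): "q0",
    ("q0", "1"): "q1",
    ("q1", "0"): "q1",
    ("q1", "1"): "q1",
}
M = ({"q0", "q1"}, {"0", "1"}, DELTA, "q0", {"q1"})

for w in ["", "000", "0001", "1", "0100"]:
    print(repr(w), accepts(M, w))
```

The recursion mirrors the definition rather than aiming for efficiency; a loop over the symbols from left to right computes the same state without the repeated slicing.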
Example 1 :
A formal description of a DFA as a tuple is precise, but it is hard to comprehend. For example, it is not
immediately clear that the language of the DFA in this example is the set of all strings over { 0, 1 } having at least one 1.
We can describe the same DFA by a transition table or a state transition diagram as
follows:
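Transition Table :
(transition table and state transition diagram of the DFA of Example 1)
The accepting state cannot be reached without a 1 in the input string. There can be any number of 0's at the beginning
(the self-loop with label 0 indicates it). Similarly, there can be any number of 0's and 1's in any order at the end of the string.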
Transition table :
It is basically a tabular representation of the transition function that takes two arguments (a state and a symbol) and
returns a value (the next state).
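A transition table can be stored as a nested mapping with one row per state and one column per input symbol. The Python sketch below shows one possible layout; the state names "q0" and "q1" and the table contents are assumptions (a machine that moves to q1 on the first 1 and stays there), not the table from the notes.

```python
# Rows are states, columns are input symbols, entries are next states.
TABLE = {
    "q0": {"0": "q0", "1": "q1"},
    "q1": {"0": "q1", "1": "q1"},
}

def delta(state, symbol):
    # Two arguments in, one value out: the next state.
    return TABLE[state][symbol]

def format_table(table, alphabet):
    # Render the table as plain text, one line per state.
    header = "state | " + " | ".join(alphabet)
    rows = [
        f"{state:<5} | " + " | ".join(table[state][a] for a in alphabet)
        for state in table
    ]
    return "\n".join([header] + rows)

print(format_table(TABLE, ["0", "1"]))
print(delta("q0", "1"))
```

The nested dictionary keeps the table shape visible in the source code, which makes it easy to compare against a hand-drawn table.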
Here is an informal description of how a DFA operates. An input to a DFA can be any string $w \in \Sigma^*$. Put a pointer to the
start state $q_0$. Read the input string w from left to right, one symbol at a time, moving the pointer according to the
transition function, $\delta$. If the next symbol of w is a and the pointer is on state p, move the pointer to $\delta(p, a)$. When
the end of the input string w is encountered, the pointer is on some state, r. The string is said to be accepted by the
DFA if $r \in F$ and rejected if $r \notin F$.
A language