Machines and Their Languages (G51MAL)
Lecture Notes
Spring 2003
Thorsten Altenkirch
April 23, 2004
Contents

1 Introduction
  1.1 Examples on syntax
  1.2 What is this course about?
  1.3 Applications
  1.4 History
    1.4.1 The Chomsky Hierarchy
    1.4.2 Turing machines
  1.5 Languages

2 Finite Automata
  2.1 Deterministic finite automata
    2.1.1 What is a DFA?
    2.1.2 The language of a DFA
  2.2 Nondeterministic Finite Automata
    2.2.1 What is an NFA?
    2.2.2 The language accepted by an NFA
    2.2.3 The subset construction

3 Regular expressions
  3.1 What are regular expressions?
  3.2 The meaning of regular expressions
  3.3 Translating regular expressions to NFAs
  3.4 Summing up

4 The pumping lemma

5 Context free grammars
  5.3 More examples
  5.4 Parse trees
  5.5 Ambiguity

6 Pushdown Automata
  6.1 What is a Pushdown Automaton?
  6.2 How does a PDA work?
  6.3 The language of a PDA
  6.4 Deterministic PDAs
  6.5 Context free grammars and pushdown automata

7 How to implement a parser

8 Turing machines
  8.4 Back to Chomsky

References
1 Introduction

Most references refer to the course text [HMU01]. Please note that the 2nd edition is quite different from the first one, which appeared in 1979 and is a classical reference on the subject. I have also been using [Sch99] for these notes, but note that this book is written in German. The online version of this text contains some hyperlinks to webpages which contain additional information.
1.1 Examples on syntax

In PR1 and PR2 you are learning the language JAVA. Which of the following programs are syntactically correct, i.e. will be accepted by the JAVA compiler without error messages?

Hello-World.java

[Listing lost: a Hello-World program containing several syntax errors.]

A.java

class A {
  class B {
    void C () {
      {} ; {{}}
    }
  }
}

I hope that you are able to spot all the errors in the first program. It may be actually surprising, but the 2nd (strange looking) program is actually correct.

How do we know whether a program is syntactically correct? We would hope that this doesn't depend on the compiler we are using.

1.2 What is this course about?

1. What machines can we use to recognize languages? Finite automata, Pushdown automata, Turing machines.
2. How to specify formal languages? Regular expressions, Context free grammars.

1.3 Applications

Regular expressions are a convenient way to express patterns (e.g. for search). There are a number of tools which use regular expressions.
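As a small illustration of such tools (my own example, not part of the original notes), JAVA itself ships with the java.util.regex package; its pattern syntax is richer than, but includes, the regular expressions we will define in section 3. The pattern h[ae]llo below plays the role of h(a + e)llo:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDemo {
    public static void main(String[] args) {
        // "h[ae]llo" matches an h, then an a or an e, then "llo".
        Pattern p = Pattern.compile("h[ae]llo");
        for (String w : new String[] { "hallo", "hello", "hullo" }) {
            Matcher m = p.matcher(w);
            System.out.println(w + " matches: " + m.matches());
        }
    }
}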
1.4 History

1.4.1 The Chomsky Hierarchy

[Diagram: the Chomsky hierarchy as nested classes of languages, from the outside in:
all languages;
type 0 or recursively enumerable languages (Turing machines);
decidable languages;
type 1 or context sensitive languages;
type 2 or context free languages (pushdown automata);
type 3 or regular languages (finite automata).]

1.4.2 Turing machines
1.5 Languages

In this course we will use the terms language and word differently from everyday language:

A language is a set of words.
A word is a sequence of symbols.

This leaves us with the question: what is a symbol? The answer is: anything, but it has to come from an alphabet Σ which is a finite set. A common (and important) instance is Σ = {0, 1}.

More mathematically we say: Given an alphabet Σ we define the set Σ* as the set of words (or sequences) over Σ: the empty word ε ∈ Σ*, and given a symbol x ∈ Σ and a word w ∈ Σ* we can form a new word xw ∈ Σ*. These are all the ways elements of Σ* can be constructed (this is called an inductive definition). E.g. in the example Σ = {0, 1}, typical elements of Σ* are 0010, 00000000, ε. Note that we only write ε if it appears on its own: instead of 0ε we just write 0. It is also important to realize that although there are infinitely many words, each word has a finite length.

An important operation on Σ* is concatenation. Confusingly, this is denoted by an invisible operator: given w, v ∈ Σ* we can construct a new word wv ∈ Σ* simply by concatenating the two words. We can define this operation by primitive recursion:

εv = v
(xw)v = x(wv)
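As an aside (my own illustration, not part of the original notes), the inductive definition can be mirrored in JAVA, modelling words as strings, where "" plays the role of ε:

public class Words {
    // Primitive recursion on the first argument, following
    //   eps v  = v
    //   (xw) v = x(wv)
    static String concat(String w, String v) {
        if (w.isEmpty()) {           // case eps: eps v = v
            return v;
        }
        char x = w.charAt(0);        // w = x w'
        String rest = w.substring(1);
        return x + concat(rest, v);  // (x w') v = x (w' v)
    }

    public static void main(String[] args) {
        System.out.println(concat("00", "10")); // prints 0010
    }
}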
2 Finite Automata

2.1 Deterministic finite automata

2.1.1 What is a DFA?

A deterministic finite automaton (DFA) A = (Q, Σ, δ, q0, F) is given by a finite set of states Q, a finite alphabet Σ, a transition function δ ∈ Q × Σ → Q, an initial state q0 ∈ Q and a set of final states F ⊆ Q.

As an example consider the DFA D over Σ = {0, 1} with states {q0, q1, q2}, initial state q0 and final states {q2}, whose transition function δD is given by the following table:

          0     1
→ q0     q1    q0
  q1     q1    q2
* q2     q2    q2

The table represents the function δ, i.e. to find the value of δ(q, x) we have to look at the row labelled q and the column labelled x. The initial state is marked by → and all final states are marked by *.

Yet another, optically more inspiring, alternative are transition diagrams:

[Transition diagram of D: an arrow into q0; q0 has a loop labelled 1; an arrow labelled 0 from q0 to q1; q1 has a loop labelled 0; an arrow labelled 1 from q1 to q2; the final state q2 (double ring) has a loop labelled 0,1.]

There is an arrow into the initial state and all final states are marked by double rings. If δ(q, x) = q′ then there is an arrow from state q to q′ which is labelled x.

We write Σ* for the set of words (i.e. sequences) over the alphabet Σ. This includes the empty word which is written ε. I.e.

{0, 1}* = {ε, 0, 1, 00, 01, 10, 11, 000, ...}
2.1.2 The language of a DFA

To define the language of a DFA we introduce the extended transition function δ̂ ∈ Q × Σ* → Q, which runs the automaton on a whole word:

δ̂(q, ε) = q                       (1)
δ̂(q, xw) = δ̂(δ(q, x), w)          (2)

Here xw stands for a non-empty word whose first symbol is x and whose rest is w. E.g. if we are told that xw = 010 then this entails that x = 0 and w = 10. w may be empty, i.e. xw = 0 entails x = 0 and w = ε.

The language of the DFA is the set of words which take it from the initial state to a final state:

L(A) = {w ∈ Σ* | δ̂(q0, w) ∈ F}

As an example we calculate δ̂D(q0, 101) = q2:

δ̂D(q0, 101)
= δ̂D(δD(q0, 1), 01)     by (2)
= δ̂D(q0, 01)            because δD(q0, 1) = q0
= δ̂D(δD(q0, 0), 1)      by (2)
= δ̂D(q1, 1)             because δD(q0, 0) = q1
= δ̂D(δD(q1, 1), ε)      by (2)
= δ̂D(q2, ε)             because δD(q1, 1) = q2
= q2                     by (1)

Hence 101 ∈ L(D).
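The extended transition function translates directly into a program. Here is a small sketch of my own (not from the notes; class and method names are invented) which implements δ̂ for the example DFA D by recursion on the word, following equations (1) and (2):

public class DFA {
    // delta encodes the example DFA D: states 0,1,2 and symbols '0','1'.
    static int delta(int q, char x) {
        switch (q) {
            case 0: return x == '0' ? 1 : 0; // delta(q0,0)=q1, delta(q0,1)=q0
            case 1: return x == '0' ? 1 : 2; // delta(q1,0)=q1, delta(q1,1)=q2
            default: return 2;               // q2 is a sink: delta(q2,x)=q2
        }
    }

    // deltaHat follows equations (1) and (2).
    static int deltaHat(int q, String w) {
        if (w.isEmpty()) return q;                          // (1)
        return deltaHat(delta(q, w.charAt(0)), w.substring(1)); // (2)
    }

    public static void main(String[] args) {
        // w is accepted iff deltaHat(q0, w) lies in F = {q2}
        System.out.println(deltaHat(0, "101") == 2); // prints true
    }
}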
2.2 Nondeterministic Finite Automata

2.2.1 What is an NFA?

A nondeterministic finite automaton (NFA) A = (Q, Σ, δ, S, F) is given as for a DFA, except that there is a set of initial states S ⊆ Q, and the transition function returns a set of possible successor states: δ ∈ Q × Σ → P(Q).

As an example consider the NFA C over Σ = {0, 1} with states {q0, q1, q2}, initial states S = {q0} and final states F = {q2}:

[Transition diagram of C: q0 has a loop labelled 0,1; an arrow labelled 1 from q0 to q1; an arrow labelled 0,1 from q1 to the final state q2.]

where δC is given by

          0       1
  q0     {q0}    {q0, q1}
  q1     {q2}    {q2}
  q2     {}      {}

Note that we diverge here slightly from the definition in the book, which uses a single initial state instead of a set of initial states. Doing so means that we can avoid introducing ε-NFAs (see [HMU01], section 2.5).
2.2.2 The language accepted by an NFA

Intuitively, an NFA processes a word as follows: initially we put a marker on each initial state. Each time we read a symbol we move the markers along all arrows labelled with this symbol (starting from the states which were marked previously). Thus we may have to use several markers, but it may also happen that all markers disappear (if no appropriate arrows exist). In this case the word is not accepted. If at the end of the word any of the final states has a marker on it then the word is accepted.

E.g. consider the word 100 (which is not accepted by C). Initially only q0 is marked:

[Diagram: C with a marker on q0.]

After reading 1 we have to use two markers, because there are two arrows from q0 which are labelled 1:

[Diagram: C with markers on q0 and q1.]

Now after reading 0 the automaton has still got two markers, one of them in an accepting state:

[Diagram: C with markers on q0 and q2.]

However, after reading the 2nd 0 the second marker disappears because there is no edge leaving q2, and we have:

[Diagram: C with a marker on q0 only.]

Since q0 is not a final state, 100 is not accepted.

Formally, we define δ̂ ∈ P(Q) × Σ* → P(Q) with the intention that δ̂(S, w) is the set of states which are marked after having read w, starting with the initial markers given by S:

δ̂(S, ε) = S                               (3)
δ̂(S, xw) = δ̂(∪{δ(q, x) | q ∈ S}, w)       (4)

As an example we calculate δ̂C({q0}, 100):

δ̂C({q0}, 100)
= δ̂C(∪{δC(q, 1) | q ∈ {q0}}, 00)      by (4)
= δ̂C({q0, q1}, 00)
= δ̂C(∪{δC(q, 0) | q ∈ {q0, q1}}, 0)   by (4)
= δ̂C(δC(q0, 0) ∪ δC(q1, 0), 0)
= δ̂C({q0} ∪ {q2}, 0)
= δ̂C({q0, q2}, 0)
= δ̂C(∪{δC(q, 0) | q ∈ {q0, q2}}, ε)   by (4)
= δ̂C(δC(q0, 0) ∪ δC(q2, 0), ε)
= δ̂C({q0} ∪ {}, ε)
= {q0}                                  by (3)

Using the extended transition function we define the language of an NFA as

L(A) = {w | δ̂(S, w) ∩ F ≠ {}}

Actually, we may define ∪ by comprehension, which also extends the operation to infinite sets of sets (although we don't need this here):

∪B = {x | ∃A ∈ B. x ∈ A}
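Equations (3) and (4) can be executed directly. The following sketch (my own illustration, not part of the notes; names are invented) simulates the example NFA C on a word, keeping the current set of marked states:

import java.util.HashSet;
import java.util.Set;

public class NFA {
    // deltaC encodes the example NFA C (states 0, 1, 2).
    static Set<Integer> deltaC(int q, char x) {
        Set<Integer> r = new HashSet<>();
        if (q == 0) { r.add(0); if (x == '1') r.add(1); } // q0: loop on 0,1; to q1 on 1
        if (q == 1) r.add(2);                             // q1: to q2 on 0 and on 1
        return r;                                         // q2: no outgoing arrows
    }

    // deltaHat follows equations (3) and (4).
    static Set<Integer> deltaHat(Set<Integer> s, String w) {
        if (w.isEmpty()) return s;                        // (3)
        Set<Integer> next = new HashSet<>();
        for (int q : s) next.addAll(deltaC(q, w.charAt(0)));
        return deltaHat(next, w.substring(1));            // (4)
    }

    public static void main(String[] args) {
        Set<Integer> init = new HashSet<>();
        init.add(0); // S = {q0}
        // w is accepted iff the final marker set contains a state of F = {q2}
        System.out.println(deltaHat(init, "100").contains(2)); // false
        System.out.println(deltaHat(init, "110").contains(2)); // true
    }
}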
2.2.3 The subset construction

DFAs can be viewed as a special case of NFAs, i.e. those for which there is precisely one start state, S = {q0}, and the transition function always returns one-element sets (i.e. δ(q, x) = {q′} for all q ∈ Q and x ∈ Σ).

Below we show that for every NFA we can construct a DFA which accepts the same language. This shows that NFAs aren't more powerful than DFAs. However, in some cases NFAs need a lot fewer states than the corresponding DFA, and they are easier to construct.

Given an NFA A = (Q, Σ, δ, S, F) we construct the DFA

D(A) = (P(Q), Σ, δD(A), S, FD(A))

where

δD(A)(S, x) = ∪{δ(q, x) | q ∈ S}
FD(A) = {S ∈ P(Q) | S ∩ F ≠ {}}

The basic idea of this construction (the subset construction) is to define a DFA whose states are sets of states of the NFA. A final state of the DFA is a set which contains at least one final state of the NFA. The transitions just follow the active set of markers, i.e. a state S ∈ P(Q) corresponds to having markers on all q ∈ S, and when we follow the arrow labelled x we get the set of states which are marked after reading x.
As an example let us consider the NFA C above. We construct a DFA D(C):

D(C) = (P({q0, q1, q2}), {0, 1}, δD(C), {q0}, FD(C))

with δD(C) given by

                      0            1
  {}                 {}           {}
→ {q0}               {q0}         {q0, q1}
  {q1}               {q2}         {q2}
* {q2}               {}           {}
  {q0, q1}           {q0, q2}     {q0, q1, q2}
* {q0, q2}           {q0}         {q0, q1}
* {q1, q2}           {q2}         {q2}
* {q0, q1, q2}       {q0, q2}     {q0, q1, q2}

and FD(C) is the set of all the states marked with * above, i.e. all the sets containing q2:

FD(C) = {{q2}, {q0, q2}, {q1, q2}, {q0, q1, q2}}
Drawing D(C) as a transition diagram:

[Transition diagram of D(C) with all eight states {}, {q0}, {q1}, {q2}, {q0,q1}, {q0,q2}, {q1,q2}, {q0,q1,q2} and the transitions from the table above.]

we note that some of the states ({}, {q1}, {q2}, {q1, q2}) cannot be reached from the initial state, which means that they can be omitted without changing the language. Hence we obtain the following automaton:

[Transition diagram of the reachable part of D(C): {q0} loops on 0 and goes to {q0, q1} on 1; {q0, q1} goes to {q0, q2} on 0 and to {q0, q1, q2} on 1; {q0, q2} goes to {q0} on 0 and to {q0, q1} on 1; {q0, q1, q2} goes to {q0, q2} on 0 and loops on 1.]

We still have to convince ourselves that the DFA D(A) accepts the same language as the NFA A, i.e. we have to show that L(A) = L(D(A)). As a lemma we show that the extended transition functions coincide:

Lemma 2.1  δ̂A(S, w) = δ̂D(A)(S, w)

The results of both functions are sets of states of the NFA A: for the left hand side because the extended transition function of an NFA returns sets of states, and for the right hand side because the states of D(A) are sets of states of A.

Proof: We show this by induction over the length of the word w; let's write |w| for the length of a word.

|w| = 0: Then w = ε and we have

δ̂D(A)(S, ε) = S           by (1)
            = δ̂A(S, ε)    by (3)

|w| = n + 1: Then w = xv with |v| = n and we have

δ̂D(A)(S, xv) = δ̂D(A)(δD(A)(S, x), v)         by (2)
             = δ̂A(δD(A)(S, x), v)            by ind.hyp.
             = δ̂A(∪{δA(q, x) | q ∈ S}, v)    by the definition of δD(A)
             = δ̂A(S, xv)                     by (4)

Theorem 2.2  L(A) = L(D(A))

Proof:

w ∈ L(A)
⟺ δ̂A(S, w) ∩ FA ≠ {}        Definition of L(A) for NFAs
⟺ δ̂D(A)(S, w) ∩ FA ≠ {}     Lemma 2.1
⟺ δ̂D(A)(S, w) ∈ FD(A)       Definition of FD(A)
⟺ w ∈ L(D(A))               Definition of L(D(A)) for DFAs

Corollary 2.3 NFAs and DFAs recognize the same class of languages.

Proof: We have noticed that DFAs are just a special case of NFAs. On the other hand the subset construction introduced above shows that for every NFA we can find a DFA which recognizes the same language.
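Mechanically, the subset construction is a graph search over sets of NFA states. Below is a sketch of my own (names invented, transition function of C hard-coded) which computes exactly the four reachable states found above together with their transition table:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SubsetConstruction {
    static Set<Integer> delta(int q, char x) {
        Set<Integer> r = new HashSet<>();
        if (q == 0) { r.add(0); if (x == '1') r.add(1); }
        if (q == 1) r.add(2);
        return r;
    }

    public static void main(String[] args) {
        // Explore only the subsets reachable from S = {q0}; this yields
        // the four-state DFA from the notes rather than all of P(Q).
        Set<Set<Integer>> visited = new HashSet<>();
        Deque<Set<Integer>> todo = new ArrayDeque<>();
        Map<Set<Integer>, Map<Character, Set<Integer>>> table = new HashMap<>();
        Set<Integer> start = new HashSet<>();
        start.add(0);
        todo.add(start);
        visited.add(start);
        while (!todo.isEmpty()) {
            Set<Integer> s = todo.remove();
            Map<Character, Set<Integer>> row = new HashMap<>();
            for (char x : new char[] { '0', '1' }) {
                Set<Integer> t = new HashSet<>();
                for (int q : s) t.addAll(delta(q, x)); // deltaD(S, x)
                row.put(x, t);
                if (visited.add(t)) todo.add(t);       // new subset discovered
            }
            table.put(s, row);
        }
        table.forEach((s, row) -> System.out.println(s + " -> " + row));
    }
}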
3 Regular expressions

3.1 What are regular expressions?

Examples of regular expressions are:

hallo
hallo + hello
h(a + e)llo
a* b*
(ε + b)(ab)* (ε + a)

Formally, regular expressions over an alphabet Σ are defined inductively:

1. ε is a regular expression.
2. ∅ is a regular expression.
3. x is a regular expression, for each symbol x ∈ Σ.
4. If E and F are regular expressions then E + F is a regular expression.
5. If E and F are regular expressions then EF (i.e. just one after the other) is a regular expression.
6. If E is a regular expression then E* is a regular expression.

As in arithmetic there are some conventions how to read regular expressions:

* binds stronger than sequencing and +. E.g. we read ab* as a(b*). We have to use parentheses to enforce the other reading, (ab)*.

Sequencing binds stronger than +. E.g. we read ab + cd as (ab) + (cd). To enforce another reading we have to use parentheses, as in a(b + c)d.

3.2 The meaning of regular expressions

We now know what regular expressions are, but what do they mean? For this purpose we shall first define an operation on languages called the Kleene star. Given a language L we define

L* = {w0 w1 ... wn-1 | n ∈ N ∧ ∀i < n. wi ∈ L}

Intuitively, L* contains all the words which can be formed by concatenating an arbitrary number of words in L. This includes the empty word, since the number may be 0. As an example consider L = {a, ab} ⊆ {a, b}*: we have ε, a, ab, aab, aba ∈ L* but b ∉ L*.

Let us introduce the following notation for the i-fold repetition of a word:

w^i = ww...w  (i times)

We now assign to each regular expression E a language L(E):

1. L(ε) = {ε}
2. L(∅) = {}
3. L(x) = {x}, where x ∈ Σ.
4. L(E + F) = L(E) ∪ L(F)
5. L(EF) = {wv | w ∈ L(E) ∧ v ∈ L(F)}
6. L(E*) = L(E)*

Let us calculate the meaning of the example expressions:

hallo: We have L(h) = {h} and L(a) = {a}. Hence by 5:

L(ha) = {wv | w ∈ {h} ∧ v ∈ {a}} = {ha}

and continuing in the same way, L(hallo) = {hallo}.

hallo + hello: From the previous point we know that L(hallo) = {hallo} and L(hello) = {hello}. Hence we have L(hallo + hello) = {hallo, hello}.

a* b*: We have L(a*) = {a^i | i ∈ N} and L(b*) = {b^j | j ∈ N}. I.e. L(a* b*) = {a^i b^j | i, j ∈ N} is the set of all words which start with a (possibly empty) sequence of as followed by a (possibly empty) sequence of bs.

(ε + b)(ab)* (ε + a): Let's analyze the parts: L(ε + b) = {ε, b}, L((ab)*) = {(ab)^i | i ∈ N}, L(ε + a) = {ε, a}. Putting these together, the expression describes the set of all words in which as and bs alternate.
3.3 Translating regular expressions to NFAs

Theorem 3.1 For each regular expression E we can construct an NFA N(E) s.t. L(N(E)) = L(E), i.e. the automaton accepts the language described by the regular expression.

Proof: We do this again by induction on the syntax of regular expressions:

1. N(∅):

[Diagram N(∅): a single initial state with no transitions and no final states.]

This automaton will reject everything (it has got no final states), and hence

L(N(∅)) = {} = L(∅)

2. N(ε):

[Diagram N(ε): a single state which is both initial and final, with no transitions.]

This automaton accepts the empty word but rejects everything else, hence:

L(N(ε)) = {ε} = L(ε)

3. N(x):

[Diagram N(x): an initial state with a single arrow labelled x into a final state.]

This automaton only accepts the word x, hence:

L(N(x)) = {x} = L(x)

4. N(E + F): We merge the diagrams for N(E) and N(F) into one:

[Diagram: N(E) and N(F) drawn side by side, together forming N(E + F).]

I.e. given

N(E) = (QE, Σ, δE, SE, FE)
N(F) = (QF, Σ, δF, SF, FF)

we use the disjoint union operation on sets (see the MCS lecture notes) to put the two automata next to each other: the states, transitions, initial states and final states of N(E + F) are those of N(E) together with those of N(F). Since we proceed by induction we are allowed to assume that

L(N(E)) = L(E)
L(N(F)) = L(F)

and hence

L(N(E + F)) = L(N(E)) ∪ L(N(F)) = L(E) ∪ L(F) = L(E + F)

5. N(EF): We want to put the two automata N(E) and N(F) in series. We do this by connecting the final states of N(E) with the initial states of N(F) in the way explained below. In the diagram I only depicted one initial and one final state of each of the automata, although there may be several of them:

[Diagram: N(E) followed by N(F), with the final states of N(E) linked to the initial states of N(F), forming N(EF).]

Here is how we construct N(EF) from

N(E) = (QE, Σ, δE, SE, FE)
N(F) = (QF, Σ, δF, SF, FF)

The states of N(EF) are the disjoint union of the states of N(E) and N(F):

QEF = QE + QF

The transition function of N(EF) contains all the transitions of N(E) and N(F) (as for N(E + F)), and for each state q of N(E) which has a transition to a final state of N(E) we add a transition with the same label to all the initial states of N(F).

The initial states of N(EF) are the initial states of N(E), and additionally the initial states of N(F) if there is an initial state of N(E) which is also a final state.

The final states of N(EF) are the final states of N(F):

FEF = {(1, q) | q ∈ FF}

We now set

N(EF) = (QEF, Σ, δEF, SEF, FEF)

6. N(E*): We construct N(E*) from N(E) by merging initial and final states of N(E) in a way similar to the previous construction, and we add a new state ⋆ which is initial and final:

[Diagram: N(E) with feedback arrows from the final states back to the initial states, plus an extra state ⋆ which is both initial and final, forming N(E*).]

Given

N(E) = (QE, Σ, δE, SE, FE)

we construct N(E*) as follows. We add one extra state ⋆:

QE* = QE + {⋆}

δE* inherits all transitions from δE, and for each state which has an arrow to a final state labelled x we also add an arrow to all the initial states labelled x.

The initial states of N(E*) are the initial states of N(E) and ⋆:

SE* = {(0, q) | q ∈ SE} ∪ {(1, ⋆)}

The final states of N(E*) are the final states of N(E) and ⋆:

FE* = {(0, q) | q ∈ FE} ∪ {(1, ⋆)}

We define

N(E*) = (QE*, Σ, δE*, SE*, FE*)

We claim that

L(N(E*)) = {w0 w1 ... wn-1 | n ∈ N ∧ ∀i < n. wi ∈ L(N(E))} = L(E*)

As an example we translate the regular expression a* b*:

[Diagrams: N(a); N(a*), obtained from N(a) by the star construction; N(b*); and N(a* b*), obtained by putting N(a*) and N(b*) in series.]

Now, you may observe that this automaton, though correct, is unnecessarily complicated, since we could have just used:

[Diagram: a much smaller equivalent automaton for a* b*.]
3.4 Summing up

From the previous section we know that a language given by a regular expression is also recognized by an NFA. What about the other way: can a language recognized by a finite automaton (DFA or NFA) also be described by a regular expression? The answer is yes:

Theorem 3.2 (Theorem 3.4, page 91 in [HMU01]) Given a DFA A there is a regular expression R(A) which defines the same language, L(A) = L(R(A)).

We omit the proof (which can be found in [HMU01] on pp. 91-93). However, we conclude:

Corollary 3.3 Given a language L, the following are equivalent:

1. There is a regular expression E s.t. L = L(E).
2. There is an NFA A s.t. L = L(A).
3. There is a DFA A s.t. L = L(A).

A language with these properties is called a regular language.
4 The pumping lemma

4.1

The pumping lemma: Given a regular language L, there is a number n ∈ N (the pumping number) s.t. every word w ∈ L with |w| ≥ n can be split into three parts w = xyz s.t.

1 ≤ |y| ∧ |xy| ≤ n

and xy^i z ∈ L for all i ∈ N.

Proof: For a regular language L there exists a DFA A s.t. L = L(A). Let us assume that A has got n states. Now if A accepts a word w with |w| ≥ n it must have visited a state q twice:

[Diagram: a run of A on w = xyz, where x leads from the initial state to q, y leads from q back to q (a cycle), and z leads from q to a final state.]

We choose q s.t. it is the first cycle, hence |xy| ≤ n. We also know that y is non-empty (otherwise there is no cycle). Now consider what happens if we feed a word of the form xy^i z to the automaton, i.e. instead of y it contains an arbitrary number of repetitions of y, including the case i = 0, where y is just left out. The automaton has to accept all such words, and hence xy^i z ∈ L.

4.2

As a first application we show that the language of squares L = {1^(k²) | k ∈ N} is not regular. Assume L were regular; then there is a pumping number n, and we consider w = 1^(n²) ∈ L. Since |w| = n² ≥ n, the pumping lemma gives us a splitting w = xyz with 1 ≤ |y|, |xy| ≤ n, and in particular

xyyz ∈ L

that is |xyyz| is a square. However we know that

n² = |w| = |xyz| < |xyyz| ≤ n² + n     since |y| ≤ n
                          < n² + 2n + 1 = (n + 1)²

To summarize, we have

n² < |xyyz| < (n + 1)²

That is |xyyz| lies between two subsequent squares. But then it cannot be a square itself, and hence we have a contradiction to xyyz ∈ L. We conclude L is not regular.
Given a word w we write w^R for the word read backwards, e.g. (abc)^R = cba. Formally this can be defined as

ε^R = ε
(xw)^R = w^R x

We use this to define the language of even length palindromes

Lpali = {ww^R | w ∈ Σ*}

I.e. for Σ = {a, b} we have abba ∈ Lpali. Using the intuition that finite automata can only use finite memory, it should be clear that this language is not regular, because one has to remember the first half of the word to check whether the 2nd half is the same word read backwards. Indeed, we can show:

Theorem 4.4 Given Σ = {a, b}, Lpali is not regular.

Proof: We use the pumping lemma: We assume that Lpali is regular. Now given a pumping number n we construct w = a^n b b a^n ∈ Lpali; this word is certainly longer than n. From the pumping lemma we know that there is a splitting of the word w = xyz s.t. |xy| ≤ n, hence y may only contain as, and since y ≠ ε it contains at least one. We conclude that xz ∈ Lpali where xz = a^m b b a^n with m < n. However, this word cannot be a palindrome: read backwards it is a^n b b a^m, which is different from xz because m < n. Hence our assumption that Lpali is regular must be wrong.

The proof works for any alphabet with at least 2 different symbols. However, if Σ contains only one symbol, as in Σ = {1}, then Lpali is the language of words with an even number of 1s, and this is regular: Lpali = L((11)*).
5 Context free grammars

5.1

A context free grammar G = (V, Σ, S, P) is given by a finite set V of variables (nonterminal symbols), a finite alphabet Σ of terminal symbols, a start symbol S ∈ V, and a finite set P of productions of the form A → α with A ∈ V and α ∈ (V ∪ Σ)*.

As an example we consider the grammar G for arithmetical expressions with variables {E, T, F}, terminal symbols {a, +, *, (, )}, start symbol E and the productions

E → E + T | T
T → T * F | F
F → (E) | a

5.2

Using this grammar we can derive the word a + (a * a) starting from E:

E ⇒ E + T
  ⇒ T + T
  ⇒ F + T
  ⇒ a + T
  ⇒ a + F
  ⇒ a + (E)
  ⇒ a + (T)
  ⇒ a + (T * F)
  ⇒ a + (F * F)
  ⇒ a + (a * F)
  ⇒ a + (a * a)

Note that ⇒G here stands for the relation derives in one step and has nothing to do with implication. In the example we have always replaced the leftmost nonterminal symbol (hence it is called a leftmost derivation), but this is not necessary.

Given any grammar G = (V, Σ, S, P) we define the relation derives in one step:

⇒G ⊆ (V ∪ Σ)* × (V ∪ Σ)*
αAβ ⇒G αγβ  ⟺  A → γ ∈ P     where α, β ∈ (V ∪ Σ)* and A ∈ V

The relation derives is defined as

⇒*G ⊆ (V ∪ Σ)* × (V ∪ Σ)*
α0 ⇒*G αn  ⟺  α0 ⇒G α1 ⇒G ... ⇒G αn

this includes the case α ⇒*G α, because n can be 0.

We now say that the language of a grammar L(G) is given by all words (over Σ) which can be derived in any number of steps, i.e.

L(G) = {w ∈ Σ* | S ⇒*G w}

A language which can be given by a context-free grammar is called a context-free language (CFL).
Grammars are also used to describe fragments of natural language, e.g.:

S → NP VP
NP → the N | NP which VP
N → cat | dog
VP → VI | VT NP
VI → barks | bites
VT → bites | catches

5.3 More examples

Some of the languages which we have shown not to be regular are actually context-free. The language {0^n 1^n | n ∈ N} is given by the following grammar:

G = ({S}, {0, 1}, S, {S → ε | 0S1})

Also the language of even length palindromes

{ww^R | w ∈ {a, b}*}

is context-free; it is generated by the grammar

({S}, {a, b}, S, {S → ε | aSa | bSb})
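For instance, in the first grammar we can derive

S ⇒ 0S1 ⇒ 00S11 ⇒ 0011

and hence 0011 ∈ L(G).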
5.4 Parse trees

With each derivation we also associate a parse tree (or derivation tree), which shows the structure of the derivation. As an example consider the tree associated with the derivation of a + (a * a) given before:

[Parse tree of a + (a * a): the root E has children E, +, T; the left E derives a via T and F; the right T derives F, which derives ( E ), inside which E derives a * a via T * F.]

The top of the tree (called its root) is labelled with the start symbol, the other nodes are labelled with nonterminal symbols, and the leaves are labelled by terminal symbols. The word which is derived can be read off from the leaves of the tree. The important property of a parse tree is that each internal node together with its children corresponds to a production in the grammar.

5.5 Ambiguity

Consider the following alternative grammar G′ for arithmetical expressions:

E → E + E | E * E | (E) | a

This grammar is shorter and requires only one variable. Moreover it generates the same language, i.e. we have L(G) = L(G′). But it is ambiguous: for a + a * a we have the following parse trees:

[Two parse trees for a + a * a: in the first the root production is E → E * E, with a + a derived below the left E; in the second the root production is E → E + E, with a * a derived below the right E.]

Each parse tree corresponds to a different way to read the expression, i.e. the first one corresponds to (a + a) * a and the second one to a + (a * a). Depending on which one is chosen, an expression like 2 + 2 * 3 may evaluate to 12 or to 8. Informally, we agree that * binds more than + and hence the 2nd reading is the intended one.

This is actually achieved by the first grammar, which only allows the 2nd reading:

[Parse tree of a + a * a in the first grammar: the root production is E → E + T, and the T derives a * a via T → T * F.]
6 Pushdown Automata

6.1 What is a Pushdown Automaton?

A pushdown automaton (PDA) is a finite automaton extended with a stack. Formally a PDA P is given by:

a finite set of states Q,
a finite alphabet Σ,
a finite stack alphabet Γ,
a transition function δ ∈ Q × (Σ ∪ {ε}) × Γ → Pfin(Q × Γ*),
an initial state q0 ∈ Q,
an initial stack symbol # ∈ Γ,
a set of final states F ⊆ Q.

Here Pfin(A) are the finite subsets of a set A, i.e. this can be defined as

Pfin(A) = {X | X ⊆ A ∧ X is finite}

As an example we consider a PDA which accepts the language of even length palindromes Lpali = {ww^R | w ∈ {0, 1}*}. Its transitions are, schematically (x ranges over input symbols and z over stack symbols):

δ(q0, x, z) = {(q0, xz)}     push the input symbol onto the stack
δ(q0, ε, z) = {(q1, z)}      guess that we are in the middle of the word
δ(q1, x, x) = {(q1, ε)}      pop the top symbol if it matches the input
δ(q1, ε, #) = {(q2, ε)}      accept when only # is left
δ(q, x, z) = {}              everywhere else

[Transition diagram: q0 with a loop labelled x,z,xz; an arrow labelled ε,z,z from q0 to q1; q1 with a loop labelled x,x,ε; an arrow labelled ε,#,ε from q1 to the final state q2.]

6.2 How does a PDA work?

A configuration of a PDA is a triple (q, w, γ): the current state q, the remaining input w, and the stack contents γ ∈ Γ* (written with the top of the stack first). The relation ⊢P (one computation step) is defined by: if (q′, σ) ∈ δ(q, x, z) then (q, xw, zγ) ⊢P (q′, w, σγ), and if (q′, σ) ∈ δ(q, ε, z) then (q, w, zγ) ⊢P (q′, w, σγ) without reading any input.
Written out in full for Σ = {0, 1} and Γ = {0, 1, #}, the transition function δ of our palindrome PDA P0 is:

δ(q0, 0, #) = {(q0, 0#)}
δ(q0, 1, #) = {(q0, 1#)}
δ(q0, 0, 0) = {(q0, 00)}
δ(q0, 1, 0) = {(q0, 10)}
δ(q0, 0, 1) = {(q0, 01)}
δ(q0, 1, 1) = {(q0, 11)}
δ(q0, ε, #) = {(q1, #)}
δ(q0, ε, 0) = {(q1, 0)}
δ(q0, ε, 1) = {(q1, 1)}
δ(q1, 0, 0) = {(q1, ε)}
δ(q1, 1, 1) = {(q1, ε)}
δ(q1, ε, #) = {(q2, ε)}
δ(q, x, z) = {}              everywhere else

As an example we run P0 on the input 0110 ∈ Lpali:

(q0, 0110, #)
⊢P0 (q0, 110, 0#)     pushing 0
⊢P0 (q0, 10, 10#)     pushing 1
⊢P0 (q1, 10, 10#)     guessing the middle
⊢P0 (q1, 0, 0#)       popping 1
⊢P0 (q1, ε, #)        popping 0
⊢P0 (q2, ε, ε)        accepting

The PDA is nondeterministic: it may also guess the middle of the word too early. E.g. we have (q0, 0110, #) ⊢P0 (q0, 110, 0#) ⊢P0 (q1, 110, 0#), and here the PDA gets stuck: there is no possible step from (q1, 110, 0#), because δ(q1, 1, 0) = {}. If we start with a word which is not in the language L (like 0011) then the automaton will always get stuck before reaching a final state.
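Nondeterminism means that a word is accepted if some run succeeds. The following sketch (my own illustration, not from the notes; it hard-codes the transition function of P0 and uses Java records, i.e. Java 16 or later) searches all reachable configurations:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

public class PDA {
    // A configuration of P0: state, remaining input, stack (top = first char).
    record Conf(int q, String in, String st) {}

    // All successor configurations of c under the transition function of P0.
    static Set<Conf> step(Conf c) {
        Set<Conf> r = new HashSet<>();
        if (c.st().isEmpty()) return r;         // no step with an empty stack
        char z = c.st().charAt(0);
        String rest = c.st().substring(1);
        if (c.q() == 0) {
            if (!c.in().isEmpty()) {            // push the next input symbol
                char x = c.in().charAt(0);
                r.add(new Conf(0, c.in().substring(1), x + c.st()));
            }
            r.add(new Conf(1, c.in(), c.st())); // eps-step: guess the middle
        }
        if (c.q() == 1) {
            if (!c.in().isEmpty() && c.in().charAt(0) == z && z != '#')
                r.add(new Conf(1, c.in().substring(1), rest)); // pop on match
            if (z == '#')
                r.add(new Conf(2, c.in(), rest)); // eps-step to the final state
        }
        return r;
    }

    // Search all runs; accept iff some run reaches q2 with empty input.
    static boolean accepts(String w) {
        Deque<Conf> todo = new ArrayDeque<>();
        Set<Conf> seen = new HashSet<>();
        todo.add(new Conf(0, w, "#"));
        while (!todo.isEmpty()) {
            Conf c = todo.remove();
            if (!seen.add(c)) continue;
            if (c.q() == 2 && c.in().isEmpty()) return true;
            todo.addAll(step(c));
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(accepts("0110")); // true
        System.out.println(accepts("0011")); // false
    }
}

The search terminates because the stack never grows beyond the length of the input plus one, so only finitely many configurations are reachable.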
6.3 The language of a PDA

There are two ways to define the language of a PDA P. Acceptance by final state:

L(P) = {w | (q0, w, #) ⊢*P (q, ε, γ) for some q ∈ F and γ ∈ Γ*}

Alternatively one can use acceptance by empty stack, taking the set of all words w s.t. (q0, w, #) ⊢*P (q, ε, ε) for some state q. We can always turn a PDA which uses one acceptance method into one which uses the other. Hence, both acceptance criteria specify the same class of languages.

6.4 Deterministic PDAs

A PDA is deterministic (a DPDA) if in every configuration at most one step is possible, i.e. every set δ(q, x, z) has at most one element, and if δ(q, ε, z) ≠ {} then δ(q, x, z) = {} for all x ∈ Σ.

Our palindrome PDA is nondeterministic: it has to guess the middle of the word. However, if the middle of the word is marked with a special symbol $, i.e. for the language {w$w^R | w ∈ {0, 1}*}, we can give a deterministic PDA:

δ(q0, x, z) = {(q0, xz)}     for x ∈ {0, 1}
δ(q0, $, z) = {(q1, z)}
δ(q1, x, x) = {(q1, ε)}      for x ∈ {0, 1}
δ(q1, ε, #) = {(q2, ε)}
δ(q, x, z) = {}              everywhere else

[Transition diagram: q0 with a loop labelled x,z,xz; an arrow labelled $,z,z from q0 to q1; q1 with a loop labelled x,x,ε; an arrow labelled ε,#,ε from q1 to the final state q2.]

We can check that this automaton is deterministic. In particular the 3rd and the 4th line cannot overlap, because # is not an input symbol.

Different from PDAs in general, the two acceptance methods are not equivalent for DPDAs: acceptance by final state makes it possible to define a bigger class of languages. Hence, we shall always use acceptance by final state for DPDAs.
6.5 Context free grammars and pushdown automata

Context free grammars and pushdown automata describe the same class of languages. Given a context free grammar G = (V, Σ, S, P) we construct a PDA P(G) with a single state q0, stack alphabet V ∪ Σ and initial stack symbol S, which simulates leftmost derivations of G on its stack: for a variable A on top of the stack the PDA may replace A by the right hand side of any production A → α (an ε-step), and a terminal symbol on top of the stack is popped if it matches the next input symbol. P(G) accepts by empty stack. A leftmost derivation

S ⇒ ... ⇒ w

then corresponds precisely to a computation

(q0, w, S) ⊢ ... ⊢ (q0, ε, ε)

and hence we obtain L(P(G)) = L(G).
7 How to implement a parser

7.1

The basic idea of a recursive descent parser is to use the current input symbol to decide which alternative to choose. Grammars which have the property that it is possible to do this are called LL(1) grammars.

First we introduce an end marker $: for a given grammar G = (V, Σ, S, P) we define the augmented grammar G$ = (V′, Σ′, S′, P′) where

V′ = V ∪ {S′}, where S′ is chosen s.t. S′ ∉ V ∪ Σ,
Σ′ = Σ ∪ {$}, where $ is chosen s.t. $ ∉ V ∪ Σ,
P′ = P ∪ {S′ → S$}

The idea is that

L(G$) = {w$ | w ∈ L(G)}

Now for each symbol A ∈ V′ ∪ Σ′ we define

First(A) = {a | A ⇒* aα}
Follow(A) = {a | S′ ⇒* αAaβ}

i.e. First(A) is the set of terminal symbols with which a word derived from A may start, and Follow(A) is the set of symbols which may occur directly after A. We use the augmented grammar to have a marker for the end of the word.

For each production A → α ∈ P we define the set Lookahead(A → α), the set of symbols which indicate that we are in this alternative:

Lookahead(A → B1 B2 ... Bn) = ∪{First(Bi) | ∀1 ≤ k < i. Bk ⇒* ε} ∪ F

where F = Follow(A) if B1 B2 ... Bn ⇒* ε, and F = {} otherwise.

We now say a grammar G is LL(1) iff for each pair of productions A → α, A → β ∈ P with α ≠ β it is the case that

Lookahead(A → α) ∩ Lookahead(A → β) = {}

7.2

As a running example we use the following LL(1) grammar for arithmetical expressions, which defines the same language as the grammars of section 5 but avoids left recursion:

E → T E′
E′ → + T E′ | ε
T → F T′
T′ → * F T′ | ε
F → (E) | a
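As a worked example (my own calculation, following the definitions above), consider the two productions for E′. Since + cannot derive ε we get

Lookahead(E′ → + T E′) = First(+) = {+}

For the empty production the Lookahead set is the Follow set: E′ occurs only at the end of the right hand sides of E → T E′ and E′ → + T E′, so Follow(E′) = Follow(E), and E occurs in F → (E) and in the added production S′ → E$, so

Lookahead(E′ → ε) = Follow(E′) = {), $}

The two sets are disjoint, so the LL(1) condition holds for E′; this calculation is exactly the case distinction in the method parseE1 below.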
7.4

We implement the parser in JAVA. The input is tokenized by a java.util.StringTokenizer st, and the current token is kept in a static variable curr; the method next() advances to the next token:

static void next() {
    try {
        curr = st.nextToken().intern();
    } catch (NoSuchElementException e) {
        curr = null;
    }
}

The call of intern() ensures that equal tokens are represented by the same string object, so that tokens can be compared with == below. We also implement a convenience method error(String) to report an error and terminate the program.
Now we can translate all productions into methods, using the Lookahead sets to determine which alternative to choose. E.g. we translate

E′ → + T E′ | ε

into (using E1 for E′ to follow JAVA rules):

static void parseE1() {
    if (curr == "+") {
        next();
        parseT();
        parseE1();
    } else if (curr == ")" || curr == "$") {
    } else {
        error("Unexpected :" + curr);
    }
}
The basic idea is to:

Translate each occurrence of a terminal symbol into a test that this symbol has been read, and a call of next().
Translate each nonterminal symbol into a call of the method with the same name.
If you have to decide between different productions, use the lookahead sets to determine which one to use.
If you find that there is no way to continue, call error().

We initiate the parsing process by calling next() to read the first symbol and then call parseE(). If after processing parseE() we are at the end marker, then the parsing has been successful:

next();
parseE();
if (curr == "$") {
    System.out.println("OK ");
} else {
    error("End expected");
}

The complete parser can be found at http://www.cs.nott.ac.uk/~txa/g51mal/ParseE0.java.
Actually, we can be a bit more realistic and turn the parser into a simple evaluator by:

using Integer.parseInt to translate the current token into a number (JAVA will raise an exception if this fails), and

calculating the value of the expression read. I.e. we have to change the method interfaces:

static int parseE()
static int parseE1(int x)
static int parseT()
static int parseT1(int x)
static int parseF()
The idea behind parseE1 and parseT1 is to pass the result calculated so far and leave it to the method to incorporate the missing part of the expression. I.e. in the case of parseE1:

static int parseE1(int x) {
    if (curr == "+") {
        next();
        int y = parseT();
        return parseE1(x + y);
    } else if (curr == ")" || curr == "$") {
        return x;
    } else {
        error("Unexpected :" + curr);
        return x;
    }
}
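Putting the pieces together, here is a self-contained sketch of the resulting evaluator (my own reconstruction for illustration; the original is the ParseE0.java file linked above). It reads tokens separated by spaces and parses numbers instead of the terminal a:

import java.util.NoSuchElementException;
import java.util.StringTokenizer;

public class ParseEval {
    static StringTokenizer st;
    static String curr;

    static void next() {
        try { curr = st.nextToken().intern(); }
        catch (NoSuchElementException e) { curr = null; }
    }

    static void error(String msg) {
        System.out.println(msg);
        System.exit(1);
    }

    // E -> T E'
    static int parseE() { return parseE1(parseT()); }

    // E' -> + T E' | eps      Lookahead: {+} vs {), $}
    static int parseE1(int x) {
        if (curr == "+") { next(); return parseE1(x + parseT()); }
        else if (curr == ")" || curr == "$") return x;
        else { error("Unexpected :" + curr); return x; }
    }

    // T -> F T'
    static int parseT() { return parseT1(parseF()); }

    // T' -> * F T' | eps      Lookahead: {*} vs {+, ), $}
    static int parseT1(int x) {
        if (curr == "*") { next(); return parseT1(x * parseF()); }
        else if (curr == "+" || curr == ")" || curr == "$") return x;
        else { error("Unexpected :" + curr); return x; }
    }

    // F -> ( E ) | number
    static int parseF() {
        if (curr == "(") {
            next();
            int x = parseE();
            if (curr != ")") error("Expected )");
            next();
            return x;
        }
        if (curr == null) error("Unexpected end of input");
        int x = 0;
        try { x = Integer.parseInt(curr); }
        catch (NumberFormatException e) { error("Number expected :" + curr); }
        next();
        return x;
    }

    public static void main(String[] args) {
        st = new StringTokenizer("2 + 2 * 3 $");
        next();
        int v = parseE();
        if (curr == "$") System.out.println(v);
        else error("End expected");
    }
}

Running it prints 8, i.e. 2 + 2 * 3 is read as 2 + (2 * 3), exactly the reading we agreed on in section 5.5.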
8 Turing machines

8.1

A Turing machine M = (Q, Σ, Γ, δ, q0, B, F) is given by: a finite set of states Q; an input alphabet Σ; a tape alphabet Γ ⊇ Σ; a transition function

δ ∈ Q × Γ → {stop} ∪ (Γ × Q × {L, R})

an initial state q0 ∈ Q; the blank symbol B ∈ Γ (B ∉ Σ, written ␣ below); and a set of final states F ⊆ Q.

The transition function δ defines how the machine behaves if it is in state q and the symbol on the tape is x. If δ(q, x) = stop then the machine stops; if δ(q, x) = (y, q′, d) then it overwrites x with y, changes into state q′ and moves the head one cell to the left (d = L) or to the right (d = R). Initially the machine is in q0 with the input word on the tape and the head on its first symbol; a word is accepted if the machine stops in a final state.
As an example we construct a Turing machine M which decides the language L = {a^n b^n c^n | n ∈ N}. We use Q = {q0, ..., q6}, Σ = {a, b, c}, Γ = Σ ∪ {X, Y, Z, ␣}, and δ is given by

δ(q0, ␣) = (␣, q6, R)
δ(q0, a) = (X, q1, R)
δ(q1, a) = (a, q1, R)
δ(q1, Y) = (Y, q1, R)
δ(q1, b) = (Y, q2, R)
δ(q2, b) = (b, q2, R)
δ(q2, Z) = (Z, q2, R)
δ(q2, c) = (Z, q3, R)
δ(q3, ␣) = (␣, q5, L)
δ(q3, c) = (c, q4, L)
δ(q4, Z) = (Z, q4, L)
δ(q4, b) = (b, q4, L)
δ(q4, Y) = (Y, q4, L)
δ(q4, a) = (a, q4, L)
δ(q4, X) = (X, q0, R)
δ(q5, Z) = (Z, q5, L)
δ(q5, Y) = (Y, q5, L)
δ(q5, X) = (X, q6, R)
δ(q, x) = stop             everywhere else

with initial state q0, B = ␣ and F = {q6}.

The machine replaces an a by X (q0), then looks for the first b and replaces it by Y (q1), and then looks for the first c and replaces it by a Z (q2). If there are more cs left, it moves left to the next a (q4) and repeats the cycle. Otherwise it checks whether there are no as and bs left (q5), and if so goes into an accepting state (q6).

Graphically the machine can be represented by the following transition diagram, where the edges are labelled by (read symbol, write symbol, move direction):

[Transition diagram of M with states q0, ..., q6 and the transitions listed above, e.g. an edge from q0 to q1 labelled a,X,R and an edge from q5 to q6 labelled X,X,R.]

We see that M accepts aabbcc. Since M never loops, it does actually decide L.
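As with the other machine models, δ translates directly into a program. The following sketch is my own (not from the notes): it hard-codes the table of M, writes the blank ␣ as '_' and grows the tape on demand:

import java.util.HashMap;
import java.util.Map;

public class TM {
    // delta for the machine M above, keyed by "state,symbol".
    // Each entry is (write symbol, new state, direction); missing = stop.
    static final Map<String, Object[]> delta = new HashMap<>();
    static void rule(int q, char x, char y, int p, char d) {
        delta.put(q + "," + x, new Object[] { y, p, d });
    }
    static {
        rule(0, '_', '_', 6, 'R'); rule(0, 'a', 'X', 1, 'R');
        rule(1, 'a', 'a', 1, 'R'); rule(1, 'Y', 'Y', 1, 'R'); rule(1, 'b', 'Y', 2, 'R');
        rule(2, 'b', 'b', 2, 'R'); rule(2, 'Z', 'Z', 2, 'R'); rule(2, 'c', 'Z', 3, 'R');
        rule(3, '_', '_', 5, 'L'); rule(3, 'c', 'c', 4, 'L');
        rule(4, 'Z', 'Z', 4, 'L'); rule(4, 'b', 'b', 4, 'L'); rule(4, 'Y', 'Y', 4, 'L');
        rule(4, 'a', 'a', 4, 'L'); rule(4, 'X', 'X', 0, 'R');
        rule(5, 'Z', 'Z', 5, 'L'); rule(5, 'Y', 'Y', 5, 'L'); rule(5, 'X', 'X', 6, 'R');
    }

    // Run M on w; accept iff the machine stops in the final state q6.
    static boolean accepts(String w) {
        StringBuilder tape = new StringBuilder("_" + w + "_");
        int q = 0, pos = 1; // head on the first symbol of w
        while (true) {
            Object[] t = delta.get(q + "," + tape.charAt(pos));
            if (t == null) return q == 6;               // stop
            tape.setCharAt(pos, (char) t[0]);           // write
            q = (int) t[1];                              // change state
            pos += ((char) t[2] == 'R') ? 1 : -1;        // move the head
            if (pos == tape.length()) tape.append('_');  // extend tape on demand
            if (pos < 0) return false;                   // defensive; M stays on the tape
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("aabbcc")); // true
        System.out.println(accepts("aabbc"));  // false
    }
}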
8.2

The language {a^n b^n c^n | n ≥ 1} is not context free, but it is generated by the following context sensitive grammar with start symbol S:

P = {S → aSBC,
     S → aBC,
     aB → ab,
     CB → BC,
     bB → bb,
     bC → bc,
     cC → cc}
8.3

Turing showed that there are languages which are accepted by a TM (i.e. type 0 languages) but which are undecidable. The technical details of this construction are quite involved, but the basic idea is quite simple and is closely related to Russell's paradox, which we have seen in MCS.

Let's fix a simple alphabet Σ = {0, 1}. As computer scientists we are well aware that everything can be coded up in bits, and hence we accept that there is an encoding of TMs in binary. I.e. given a TM M we write ⌜M⌝ ∈ {0, 1}* for its binary encoding. We assume that the encoding contains its length, s.t. we know where subsequent input on the tape starts.

Now we define the following language

Lhalt = {⌜M⌝w | M halts on input w}

It is easy (although the details are quite daunting) to define a TM which accepts this language: we just simulate M and accept if M stops.

However, Turing showed that there is no TM which decides this language. To see this, let us assume that there is a TM H which decides Lhalt. Now using H we construct a new TM F which is a bit obnoxious: F on input x runs H on xx. If H says yes then F goes into a loop, otherwise (H says no) F stops.

The question is: what happens if we run F on ⌜F⌝? Let us assume it terminates; then H applied to ⌜F⌝⌜F⌝ returns yes, and hence we must conclude that F on ⌜F⌝ loops??? On the other hand, if F with input ⌜F⌝ loops, then H applied to ⌜F⌝⌜F⌝ will stop and reject, and hence we have to conclude that F on ⌜F⌝ will stop?????

This is a contradiction, and hence we must conclude that our assumption that there is a TM H which decides Lhalt is false. We say Lhalt is undecidable.

We have shown that a Turing machine cannot decide whether a program (for a Turing machine) halts. Maybe we could find a more powerful programming language which overcomes this problem? It turns out that all computational formalisms (i.e. programming languages) which can actually be implemented are equal in power and can be simulated by each other; this observation is called the Church-Turing thesis, because it was first formulated by Alonzo Church and Alan Turing in the 1930s.

8.4 Back to Chomsky

At the end of the course we should have another look at the Chomsky hierarchy, which classifies languages based on subclasses of grammars, or equivalently by the different types of automata which recognize them:

[Diagram: the Chomsky hierarchy as nested classes of languages, from the outside in: all languages; type 0 or recursively enumerable languages (Turing machines); decidable languages; type 1 or context sensitive languages; type 2 or context free languages (pushdown automata); type 3 or regular languages (finite automata).]
We have worked our way from the bottom to the top of the hierarchy: starting with finite automata, i.e. computation with a fixed amount of memory, via pushdown automata (finite automata with a stack) to Turing machines (finite automata with a tape). Correspondingly we have introduced different grammatical formalisms: regular expressions, context-free grammars and unrestricted grammars.

Note that at each level there are languages which are on the next level but not on the previous one: {a^n b^n | n ∈ N} is level 2 but not level 3; {a^n b^n c^n} is level 1 but not level 2; and the Halting problem is level 0 but not level 1.

We could have gone the other way: starting with Turing machines and grammars and then introducing restrictions on them, i.e. Turing machines which only use their tape as a stack, and Turing machines which never use the tape apart from reading the input. Again, correspondingly, we can define restrictions on the grammar side: first introduce context-free grammars and then grammars where all productions are of the form A → aB or A → a, with A, B nonterminal symbols and a a terminal symbol. These grammars correspond precisely to regular expressions (I leave this as an exercise).

I believe that Chomsky introduced his hierarchy as a classification of grammars and that the relation to automata was only observed a bit later. This is maybe the reason why he introduced the Type-1 level, which is not so interesting from an automata point of view (unless you are into computational complexity, i.e. resource use; here linear use of memory). It is also the reason why, on the other hand, the decidable languages do not constitute a level: there is no corresponding grammatical formalism (we can even prove this).

References

[GJSB00] James Gosling, Bill Joy, Guy Steele, and Gilad Bracha. The Java Language Specification. Sun Microsystems, Inc., 2nd edition, 2000.

[HMU01] John E. Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison Wesley, 2nd edition, 2001.

[Mar01] Simon Marlow. Happy, the parser generator for Haskell. WWW, 2001.

[Sch99] Uwe Schöning. Theoretische Informatik kurzgefaßt. Spektrum Akademischer Verlag, 3. Auflage, 1999.