Thompson's algorithm maintains state lists of length approximately n and processes the string,
also of length n, for a total of O(n^2) time. (The run time is superlinear because the regular
expression is not held constant as the input grows. For a regular expression of length m run on
text of length n, the Thompson NFA requires O(mn) time.)
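The state-list idea can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the notes: the NFA is encoded as a dictionary mapping (state, symbol) to a set of next states, None stands for the epsilon symbol, and the names epsilon_closure, nfa_accepts, ABB and OPT_A are made up for the example. The outer loop runs n times and each step touches at most every NFA state, which is where the O(mn) bound comes from.

```python
def epsilon_closure(states, delta):
    """All states reachable from `states` using only epsilon (None) transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, None), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def nfa_accepts(delta, start, finals, text):
    """Simulate the NFA on `text` by tracking the set of currently active states."""
    current = epsilon_closure({start}, delta)
    for ch in text:                          # n iterations
        moved = set()
        for q in current:                    # at most m active states
            moved |= delta.get((q, ch), set())
        current = epsilon_closure(moved, delta)
    return bool(current & finals)


# Hand-built NFA for (a|b)*abb: start state 0, final state 3.
ABB = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}

# Hand-built NFA for a?a with one epsilon transition: start state 0, final state 2.
OPT_A = {
    (0, None): {1},
    (0, "a"): {1},
    (1, "a"): {2},
}

print(nfa_accepts(ABB, 0, {3}, "aabb"))   # True
print(nfa_accepts(OPT_A, 0, {2}, "aaa"))  # False
```

Because the active states are kept in a set, no state is ever processed twice for the same input position, so there is no backtracking.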
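A Nondeterministic Finite-state Automaton (NFA) is a tuple (Q,S,q0,F,d) where Q is a finite set of
states, S is a finite set of symbols (the alphabet), q0 in Q is the initial state, F is the subset
of Q containing the final (accepting) states, and d is the transition function, which maps a state
and a symbol (or the epsilon symbol e) to a subset of Q.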
http://www.cse.chalmers.se/edu/course/TIN321/lectures/proglang-05.html
Finite-state automata
A Deterministic Finite-state Automaton (DFA) is like an NFA, except that the value of the
transition function d is a single state in Q (instead of a subset of Q) and there are no epsilon
transitions (transitions that don't consume a symbol). In other words, the next state is uniquely
determined by the current state and the input symbol.
It is customary to draw diagrams with circles for states and arrows for the transition function:
you can find them in the course book, and we will draw them on the blackboard.
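As a complement to the diagrams, running a DFA can be sketched as a single table lookup per input symbol. The encoding below is an assumption made for this example (a dictionary from (state, symbol) to one next state, with a missing entry treated as an implicit dead state), and the names dfa_accepts and EVEN_ONES are hypothetical.

```python
def dfa_accepts(delta, start, finals, text):
    """Run a DFA: exactly one current state, one table lookup per input symbol."""
    state = start
    for ch in text:
        if (state, ch) not in delta:
            # Assumption of this encoding: a missing entry means "dead state".
            return False
        state = delta[(state, ch)]
    return state in finals


# DFA over the symbols 0 and 1 accepting strings with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

print(dfa_accepts(EVEN_ONES, "even", {"even"}, "1001"))  # True
print(dfa_accepts(EVEN_ONES, "even", {"even"}, "10"))    # False
```

Since there is only ever one current state, the work per symbol does not depend on the size of the automaton.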
1) NFA generation. Compile the expression into an NFA, using transitions with the epsilon
symbol e (Book 3.7.4). This translation is compositional: simply traverse the syntax tree of the
regular expression. The size of the automaton is linear in the size of the expression: O(|r|).
Compared to DFAs, however, NFAs are more difficult to implement and less efficient to run:
O(|r| * n), where n is the input length.
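A minimal sketch of such a compositional translation follows. It is not taken from the book: the syntax tree is assumed to be nested tuples such as ("sym", "a"), ("seq", r1, r2), ("alt", r1, r2), ("star", r) and ("eps",), None stands for the epsilon symbol, and build_nfa and accepts are hypothetical names. Each tree node contributes exactly two fresh states, which makes the linear size bound visible.

```python
from itertools import count


def build_nfa(tree):
    """Translate a regex syntax tree into (delta, start, final), one fragment per node."""
    fresh = count()
    delta = {}

    def edge(src, sym, dst):
        delta.setdefault((src, sym), set()).add(dst)

    def go(node):
        kind = node[0]
        s, f = next(fresh), next(fresh)      # two new states per syntax-tree node
        if kind == "eps":
            edge(s, None, f)
        elif kind == "sym":
            edge(s, node[1], f)
        elif kind == "seq":
            s1, f1 = go(node[1])
            s2, f2 = go(node[2])
            edge(s, None, s1)
            edge(f1, None, s2)
            edge(f2, None, f)
        elif kind == "alt":
            s1, f1 = go(node[1])
            s2, f2 = go(node[2])
            edge(s, None, s1)
            edge(s, None, s2)
            edge(f1, None, f)
            edge(f2, None, f)
        elif kind == "star":
            s1, f1 = go(node[1])
            edge(s, None, s1)                # enter the loop body
            edge(s, None, f)                 # or skip it (zero repetitions)
            edge(f1, None, s1)               # repeat the body
            edge(f1, None, f)                # or leave the loop
        else:
            raise ValueError(f"unknown node kind: {kind!r}")
        return s, f

    start, final = go(tree)
    return delta, start, final


def accepts(tree, text):
    """Check the generated NFA by simulating it with sets of active states."""
    delta, start, final = build_nfa(tree)

    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for r in delta.get((q, None), ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    current = closure({start})
    for ch in text:
        current = closure({r for q in current for r in delta.get((q, ch), ())})
    return final in current


# Syntax tree for (a|b)*a: six nodes, hence twelve NFA states.
TREE = ("seq", ("star", ("alt", ("sym", "a"), ("sym", "b"))), ("sym", "a"))
print(accepts(TREE, "bba"))  # True
print(accepts(TREE, "ab"))   # False
```

The construction never inspects more than the current node and the fragments returned for its children, which is what compositional means here.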
2) Determinization. Transform the NFA into a DFA (Book 3.7.1). Starting from the initial state,
go through the states and, for each symbol, create a DFA state that collects all the NFA states
the symbol can lead to (subset construction).
The DFA can still be bigger than necessary, i.e. have more states than needed.
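The subset construction described in step 2 can be sketched as a worklist loop. As before, the encoding is an assumption for illustration only: the NFA is a dictionary from (state, symbol) to a set of states with None as the epsilon symbol, each DFA state is a frozenset of NFA states, and closure, determinize and NFA_DELTA are hypothetical names.

```python
def closure(states, delta):
    """Epsilon closure: everything reachable via epsilon (None) transitions."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, None), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def determinize(delta, start, finals, alphabet):
    """Subset construction. DFA states are frozensets of NFA states."""
    dfa_start = frozenset(closure({start}, delta))
    dfa_delta = {}
    seen = {dfa_start}
    todo = [dfa_start]
    while todo:
        subset = todo.pop()
        for a in alphabet:
            # Collect every NFA state that symbol `a` can lead to from this subset.
            moved = {r for q in subset for r in delta.get((q, a), ())}
            target = frozenset(closure(moved, delta))
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset is accepting if it contains at least one accepting NFA state.
    dfa_finals = {subset for subset in seen if subset & finals}
    return dfa_delta, dfa_start, dfa_finals


# Hand-built NFA for (a|b)*abb: start state 0, final state 3.
NFA_DELTA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}

dfa_delta, dfa_start, dfa_finals = determinize(NFA_DELTA, 0, {3}, "ab")
print(len({subset for (subset, _) in dfa_delta}))  # 4
```

Only subsets reachable from the start subset are ever created. The empty frozenset, when it is reached, plays the role of a dead state.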
3) Minimization. Merge equivalent states (Book 3.9.6), i.e. states from which the automaton
recognizes the same set of strings.
The resulting DFA permits analysis that is linear in the length of the input string: O(n). But its
size can still be exponential, in the worst case, in the length of the regular expression:
O(2^|r|).
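Merging equivalent states can be sketched with a simple partition refinement. This is one common way to do it, not necessarily the procedure in the book. The sketch assumes a total transition table (every state has an entry for every symbol) and that all states are reachable; minimize, run_dfa, STATES and DELTA are hypothetical names. States start out split into accepting and non-accepting, and a block is split further whenever two of its states lead to different blocks on some symbol.

```python
def minimize(states, alphabet, delta, start, finals):
    """Merge equivalent DFA states. Returns (min_delta, min_start, min_finals)."""
    # Initial partition: accepting states versus non-accepting states.
    block = {q: int(q in finals) for q in states}
    while True:
        ids = {}
        refined = {}
        for q in states:
            # Signature: own block plus the block reached on each symbol.
            sig = (block[q],) + tuple(block[delta[(q, a)]] for a in alphabet)
            refined[q] = ids.setdefault(sig, len(ids))
        stable = len(ids) == len(set(block.values()))
        block = refined
        if stable:                            # no block was split: done
            break
    min_delta = {(block[q], a): block[delta[(q, a)]] for q in states for a in alphabet}
    min_finals = {block[q] for q in finals}
    return min_delta, block[start], min_finals


def run_dfa(delta, start, finals, text):
    """Run a DFA with a total transition table over symbols that occur in it."""
    state = start
    for ch in text:
        state = delta[(state, ch)]
    return state in finals


# A five-state DFA for (a|b)*abb in which states A and C are equivalent.
STATES = ["A", "B", "C", "D", "E"]
DELTA = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "D",
    ("C", "a"): "B", ("C", "b"): "C",
    ("D", "a"): "B", ("D", "b"): "E",
    ("E", "a"): "B", ("E", "b"): "C",
}

min_delta, min_start, min_finals = minimize(STATES, "ab", DELTA, "A", {"E"})
print(len({q for (q, _) in min_delta}))                  # 4
print(run_dfa(min_delta, min_start, min_finals, "abb"))  # True
```

The loop stops as soon as a refinement pass produces no new block, so the number of passes is bounded by the number of states.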