Myhill-Nerode Theorem
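7.1 Word equivalence. Consider a language $L$ over an alphabet $\Sigma$. Two words $x, y \in \Sigma^*$ are $L$-equivalent, written $x \equiv_L y$, iff for all words $z \in \Sigma^*$, we have $xz \in L$ iff $yz \in L$. For example, if $\Sigma = \{0, 1\}$ and $A = \Sigma^* 0 \Sigma$, then $\equiv_A$ has four equivalence classes:

$A_1 = \Sigma^* 00$
$A_2 = \Sigma^* 01$
$A_3 = \Sigma^* 10 \cup \{0\}$
$A_4 = \Sigma^* 11 \cup \{1\} \cup \{\varepsilon\}$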
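The definition of $L$-equivalence can be explored mechanically on small inputs. The following is a minimal sketch, assuming that the example language $A$ is the set of words whose second-to-last letter is 0 (that is, $\Sigma^* 0 \Sigma$). The helper names `in_A`, `words`, `signature`, and `classes` are hypothetical. The sketch only tests suffixes $z$ up to a fixed length, which is an approximation of the definition in general; for this particular $A$ suffixes of length at most 2 are enough, because membership of $xz$ no longer depends on $x$ once $z$ has two or more letters.

```python
from itertools import product

SIGMA = "01"

def in_A(w: str) -> bool:
    """Assumed example language: the second-to-last letter of w is 0."""
    return len(w) >= 2 and w[-2] == "0"

def words(max_len: int):
    """All words over SIGMA of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(SIGMA, repeat=n):
            yield "".join(letters)

def signature(x: str, max_suffix: int) -> tuple:
    """For each suffix z up to max_suffix letters, record whether xz is in A."""
    return tuple(in_A(x + z) for z in words(max_suffix))

def classes(max_word: int, max_suffix: int) -> dict:
    """Group all words up to max_word letters by their bounded signature."""
    groups = {}
    for x in words(max_word):
        groups.setdefault(signature(x, max_suffix), []).append(x)
    return groups

groups = classes(max_word=4, max_suffix=2)
for members in groups.values():
    print(members[:6])
```

Two words land in the same group exactly when no tested suffix separates them, so each printed group is a finite sample of one equivalence class.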
If $\Sigma = \{0, 1\}$ and $B = \{0^n 1^n : n \ge 0\}$, then $\equiv_B$ has infinitely many equivalence classes:

$B_\bot = \{0^n 1^m : m > n \ge 0\} \cup \Sigma^* 1 \Sigma^* 0 \Sigma^*$
$B_0 = \{0^n 1^n : n \ge 1\}$
$B_1 = \{0^n 1^{n-1} : n \ge 2\}$
$B_2 = \{0^n 1^{n-2} : n \ge 3\}$
$B_3 = \{0^n 1^{n-3} : n \ge 4\}$
$\ldots$

together with one singleton class $\{0^n\}$ for every $n \ge 0$. A word in $B_k$ contains at least one 1 and can be completed to a word of $B$ only by the suffix $1^k$, whereas $0^n$ can be completed by every suffix $0^m 1^{n+m}$; no word of $B_\bot$ can be completed at all.

The number of equivalence classes of $\equiv_L$ is called the index of the language $L$. In particular, the index of $A$ is 4, and the index of $B$ is $\infty$.

7.2 Word equivalence is a right congruence. The equivalence $\equiv_L$ has the following two properties:
(1) If $x \equiv_L y$ and $x \in L$, then $y \in L$.
(2) If $x \equiv_L y$, then $xa \equiv_L ya$ for every letter $a \in \Sigma$.
They are immediate consequences of the definition.

7.2 Myhill-Nerode theorem, part 1. The equivalence relation $\equiv_L$ characterizes exactly what the state of an automaton that defines $L$ needs to remember about the read portion of the input: if the read portion of the input is $x$, then the state needs to remember the equivalence class $[x]_{\equiv_L}$. This is sufficient, because if $x \equiv_L y$, then it does not matter if the read portion of the input was $x$ or $y$; all that matters for deciding whether to accept or reject is the future portion of the input, say $z$, because $xz \in L$ iff $yz \in L$. It is also necessary, because if $x \not\equiv_L y$, then there is some possible future portion $z$ of the input such that $xz$ needs to be accepted and $yz$ rejected (or vice versa). This is formalized by the following two theorems.

Theorem 7A. If the index of a language $A$ is $k$, then there is a $k$-state DFA $M_A$ such that $L(M_A) = A$.

To see this, let $A$ be a language with index $k$. Define the following finite automaton $M_A = (Q, \Sigma, \delta, q_0, F)$, whose states are the $\equiv_A$-equivalence classes:
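$Q = \{[x]_A : x \in \Sigma^*\}$, where $[x]_A$ is the $\equiv_A$-equivalence class of $x$.
$q \in \delta(p, a)$ iff there exists a word $x \in p$ such that $xa \in q$.
$q_0 = [\varepsilon]_A$.
$q \in F$ iff there exists a word $x \in q$ such that $x \in A$.

From (2) it follows that $M_A$ is deterministic. From (1) it follows that for all states $q \in Q$, either $q \cap A = \emptyset$ or $q \subseteq A$. Therefore, for every word $x \in \Sigma^*$, after reading input $x$, the DFA $M_A$ ends up in the state $[x]_A$, which accepts if $x \in A$ and rejects if $x \notin A$. For $A = \Sigma^* 0 \Sigma$:

[Figure: the DFA $M_A$ with the four states $A_1$, $A_2$, $A_3$, $A_4$.]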
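The construction of $M_A$ can be sketched in code by representing each equivalence class through one shortest representative word. This is only an illustration under stated assumptions: the example language is taken to be $\Sigma^* 0 \Sigma$ (second-to-last letter is 0), a class is identified by the behaviour of a word on all suffixes of length at most 2 (sufficient for this language, not in general), and the names `cls`, `build_M_A`, and `accepts` are hypothetical.

```python
from itertools import product

SIGMA = "01"

def in_A(w: str) -> bool:
    """Assumed example language: the second-to-last letter of w is 0."""
    return len(w) >= 2 and w[-2] == "0"

def words(max_len: int):
    """All words over SIGMA of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(SIGMA, repeat=n):
            yield "".join(letters)

def cls(x: str) -> tuple:
    """Finite stand-in for [x]_A: behaviour of x on all suffixes of length <= 2."""
    return tuple(in_A(x + z) for z in words(2))

def build_M_A(max_word: int = 3):
    """States are classes; each class is remembered via its shortest representative."""
    rep = {}
    for x in words(max_word):
        rep.setdefault(cls(x), x)
    # delta([x], a) = [xa], computed from the representative x of each class
    delta = {(q, a): cls(x + a) for q, x in rep.items() for a in SIGMA}
    q0 = cls("")
    final = {q for q, x in rep.items() if in_A(x)}
    return rep, delta, q0, final

def accepts(delta, q0, final, w: str) -> bool:
    """Run the deterministic automaton on w."""
    q = q0
    for a in w:
        q = delta[(q, a)]
    return q in final

rep, delta, q0, final = build_M_A()
for (q, a), r in sorted(delta.items(), key=lambda t: (rep[t[0][0]], t[0][1])):
    print(f"[{rep[q] or 'eps'}] --{a}--> [{rep[r] or 'eps'}]")
```

Because the transition out of a class is computed from a single representative, the sketch relies on property (2): any other word of the same class would lead to the same target class.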
It can be shown that for every DFA $D$, the quotient automaton $D/{\equiv_D}$ is equal (up to renaming of states) to the DFA $M_{L(D)}$.

7.3 Myhill-Nerode theorem, part 2. In the previous section we saw that every language with finite index is regular; now we will see that every regular language has a finite index.

Theorem 7B. For every $k$-state DFA $M$, the index of $L(M)$ is at most $k$.

To see this, consider a DFA $M$ over the alphabet $\Sigma$. For two words $x, y \in \Sigma^*$, let $x \sim_M y$ iff after reading input $x$, the DFA $M$ ends up in the same state as after reading input $y$. The number of $\sim_M$-equivalence classes is at most the number of states of $M$, say $k$ (it is exactly the number of reachable states). If $x \sim_M y$, then $x \equiv_{L(M)} y$, because inputs $xz$ and $yz$ always lead to the same state. Hence there can be no more $\equiv_{L(M)}$-equivalence classes than there are $\sim_M$-equivalence classes; that is, the index of $L(M)$ is at most $k$.

Theorem 7B can be used to show that a language is nonregular: it suffices to argue that its index is $\infty$. Theorem 7B can be strengthened to show that if $M$ is reduced (i.e., $M = M/{\equiv_M}$), then the index of $L(M)$ is exactly $k$. From that and the results of part 1 (Theorem 7A and the remark on quotient automata) it follows that for every regular language $L$ there is a unique (up to renaming of states) minimal DFA, that is, one with the fewest states, and that this DFA can be obtained by applying the quotient construction to any DFA for $L$.

7.4 Nondeterminism and minimality. There may not be a unique minimal NFA for a regular language. For example, the following three automata all define the same language, namely 0, and are all reduced:
[Figure: three NFAs, with transitions labeled 0,1.]
7.5 Bisimilarity. For DFAs, if two states $p$ and $q$ are equivalent, then for all input letters $a$, also $\delta(p, a)$ and $\delta(q, a)$ are equivalent. By contrast, for NFAs, two states $p$ and $q$ may be equivalent even though there are no equivalent states in $\delta(p, a)$ and $\delta(q, a)$. For instance, for the following NFA $N$, states $A$ and $B$ are equivalent even though no two of $C$, $D$, and $E$ are equivalent:
[Figure: the NFA $N$ with states $A$ through $G$, including transitions labeled 0,1.]
$L(N_A) = L(N_B) = L(N_G) = \{00, 01\}$
$L(N_C) = \{0\}$
$L(N_D) = \{1\}$
$L(N_E) = \{0, 1\}$
$L(N_F) = \{\varepsilon\}$

For NFAs we can define a stronger concept than state equivalence called state bisimilarity, which corresponds to recursive state equivalence: two states $p$ and $q$ of an NFA $M = (Q, \Sigma, \delta, q_0, F)$ are bisimilar, written $p \equiv^B_M q$, iff the following three conditions are satisfied:
(1) $p \equiv_M q$ (this can be weakened to $p \equiv^0_M q$).
(2) For all $a \in \Sigma$ and all $p' \in \delta(p, a)$, there exists a state $q' \in \delta(q, a)$ such that $p' \equiv^B_M q'$.
(3) For all $a \in \Sigma$ and all $q' \in \delta(q, a)$, there exists a state $p' \in \delta(p, a)$ such that $p' \equiv^B_M q'$.

For the example automaton $N$, we have $A \not\equiv^B_N B$ but $B \equiv^B_N G$.
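On an acyclic NFA the three conditions can be evaluated by plain recursion, since every recursive call moves strictly closer to the states without successors. The sketch below does this for a hypothetical transition relation `DELTA` that is consistent with the languages listed for $N$; the actual diagram is not available here, so the edges (in particular those of $G$) are an assumption. Condition (1) is used in its weakened form, read here as "both states accepting or both non-accepting", which is also an assumption about the meaning of 0-step equivalence. The function names are illustrative only.

```python
from functools import lru_cache

SIGMA = "01"
ACCEPTING = {"F"}
# Assumed edges, chosen to match the listed languages L(N_A), ..., L(N_F).
DELTA = {
    ("A", "0"): {"C", "D"},
    ("B", "0"): {"E"},
    ("G", "0"): {"E"},
    ("C", "0"): {"F"},
    ("D", "1"): {"F"},
    ("E", "0"): {"F"},
    ("E", "1"): {"F"},
}

def succ(p: str, a: str) -> set:
    """Successor set delta(p, a); empty if no edge is listed."""
    return DELTA.get((p, a), set())

@lru_cache(maxsize=None)
def language(p: str) -> frozenset:
    """Words accepted from state p (finite because the graph is acyclic)."""
    result = {""} if p in ACCEPTING else set()
    for a in SIGMA:
        for p2 in succ(p, a):
            result |= {a + w for w in language(p2)}
    return frozenset(result)

@lru_cache(maxsize=None)
def bisimilar(p: str, q: str) -> bool:
    """Recursive check of conditions (1)-(3); terminates only on acyclic NFAs."""
    if (p in ACCEPTING) != (q in ACCEPTING):  # weakened condition (1)
        return False
    for a in SIGMA:
        # condition (2): every a-successor of p has a bisimilar a-successor of q
        if not all(any(bisimilar(p2, q2) for q2 in succ(q, a)) for p2 in succ(p, a)):
            return False
        # condition (3): every a-successor of q has a bisimilar a-successor of p
        if not all(any(bisimilar(p2, q2) for p2 in succ(p, a)) for q2 in succ(q, a)):
            return False
    return True

print(sorted(language("A")), sorted(language("B")))
print(bisimilar("A", "B"), bisimilar("B", "G"))
```

Under these assumed edges, `language` shows that $A$ and $B$ accept the same words, while `bisimilar` separates them: the successor $C$ of $A$ has no bisimilar partner among the successors of $B$.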
7.6 Minimization of nondeterministic automata. When applied to NFAs, the minimization algorithm (suitably adjusted) computes state bisimilarity, not state equivalence. Recall the NFA N from Section 7.5:
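$$
\begin{array}{c|cccccc}
  & B & C & D & E & F & G \\
\hline
A & 2 & 1 & 1 & 1 & 0 & 2 \\
B &   & 1 & 1 & 1 & 0 &   \\
C &   &   & 1 & 1 & 0 & 1 \\
D &   &   &   & 1 & 0 & 1 \\
E &   &   &   &   & 0 & 1 \\
F &   &   &   &   &   & 0
\end{array}
$$

Each entry gives the round in which the pair of states is distinguished. The entry for the pair $B$, $G$ is empty: these two states are never distinguished.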
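The table entries can be reproduced with a round-by-round marking procedure. The following is a sketch of one plausible adjustment of the minimization algorithm to NFAs, not necessarily the exact variant intended in these notes: round 0 marks pairs that differ in acceptance, and round $i \ge 1$ marks a pair if some successor of one state has only partners marked in earlier rounds among the corresponding successors of the other state. The transition relation `DELTA` is an assumption chosen to match the languages listed in Section 7.5, and `marking_rounds` is a hypothetical name.

```python
from itertools import combinations

STATES = "ABCDEFG"
SIGMA = "01"
ACCEPTING = {"F"}
# Assumed edges, chosen to match the listed languages of N.
DELTA = {
    ("A", "0"): {"C", "D"},
    ("B", "0"): {"E"},
    ("G", "0"): {"E"},
    ("C", "0"): {"F"},
    ("D", "1"): {"F"},
    ("E", "0"): {"F"},
    ("E", "1"): {"F"},
}

def succ(p: str, a: str) -> set:
    """Successor set delta(p, a); empty if no edge is listed."""
    return DELTA.get((p, a), set())

def marking_rounds() -> dict:
    """Map each distinguished pair (p, q), p < q, to the round that marks it."""
    rounds = {}
    for p, q in combinations(STATES, 2):
        if (p in ACCEPTING) != (q in ACCEPTING):
            rounds[(p, q)] = 0  # round 0: exactly one of the two states accepts

    def marked(p, q, snapshot):
        return p != q and tuple(sorted((p, q))) in snapshot

    def unmatched(p, q, snapshot):
        # some a-successor of p has no still-unmarked partner in delta(q, a)
        return any(all(marked(p2, q2, snapshot) for q2 in succ(q, a))
                   for a in SIGMA for p2 in succ(p, a))

    i = 1
    while True:
        snapshot = set(rounds)  # only marks from earlier rounds count
        new = [(p, q) for p, q in combinations(STATES, 2)
               if (p, q) not in snapshot
               and (unmatched(p, q, snapshot) or unmatched(q, p, snapshot))]
        if not new:
            return rounds
        for pair in new:
            rounds[pair] = i
        i += 1

rounds = marking_rounds()
for p in STATES[:-1]:
    print(p, [rounds.get((p, q), "-") for q in STATES if q > p])
```

Pairs that never receive a round number are the bisimilar ones. Freezing the marks in `snapshot` at the start of each round keeps the round numbers meaningful, since a pair marked in round $i$ cannot influence other pairs before round $i + 1$.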