KMP Algorithm
Description
The design of the Knuth-Morris-Pratt algorithm follows from a tight analysis of
the Morris-Pratt algorithm: a closer look at that algorithm shows that the
length of its shifts can be improved.
Here x is the pattern, of length m, and y is the text, of length n. Let
kmpNext[i] be the length of the longest border of x[0 .. i-1] followed by a
character c different from x[i], and -1 if no such tagged border exists, for
0 < i ≤ m. When a mismatch occurs between x[i] and y[i+j], with the window at
left position j on the text, the comparisons can resume after the shift between
characters x[kmpNext[i]] and y[i+j] without missing any occurrence of x in y,
and without backtracking on the text (see figure 7.1). The value of kmpNext[0]
is set to -1.
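To make the definition concrete, the following sketch computes kmpNext directly
from it by trying every candidate border length, longest first. It is a
deliberately naive illustration, not the linear-time preprocessing; the function
name tagged_borders is hypothetical, and the pattern is assumed to be a
NUL-terminated C string, so that x[m] differs from every pattern character.

```c
#include <string.h>

/* Hypothetical helper: fills kmpNext[0 .. m] straight from the definition.
   Assumes x is NUL-terminated, because x[m] is read when i == m. */
void tagged_borders(const char *x, int m, int kmpNext[]) {
    kmpNext[0] = -1;
    for (int i = 1; i <= m; i++) {
        kmpNext[i] = -1;                      /* default: no tagged border */
        for (int k = i - 1; k >= 0; k--) {    /* longest candidate first */
            /* x[0 .. k-1] is a border of x[0 .. i-1] when it equals
               x[i-k .. i-1]; it is tagged when x[k] differs from x[i]. */
            if (memcmp(x, x + i - k, (size_t)k) == 0 && x[k] != x[i]) {
                kmpNext[i] = k;
                break;
            }
        }
    }
}
```

For the pattern GCAGAGAG (m = 8) this gives kmpNext = -1 0 0 -1 1 -1 1 -1 1.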
The table kmpNext can be computed in O(m) space and time before the
searching phase, applying the same searching algorithm to the pattern itself,
as if x=y.
The searching phase can be performed in O(m+n) time. The Knuth-Morris-
Pratt algorithm performs at most 2n-1 text character comparisons during the
searching phase. The delay (maximal number of comparisons for a single text
character) is bounded by log_Φ(m), where Φ = (1 + √5)/2 is the golden ratio.
The C code
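/* Computes kmpNext[0 .. m] for the pattern x of length m.
   x must be NUL-terminated: the last iteration reads x[m]. */
void preKmp(char *x, int m, int kmpNext[]) {
   int i, j;
   i = 0;
   j = kmpNext[0] = -1;
   while (i < m) {
      /* Fall back through shorter borders until one can be extended by x[i]. */
      while (j > -1 && x[i] != x[j])
         j = kmpNext[j];
      i++;
      j++;
      /* If the next characters agree, this border would fail in the same
         way as position i, so reuse its own fallback value. */
      if (x[i] == x[j])
         kmpNext[i] = kmpNext[j];
      else
         kmpNext[i] = j;
   }
}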
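
/* Reports every occurrence of x (length m) in y (length n).
   XSIZE (an upper bound on m + 1) and OUTPUT are assumed to be
   defined elsewhere. */
void KMP(char *x, int m, char *y, int n) {
   int i, j, kmpNext[XSIZE];
   /* Preprocessing */
   preKmp(x, m, kmpNext);
   /* Searching */
   i = j = 0;
   while (j < n) {
      /* On a mismatch, shift the pattern; j never moves backwards. */
      while (i > -1 && x[i] != y[j])
         i = kmpNext[i];
      i++;
      j++;
      if (i >= m) {
         OUTPUT(j - i);
         i = kmpNext[i];
      }
   }
}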
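As a usage illustration, the sketch below wraps the same two phases in a
self-contained form that stores match positions in a caller-supplied array and
counts text character comparisons, so the 2n-1 bound can be checked on a given
input. The names kmp_table and kmp_search_all, the output-array interface and
the heap-allocated table are assumptions made for this example; a non-empty,
NUL-terminated pattern is assumed.

```c
#include <stdlib.h>
#include <string.h>

/* Same table construction as preKmp; reads x[m], so x must be NUL-terminated. */
static void kmp_table(const char *x, int m, int *kmpNext) {
    int i = 0;
    int j = -1;
    kmpNext[0] = -1;
    while (i < m) {
        while (j > -1 && x[i] != x[j])
            j = kmpNext[j];
        i++;
        j++;
        kmpNext[i] = (x[i] == x[j]) ? kmpNext[j] : j;
    }
}

/* Hypothetical wrapper: writes up to max_out match positions into out,
   returns the total number of matches (or -1 if allocation fails), and
   stores the number of text character comparisons in *comparisons. */
int kmp_search_all(const char *x, const char *y,
                   int *out, int max_out, long *comparisons) {
    int m = (int)strlen(x);
    int n = (int)strlen(y);
    int found = 0;
    int i = 0, j = 0;
    int *kmpNext;

    *comparisons = 0;
    if (m == 0)
        return 0;                       /* empty pattern: nothing to report */
    kmpNext = malloc((size_t)(m + 1) * sizeof *kmpNext);
    if (kmpNext == NULL)
        return -1;
    kmp_table(x, m, kmpNext);

    while (j < n) {
        while (i > -1) {
            (*comparisons)++;           /* one comparison of x[i] with y[j] */
            if (x[i] == y[j])
                break;
            i = kmpNext[i];
        }
        i++;
        j++;
        if (i >= m) {
            if (found < max_out)
                out[found] = j - i;     /* left position of the occurrence */
            found++;
            i = kmpNext[i];
        }
    }
    free(kmpNext);
    return found;
}
```

Called with x = GCAGAGAG and y = GCATCGCAGAGAGTATACAGTACG, the wrapper finds a
single occurrence, at position 5.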
The example
Preprocessing phase