Abstract
The Extended String-to-String Correction Problem (ESSCP) is defined as the problem of determining, for given strings A and B over alphabet V, a minimum-cost sequence S of edit operations such that S(A) = B. The sequence S may make use of the operations Change, Insert, Delete and Swap, with constant costs WC, WI, WD, and WS respectively. Swap permits any pair of adjacent characters to be interchanged.
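To make the definition concrete, the following is a minimal sketch of applying an edit sequence S to a string A and totalling its cost. The tuple encoding of operations and the function name apply_edits are assumptions made for illustration only; they do not come from the paper.

```python
from typing import List, Tuple

# Hypothetical encoding of edit operations (an assumption, not the paper's):
#   ("change", i, c)  replace the character at position i with c
#   ("insert", i, c)  insert c before position i
#   ("delete", i)     delete the character at position i
#   ("swap", i)       interchange the adjacent characters at i and i + 1


def apply_edits(a: str, ops: List[tuple], wc: float, wi: float,
                wd: float, ws: float) -> Tuple[str, float]:
    """Apply ops to a in order; return the resulting string and total cost."""
    chars = list(a)
    cost = 0.0
    for op in ops:
        kind = op[0]
        if kind == "change":
            _, i, c = op
            chars[i] = c
            cost += wc
        elif kind == "insert":
            _, i, c = op
            chars.insert(i, c)
            cost += wi
        elif kind == "delete":
            _, i = op
            del chars[i]
            cost += wd
        elif kind == "swap":
            _, i = op
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            cost += ws
        else:
            raise ValueError(f"unknown operation: {kind}")
    return "".join(chars), cost
```

ESSCP asks for a sequence of this kind whose total cost is minimal among all sequences that turn A into B; the sketch only evaluates one given sequence and does not search for the optimum.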
The principal results of this paper are:
(1) a brief presentation of an algorithm (the CELLAR algorithm) which solves ESSCP in time O(|A| * |B| * |V|^s * s), where s = min(4WC, WI+WD)/WS + 1;
(2) presentation of polynomial-time algorithms for the cases (a) WS = 0, (b) WS > 0, WC = WI = WD = ∞;
(3) proof that ESSCP, with WI < WC = WD = ∞, 0 < WS < ∞, suitably encoded, is NP-complete. (The remaining case, WS = ∞, reduces ESSCP to the string-to-string correction problem of [1], where an O(|A| * |B|) algorithm is given.) Thus, “almost all” ESSCPs can be solved in deterministic polynomial time, but the general problem is NP-complete.
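For the case in which swaps are unavailable (infinite swap cost), the problem involves only Change, Insert and Delete. The following is a minimal sketch of a textbook quadratic dynamic program for that case, assuming constant costs passed as parameters. The function name correction_cost is hypothetical, and this is not the CELLAR algorithm.

```python
def correction_cost(a: str, b: str, wc: float, wi: float, wd: float) -> float:
    """Minimum cost of turning a into b using Change, Insert and Delete only."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the minimum cost of turning a[:i] into b[:j].
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * wd  # delete every character of a[:i]
    for j in range(1, cols):
        d[0][j] = j * wi  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            # Matching characters are kept at no cost; otherwise pay for a Change.
            change = d[i - 1][j - 1] + (0.0 if a[i - 1] == b[j - 1] else wc)
            delete = d[i - 1][j] + wd
            insert = d[i][j - 1] + wi
            d[i][j] = min(change, delete, insert)
    return d[-1][-1]
```

The table has (|A| + 1) * (|B| + 1) entries and each is filled in constant time, which matches the quadratic bound quoted for this case.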