Abstract
Motivated by the problem in computational biology of reconstructing the series of chromosome inversions by which one organism evolved from another, we consider the problem of computing the shortest series of reversals that transform one permutation to another. The permutations describe the order of genes on corresponding chromosomes, and a reversal takes an arbitrary substring of elements and reverses their order.
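To make the reversal operation and the distance being computed concrete, here is a minimal sketch. It is not either of the algorithms developed in the paper: it is a brute-force breadth-first search that is only practical for very small permutations. The function names `reverse_segment` and `reversal_distance` are illustrative assumptions.

```python
from collections import deque
from itertools import combinations


def reverse_segment(perm, i, j):
    """Return a copy of the tuple perm with positions i..j (inclusive) reversed."""
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


def reversal_distance(source, target):
    """Fewest reversals turning source into target, found by exhaustive BFS.

    Illustrative only: the number of states explored grows factorially with
    the permutation length, so this is usable only for small inputs.
    """
    source, target = tuple(source), tuple(target)
    if sorted(source) != sorted(target):
        raise ValueError("inputs must be permutations of the same elements")
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        perm, dist = queue.popleft()
        # Try every substring of length at least 2 as the reversed segment.
        for i, j in combinations(range(len(perm)), 2):
            nxt = reverse_segment(perm, i, j)
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    # Every permutation is reachable, since adjacent swaps are reversals.
    raise AssertionError("unreachable")
```

For example, `reversal_distance([2, 3, 1], [1, 2, 3])` needs two reversals: no single reversal sorts `2 3 1`, but reversing the first two elements and then the whole sequence does.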
For this problem we develop two algorithms: a greedy approximation algorithm that finds a solution provably close to optimal in O(n²) time and O(n) space for an n-element permutation, and a branch and bound exact algorithm that finds an optimal solution in O(mL(n, n)) time and O(n²) space, where m is the size of the branch and bound search tree and L(n, n) is the time to solve a linear program of n variables and n constraints. The greedy algorithm is the first to come within a constant factor of the optimum, and guarantees a solution that uses no more than twice the minimum number of reversals. The lower and upper bounds of the branch and bound algorithm are a novel application of maximum weight matchings, shortest paths, and linear programming.
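As an illustration of what a lower bound on the number of reversals can look like, the sketch below counts breakpoints. This is an elementary, well-known bound and is not necessarily the one used in the paper, whose bounds the abstract attributes to matchings, shortest paths, and linear programming. It assumes the permutation is over 1..n and is being sorted to the identity; the function names are illustrative assumptions.

```python
def breakpoints(perm):
    """Count adjacent pairs in the framed sequence 0, perm, n+1 that are not consecutive values.

    Assumes perm is a permutation of 1..n and the target is the identity.
    """
    n = len(perm)
    framed = [0, *perm, n + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if abs(a - b) != 1)


def breakpoint_lower_bound(perm):
    """Lower bound on the number of reversals needed to sort perm.

    A reversal changes adjacencies only at its two endpoints, so it removes
    at most two breakpoints; the identity has none. Hence at least
    ceil(breakpoints / 2) reversals are required.
    """
    return (breakpoints(perm) + 1) // 2
```

For `[2, 3, 1]` the framed sequence `0 2 3 1 4` has three breakpoints, so at least two reversals are needed. A bound of this kind is what lets a branch and bound search discard partial solutions that cannot beat the best solution found so far.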
In a series of experiments we study the performance of an implementation. For random permutations we find that the average difference between the upper and lower bounds is less than 3 reversals for n≤50. Due to the tightness of these bounds we can solve to optimality random permutations on 30 elements in a few minutes of computer time.
© 1993 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kececioglu, J., Sankoff, D. (1993). Exact and approximation algorithms for the inversion distance between two chromosomes. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds) Combinatorial Pattern Matching. CPM 1993. Lecture Notes in Computer Science, vol 684. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0029799
DOI: https://doi.org/10.1007/BFb0029799
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-56764-6
Online ISBN: 978-3-540-47732-7
eBook Packages: Springer Book Archive