Abstract
The Local Alignment problem is a classical problem with applications in biology. Given two input strings and a scoring function on pairs of letters, one is asked to find the substrings of the two input strings that are most similar under the scoring function. The best algorithms for Local Alignment run in time that is roughly quadratic in the string length. It is a big open problem whether substantially subquadratic algorithms exist. In this paper we show that for all $\varepsilon > 0$, an $O(n^{2-\varepsilon})$ time algorithm for Local Alignment on strings of length $n$ would imply breakthroughs on three longstanding open problems: it would imply that for some $\delta > 0$, 3SUM on $n$ numbers is in $O(n^{2-\delta})$ time, CNF-SAT on $n$ variables is in $O((2-\delta)^n)$ time, and Max Weight 4-Clique is in $O(n^{4-\delta})$ time. Our result for CNF-SAT also applies to the easier problem of finding the longest common substring of binary strings with don’t cares. We also give strong conditional lower bounds for the more general Multiple Local Alignment problem on $k$ strings, under both $k$-wise and SP scoring, and for other string similarity problems such as Global Alignment with gap penalties and normalized Longest Common Subsequence.
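For orientation, the roughly quadratic baseline mentioned in the abstract can be sketched as a textbook dynamic program in the style of Smith and Waterman. This is a minimal illustration, not the paper's construction: the function names `local_alignment_score` and `match_mismatch`, the linear gap penalty `gap`, and the particular score values are assumptions chosen for the example, and the paper's exact problem definition may differ in how gaps are treated.

```python
from typing import Callable


def local_alignment_score(
    s: str,
    t: str,
    score: Callable[[str, str], int],
    gap: int = -1,  # assumed linear gap penalty, for illustration only
) -> int:
    """Return the best alignment score over all pairs of substrings of s and t.

    Runs in O(len(s) * len(t)) time and O(len(t)) space.
    """
    prev = [0] * (len(t) + 1)  # DP row for the previous prefix of s
    best = 0
    for i in range(1, len(s) + 1):
        curr = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            curr[j] = max(
                0,                                        # start a fresh alignment here
                prev[j - 1] + score(s[i - 1], t[j - 1]),  # align s[i-1] with t[j-1]
                prev[j] + gap,                            # s[i-1] aligned to a gap
                curr[j - 1] + gap,                        # t[j-1] aligned to a gap
            )
            best = max(best, curr[j])
        prev = curr
    return best


def match_mismatch(a: str, b: str) -> int:
    """Hypothetical scoring function: +2 for equal letters, -1 otherwise."""
    return 2 if a == b else -1
```

Calling `local_alignment_score("xxabcxx", "yyabcyy", match_mismatch)` fills a table with one cell per pair of positions, which is where the quadratic running time comes from; the lower bounds in the paper suggest that improving on this by a polynomial factor would require a breakthrough.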
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Abboud, A., Williams, V.V., Weimann, O. (2014). Consequences of Faster Alignment of Sequences. In: Esparza, J., Fraigniaud, P., Husfeldt, T., Koutsoupias, E. (eds) Automata, Languages, and Programming. ICALP 2014. Lecture Notes in Computer Science, vol 8572. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43948-7_4
DOI: https://doi.org/10.1007/978-3-662-43948-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43947-0
Online ISBN: 978-3-662-43948-7
eBook Packages: Computer Science, Computer Science (R0)