New Algorithms for the Spaced Seeds

Gao, Xin; Li, Shuai Cheng; Lu, Yinan

doi:10.1007/978-3-540-73814-5_5

Xin Gao¹, Shuai Cheng Li¹ & Yinan Lu¹,²

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4613)

Included in the following conference series:

International Workshop on Frontiers in Algorithmics

Abstract

The best known algorithm computes the sensitivity of a given spaced seed on a random region in O((M + L)|B|) time, where M is the length of the seed, L is the length of the random region, and |B| is the size of the seed-compatible-suffix set, which is exponential in the number of 0s in the seed. We developed two algorithms that improve this running time: the first runs in O(|B′|² ML) time, where B′ is a subset of B; the second runs in O((M|B|)^2.236 log(L/M)) time, which is much smaller than the original running time when L is large. We also developed a Monte Carlo algorithm that is guaranteed to quickly find a near-optimal seed with high probability.
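
To make the quantity in the abstract concrete, the following is a minimal sketch of a baseline sensitivity computation. It is not either of the paper's algorithms, nor the O((M + L)|B|) method: it is a plain dynamic program over the last M − 1 region bits, so its state space is exponential in M rather than in the number of 0s. It assumes the random region is a sequence of independent positions that each match with probability `p` (an assumption the abstract does not state), and the function name `seed_sensitivity` is hypothetical.

```python
def seed_sensitivity(seed: str, L: int, p: float) -> float:
    """Probability that `seed` hits a random region of length L at least once.

    `seed` is a string over {'0', '1'}: '1' marks a position that must match,
    '0' a don't-care position. Each region position is assumed to be an
    independent match with probability p (illustrative model, not from the paper).
    """
    M = len(seed)
    if L < M:
        return 0.0                       # the seed does not fit in the region
    mask = int(seed, 2)                  # 1-bits are the required match positions
    keep = (1 << (M - 1)) - 1            # keeps the most recent M-1 region bits

    # no_hit[s] = probability that the last M-1 bits equal s and no hit so far
    no_hit = {0: 1.0}
    for i in range(L):
        nxt = {}
        for state, prob in no_hit.items():
            for bit, q in ((1, p), (0, 1.0 - p)):
                window = (state << 1) | bit      # last M bits, newest in the low bit
                # A hit is only possible once a full window of M bits has been read.
                if i >= M - 1 and (window & mask) == mask:
                    continue                     # hit: remove this mass from no_hit
                s = window & keep
                nxt[s] = nxt.get(s, 0.0) + prob * q
        no_hit = nxt
    return 1.0 - sum(no_hit.values())


if __name__ == "__main__":
    # Contiguous seed "11" on a region of length 3 with p = 0.5:
    # the hitting regions are 110, 011 and 111, so the sensitivity is 3/8.
    print(seed_sensitivity("11", 3, 0.5))
```

The cost of this sketch is roughly O(L · 2^M) time, which is why methods parameterized by the seed-compatible-suffix set B, as described above, matter for realistic seed lengths.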

Author information

Authors and Affiliations

David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, N2L 6P7, Canada
Xin Gao, Shuai Cheng Li & Yinan Lu
College of Computer Science and Technology of Jilin University, 10 Qianwei Road, Changchun, Jilin Province, 130012, China
Yinan Lu

Editor information

Franco P. Preparata, Qizhi Fang

About this paper

Cite this paper

Gao, X., Li, S.C., Lu, Y. (2007). New Algorithms for the Spaced Seeds. In: Preparata, F.P., Fang, Q. (eds) Frontiers in Algorithmics. FAW 2007. Lecture Notes in Computer Science, vol 4613. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73814-5_5

DOI: https://doi.org/10.1007/978-3-540-73814-5_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73813-8
Online ISBN: 978-3-540-73814-5
eBook Packages: Computer Science, Computer Science (R0)
