Abstract
The amount of information that can be inferred or predicted from the genomic sequence of an organism is astonishing, and string algorithms are critical to this process. This paper gives an overview of two particular problems that arise in computational molecular biology research, and of recent algorithmic developments in solving them.
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Iliopoulos, C.S., Kundu, R., Mohamed, M., Vayani, F. (2016). Popping Superbubbles and Discovering Clumps: Recent Developments in Biological Sequence Analysis. In: Kaykobad, M., Petreschi, R. (eds) WALCOM: Algorithms and Computation. WALCOM 2016. Lecture Notes in Computer Science, vol 9627. Springer, Cham. https://doi.org/10.1007/978-3-319-30139-6_1
DOI: https://doi.org/10.1007/978-3-319-30139-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30138-9
Online ISBN: 978-3-319-30139-6
eBook Packages: Computer Science (R0)