Abstract
Polymorphisms in DNA- or RNA-seq data lead to recognisable patterns in a de Bruijn graph representation of the reads obtained by sequencing. Such patterns have been called mouths, or bubbles in the literature. They correspond to two vertex-disjoint directed paths between a source s and a target t. Due to the high number of such bubbles that may be present in real data, their enumeration is a major issue concerning the efficiency of dedicated algorithms. We propose in this paper the first linear delay algorithm to enumerate all bubbles with a given source.
This work was supported by the french ANR MIRI BLAN08-1335497 Project and the ERC Advanced Grant Sisyphe held by Marie-France Sagot. Partially supported by Italian project PRIN AlgoDEEP (2008TFBWL4) of MIUR. The second author received additional support from the Italian PRIN project ’DISCO’.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Robertson, G., et al.: De novo assembly and analysis of RNA-seq data. Nature Methods 7(11), 909–912 (2010)
Sacomoto, G., et al.: KisSplice: de-novo calling alternative splicing events from rna-seq data. In: RECOMB-Seq, BMC Bioinformatics (2012)
Simpson, J.T., et al.: ABySS: A parallel assembler for short read sequence data. Genome Research 19(6), 1117–1123 (2009)
Peterlongo, P., Schnel, N., Pisanti, N., Sagot, M.-F., Lacroix, V.: Identifying SNPs without a Reference Genome by Comparing Raw Reads. In: Chavez, E., Lonardi, S. (eds.) SPIRE 2010. LNCS, vol. 6393, pp. 147–158. Springer, Heidelberg (2010)
Gusfield, D., Eddhu, S., Langley, C.H.: Optimal, efficient reconstruction of phylogenetic networks with constrained recombination. J. Bioinf. and Comput. Biol. 2(1), 173–214 (2004)
Iqbal, Z., Caccamo, M., Turner, I., Flicek, P., McVean, G.: De novo assembly and genotyping of variants using colored de bruijn graphs. Nature Genetics (2012)
Johnson, D.B.: Finding all the elementary circuits of a directed graph. SIAM J. Comput. 4(1), 77–84 (1975)
Pevzner, P.A., Tang, H., Tesler, G.: De novo repeat classification and fragment assembly. In: RECOMB, pp. 213–222 (2004)
Sammeth, M.: Complete alternative splicing events are bubbles in splicing graphs. J. Comput. Biol. 16(8), 1117–1140 (2009)
Tarjan, R.E.: Enumeration of the elementary circuits of a directed graph. SIAM Journal on Computing 2(3), 211–216 (1973)
Tiernan, J.C.: An efficient search algorithm to find the elementary circuits of a graph. Commun. ACM 13(12), 722–726 (1970)
Zerbino, D.R., Birney, E.: Velvet: Algorithms for de novo short read assembly using de bruijn graphs. Genome Research 18(5), 821–829 (2008)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Birmelé, E. et al. (2012). Efficient Bubble Enumeration in Directed Graphs. In: Calderón-Benavides, L., González-Caro, C., Chávez, E., Ziviani, N. (eds) String Processing and Information Retrieval. SPIRE 2012. Lecture Notes in Computer Science, vol 7608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34109-0_13
Download citation
DOI: https://doi.org/10.1007/978-3-642-34109-0_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34108-3
Online ISBN: 978-3-642-34109-0
eBook Packages: Computer ScienceComputer Science (R0)