Abstract
Simultaneous alignment and secondary structure prediction of RNA sequences is often referred to as “RNA structural alignment.” One class of structural alignment methods is based on the principles proposed by Sankoff in 1985. The Sankoff algorithm simultaneously folds and aligns two or more sequences. Its advantage over methods that separate the folding and alignment steps is that it makes better predictions; its disadvantage is that it is slower and requires more computer memory. The computational resources required are so high that more than a decade passed before the first implementation of a Sankoff-style algorithm was published. With the faster computers available today and the improved heuristics used in current implementations, however, Sankoff-based methods have become practical. This chapter describes the methods based on the Sankoff algorithm, together with the heuristics that all practical implementations use to run in reasonable time and memory.
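To make the resource claim concrete, the short sketch below tabulates the commonly quoted growth terms for the unrestricted Sankoff algorithm: time proportional to L^(3N) and memory proportional to L^(2N) for N sequences of length L. The abstract itself does not state these figures, and the function name `sankoff_cost` is hypothetical. Constant factors are ignored, so the numbers indicate scaling only, not actual run times.

```python
def sankoff_cost(length: int, n_seqs: int = 2) -> tuple[int, int]:
    """Return (time_steps, table_cells) growth terms for an unrestricted
    Sankoff-style dynamic program, ignoring constant factors.

    Assumption (not stated in the abstract): time grows as L^(3N) and
    memory as L^(2N) for N sequences of length L.
    """
    if length < 1 or n_seqs < 2:
        raise ValueError("need length >= 1 and at least two sequences")
    time_steps = length ** (3 * n_seqs)
    table_cells = length ** (2 * n_seqs)
    return time_steps, table_cells


if __name__ == "__main__":
    # Pairwise case: doubling the length multiplies the time term by 64
    # and the memory term by 16.
    for length in (50, 100, 200):
        steps, cells = sankoff_cost(length)
        print(f"L={length}: ~{steps:.1e} steps, ~{cells:.1e} table cells")
```

Under these assumptions, even a pair of 200-nucleotide sequences implies on the order of 10^13 elementary steps. This is why practical implementations restrict the search space with heuristics.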
Acknowledgements
This work is supported by the Danish Council for Independent Research (Technology and Production Sciences), the Danish Council for Strategic Research (Programme Commission on Strategic Growth Technologies), as well as the Danish Center for Scientific Computing.
Copyright information
© 2014 Springer Science+Business Media New York
About this protocol
Cite this protocol
Havgaard, J.H., Gorodkin, J. (2014). RNA Structural Alignments, Part I: Sankoff-Based Approaches for Structural Alignments. In: Gorodkin, J., Ruzzo, W. (eds) RNA Sequence, Structure, and Function: Computational and Bioinformatic Methods. Methods in Molecular Biology, vol 1097. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-709-9_13
DOI: https://doi.org/10.1007/978-1-62703-709-9_13
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-708-2
Online ISBN: 978-1-62703-709-9
eBook Packages: Springer Protocols