Abstract
Many classes of functionally related RNA molecules show only weak sequence conservation but a fairly well-conserved secondary structure. Hence, any method that relates RNA sequences in the form of (multiple) alignments should take structural features into account. Since multiple alignments are of great importance for subsequent data analysis, improving the speed and accuracy of such alignments benefits many other analysis problems.
We present a formulation for computing provably optimal, structure-based, multiple RNA alignments and give an algorithm that finds such an optimal (or near-optimal) solution. To solve the resulting computational problem, we propose an algorithm based on Lagrangian relaxation, which has already proved successful in the two-sequence case. We compare our implementation, mLARA, to three programs (clustalW, MARNA, and pmmulti) and demonstrate that we can often compute multiple alignments with consensus structures that have a significantly lower minimum free energy term than those computed by the other programs. Our prototypical experiments show that our new algorithm is competitive and, in contrast to other methods, is applicable to long sequences where standard dynamic programming approaches must fail. Furthermore, the Lagrangian method is capable of handling arbitrary pseudoknot structures.
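As a rough illustration of the general technique named above, the following sketch applies Lagrangian relaxation with subgradient updates to a toy 0/1 knapsack problem. It is not the alignment formulation of the paper: the problem, the function name `lagrangian_bounds`, the step-size rule, and all parameters are assumptions chosen only to show how relaxing a complicating constraint yields an upper bound, while feasible relaxed solutions yield a lower bound.

```python
from typing import List, Tuple


def lagrangian_bounds(
    values: List[float],
    weights: List[int],
    capacity: int,
    iterations: int = 200,
) -> Tuple[float, float, List[int]]:
    """Toy Lagrangian relaxation of: max sum(v*x) s.t. sum(w*x) <= capacity, x binary.

    Hypothetical helper for illustration only. Returns (best lower bound,
    best upper bound, best feasible 0/1 vector found).
    """
    lam = 0.0                       # multiplier for the relaxed capacity constraint
    best_upper = float("inf")       # smallest Lagrangian bound seen so far
    best_lower = 0.0                # value of the empty (always feasible) selection
    best_x = [0] * len(values)

    for k in range(1, iterations + 1):
        # With the constraint moved into the objective, the problem separates:
        # take item i exactly when its reduced profit v_i - lam * w_i is positive.
        x = [1 if v - lam * w > 0 else 0 for v, w in zip(values, weights)]
        used = sum(w * xi for w, xi in zip(weights, x))

        # Value of the relaxed problem: an upper bound on the optimum for lam >= 0.
        upper = sum((v - lam * w) * xi for v, w, xi in zip(values, weights, x))
        upper += lam * capacity
        best_upper = min(best_upper, upper)

        # A relaxed solution that happens to be feasible gives a lower bound.
        if used <= capacity:
            value = sum(v * xi for v, xi in zip(values, x))
            if value > best_lower:
                best_lower, best_x = value, x

        # Subgradient of the dual function at lam.
        g = capacity - used
        if g == 0:
            break                   # capacity met exactly: upper equals this value
        # Diminishing step 1/k (an assumed rule); project back onto lam >= 0.
        lam = max(0.0, lam - g / k)

    return best_lower, best_upper, best_x


if __name__ == "__main__":
    lower, upper, chosen = lagrangian_bounds([10, 7, 4], [5, 4, 3], 7)
    print(lower, upper, chosen)
```

The gap between the two returned bounds indicates how far the best feasible solution found may be from the optimum, which is the sense in which such a method can certify an optimal or near-optimal solution.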
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bauer, M., Klau, G.W., Reinert, K. (2005). Multiple Structural RNA Alignment with Lagrangian Relaxation. In: Casadio, R., Myers, G. (eds) Algorithms in Bioinformatics. WABI 2005. Lecture Notes in Computer Science, vol 3692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11557067_25
DOI: https://doi.org/10.1007/11557067_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29008-7
Online ISBN: 978-3-540-31812-5
eBook Packages: Computer Science, Computer Science (R0)