Abstract
We consider the problem of exactly learning the parameters of a linear RNA energy model from secondary structure data. We derive a necessary and sufficient condition for learnability of the parameters, based on computing the convex hull of the union of translated Newton polytopes of the input sequences [15]. The set of learned energy parameters is characterized as the convex cone generated by the normal vectors to those facets of the resulting polytope that are incident to the origin. In practice, the sufficient condition may not be satisfied by the entire training data set; hence, it is often desirable to compute a maximal subset of the training data for which the sufficient condition is satisfied. We show that this problem is NP-hard in general for a feature space of arbitrary dimension. Using a randomized greedy algorithm, we select a subset of the RNA STRAND v2.0 database that satisfies the sufficient condition for a model that counts A-U, C-G, and G-U base pairs separately. The set of learned energy parameters includes the experimentally measured energies of A-U, C-G, and G-U pairs; hence, our parameter set is in agreement with the Turner parameters.
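The learnability condition can be illustrated on a toy scale. The sketch below is not the authors' implementation. It assumes the three-feature base pair counting model (A-U, C-G, G-U counts), nested structures only, and a minimum hairpin loop of 3 unpaired bases. It enumerates the feature vectors of all structures of each sequence, translates them by the feature vector of the known structure, and then searches for a parameter vector h that gives the known structure strictly lower energy than every alternative feature vector. The search uses a perceptron update as a simple stand-in for the convex hull computation described above; all function names are hypothetical, and a result of None only means that no h was found within the iteration cap.

```python
# Toy sketch: feature vectors are (AU, CG, GU) base pair counts.
PAIR_INDEX = {("A", "U"): 0, ("U", "A"): 0,
              ("C", "G"): 1, ("G", "C"): 1,
              ("G", "U"): 2, ("U", "G"): 2}

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def all_feature_vectors(seq, min_loop=3):
    """Pair-count vectors of every nested structure of seq (assumed model)."""
    memo = {}
    def rec(i, j):  # structures on the inclusive interval seq[i..j]
        if i >= j:
            return {(0, 0, 0)}
        if (i, j) not in memo:
            out = set(rec(i + 1, j))  # case 1: position i is unpaired
            for k in range(i + min_loop + 1, j + 1):  # case 2: i pairs with k
                idx = PAIR_INDEX.get((seq[i], seq[k]))
                if idx is None:
                    continue
                unit = tuple(int(t == idx) for t in range(3))
                for u in rec(i + 1, k - 1):
                    for v in rec(k + 1, j):
                        out.add(add(add(u, v), unit))
            memo[(i, j)] = out
        return memo[(i, j)]
    return rec(0, len(seq) - 1)

def structure_features(seq, dotbracket):
    """Pair-count vector of one structure given in dot-bracket notation."""
    counts, stack = [0, 0, 0], []
    for k, ch in enumerate(dotbracket):
        if ch == "(":
            stack.append(k)
        elif ch == ")":
            i = stack.pop()
            counts[PAIR_INDEX[(seq[i], seq[k])]] += 1
    return tuple(counts)

def difference_vectors(data, min_loop=3):
    """Union over the data of (alternative features - known features)."""
    diffs = set()
    for seq, db in data:
        known = structure_features(seq, db)
        for c in all_feature_vectors(seq, min_loop):
            if c != known:
                diffs.add(tuple(a - b for a, b in zip(c, known)))
    return sorted(diffs)  # sorted for a deterministic update order

def find_parameters(diffs, max_epochs=1000):
    """Perceptron search for h with <d, h> > 0 for every difference d."""
    h = (0, 0, 0)
    for _ in range(max_epochs):
        updated = False
        for d in diffs:
            if dot(d, h) <= 0:
                h = add(h, d)
                updated = True
        if not updated:
            return h
    return None  # inconclusive: no separating h found within the cap

# Known structures that use every available pair: an h should exist.
learnable = [("GGGAAACCC", "(((...)))"), ("AAAGGGUUU", "(((...)))")]
diffs_ok = difference_vectors(learnable)
h_ok = find_parameters(diffs_ok)

# Known structure with 2 of 3 possible C-G pairs: structures with fewer and
# with more C-G pairs both exist, so no h can make it strictly optimal.
diffs_bad = difference_vectors([("GGGAAACCC", "(.(...).)")])
h_bad = find_parameters(diffs_bad)
```

In the second data set the difference vectors (0, -1, 0) and (0, 1, 0) both occur, so the origin lies strictly between two points of the translated polytope and is not one of its vertices. This matches the geometric reading of the condition, under the simplifying assumptions stated above.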
References
Andronescu, M., Bereg, V., Hoos, H.H., Condon, A.: RNA STRAND: the RNA secondary structure and statistical analysis database. BMC Bioinformatics 9, 340 (2008)
Andronescu, M., Condon, A., Hoos, H.H., Mathews, D.H., Murphy, K.P.: Computational approaches for RNA energy parameter estimation. RNA 16, 2304–2318 (2010)
Backofen, R., Tsur, D., Zakov, S., Ziv-Ukelson, M.: Sparse RNA folding: Time and space efficient algorithms. In: Kucherov, G., Ukkonen, E. (eds.) CPM 2009 Lille. LNCS, vol. 5577, pp. 249–262. Springer, Heidelberg (2009)
Bartel, D.P.: MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116(2), 281–297 (2004)
Bernhart, S.H., Tafer, H., Mückstein, U., Flamm, C., Stadler, P.F., Hofacker, I.L.: Partition function and base pairing probabilities of RNA heterodimers. Algorithms Mol. Biol. 1, 3 (2006)
Brantl, S.: Antisense-RNA regulation and RNA interference. Biochim. Biophys. Acta 1575(1-3), 15–25 (2002)
Burge, S.W., Daub, J., Eberhardt, R., Tate, J., Barquist, L., Nawrocki, E.P., Eddy, S.R., Gardner, P.P., Bateman, A.: Rfam 11.0: 10 years of RNA families. Nucleic Acids Res. 41(database issue), D226–D232 (2013)
Chitsaz, H., Backofen, R., Sahinalp, S.C.: biRNA: Fast RNA-RNA binding sites prediction. In: Salzberg, S.L., Warnow, T. (eds.) WABI 2009. LNCS, vol. 5724, pp. 25–36. Springer, Heidelberg (2009)
Chitsaz, H., Salari, R., Cenk Sahinalp, S., Backofen, R.: A partition function algorithm for interacting nucleic acid strands. Bioinformatics 25(12), i365–i373 (2009); Also ISMB/ECCB proceedings
Dirks, R.M., Pierce, N.A.: A partition function algorithm for nucleic acid secondary structure including pseudoknots. Journal of Computational Chemistry 24(13), 1664–1677 (2003)
Do, C.B., Woods, D.A., Batzoglou, S.: CONTRAfold: RNA secondary structure prediction without physics-based models. Bioinformatics 22, 90–98 (2006)
Dyer, M.E.: The Complexity of Vertex Enumeration Methods. Mathematics of Operations Research 8(3), 381–402 (1983)
Faber, C., Scharpf, M., Becker, T., Sticht, H., Rosch, P.: The structure of the coliphage HK022 Nun protein-lambda-phage boxB RNA complex. Implications for the mechanism of transcription termination. J. Biol. Chem. 276(34), 32064–32070 (2001)
Finger, L.D., Trantirek, L., Johansson, C., Feigon, J.: Solution structures of stem-loop RNAs that bind to the two N-terminal RNA-binding domains of nucleolin. Nucleic Acids Res. 31(22), 6461–6472 (2003)
Forouzmand, E., Chitsaz, H.: The RNA Newton polytope and learnability of energy parameters. Bioinformatics 29(13), i300–i307 (2013); Also ISMB/ECCB proceedings
Gibson, D.G., Glass, J.I., Lartigue, C., Noskov, V.N., Chuang, R.-Y., Algire, M.A., Benders, G.A., Montague, M.G., Ma, L., Moodie, M.M., Merryman, C., Vashee, S., Krishnakumar, R., Assad-Garcia, N., Andrews-Pfannkoch, C., Denisova, E.A., Young, L., Qi, Z.-Q., Segall-Shapiro, T.H., Calvey, C.H., Parmar, P.P., Hutchison, C.A., Smith, H.O., Craig Venter, J.: Creation of a bacterial cell controlled by a chemically synthesized genome. Science 329(5987), 52–56 (2010)
Gottesman, S.: Micros for microbes: non-coding regulatory RNAs in bacteria. Trends in Genetics 21(7), 399–404 (2005)
Hannon, G.J.: RNA interference. Nature 418(6894), 244–251 (2002)
Honer zu Siederdissen, C., Bernhart, S.H., Stadler, P.F., Hofacker, I.L.: A folding algorithm for extended RNA secondary structures. Bioinformatics 27(13), i129–i136 (2011)
Huang, F.W.D., Qin, J., Reidys, C.M., Stadler, P.F.: Target prediction and a statistical sampling algorithm for RNA-RNA interaction. Bioinformatics 26(2), 175–181 (2010)
Mathews, D.H., Sabina, J., Zuker, M., Turner, D.H.: Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol. 288, 911–940 (1999)
McCaskill, J.S.: The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers 29, 1105–1119 (1990)
Nussinov, R., Pieczenik, G., Griggs, J.R., Kleitman, D.J.: Algorithms for loop matchings. SIAM Journal on Applied Mathematics 35, 68–82 (1978)
Rivas, E., Eddy, S.R.: A dynamic programming algorithm for RNA structure prediction including pseudoknots. J. Mol. Biol. 285(5), 2053–2068 (1999)
Rivas, E., Lang, R., Eddy, S.R.: A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more. RNA 18(2), 193–212 (2012)
Seeman, N.C., Lukeman, P.S.: Nucleic acid nanostructures: bottom-up control of geometry on the nanoscale. Reports on Progress in Physics 68, 237–270 (2005)
Seeman, N.C.: From genes to machines: DNA nanomechanical devices. Trends Biochem. Sci. 30, 119–125 (2005)
Simmel, F.C., Dittmer, W.U.: DNA nanodevices. Small 1, 284–299 (2005)
Staple, D.W., Butcher, S.E.: Solution structure and thermodynamic investigation of the HIV-1 frameshift inducing element. J. Mol. Biol. 349(5), 1011–1023 (2005)
Storz, G.: An expanding universe of noncoding RNAs. Science 296(5571), 1260–1263 (2002)
Tinoco, I., Borer, P.N., Dengler, B., Levin, M.D., Uhlenbeck, O.C., Crothers, D.M., Bralla, J.: Improved estimation of secondary structure in ribonucleic acids. Nature New Biol. 246(150), 40–41 (1973)
Venkataraman, S., Dirks, R.M., Rothemund, P.W., Winfree, E., Pierce, N.A.: An autonomous polymerization motor powered by DNA hybridization. Nat. Nanotechnol. 2, 490–494 (2007)
Wagner, E.G., Flardh, K.: Antisense RNAs everywhere? Trends Genet. 18, 223–226 (2002)
Waterman, M.S., Smith, T.F.: RNA secondary structure: A complete mathematical analysis. Math. Biosc. 42, 257–266 (1978)
Yin, P., Hariadi, R.F., Sahu, S., Choi, H.M., Park, S.H., Labean, T.H., Reif, J.H.: Programming DNA tube circumferences. Science 321, 824–826 (2008)
Zakov, S., Goldberg, Y., Elhadad, M., Ziv-Ukelson, M.: Rich parameterization improves RNA structure prediction. In: Bafna, V., Sahinalp, S.C. (eds.) RECOMB 2011. LNCS, vol. 6577, pp. 546–562. Springer, Heidelberg (2011)
Zamore, P.D., Haley, B.: Ribo-gnome: the big world of small RNAs. Science 309(5740), 1519–1524 (2005)
Zuker, M., Stiegler, P.: Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Research 9(1), 133–148 (1981)
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Chitsaz, H., Aminisharifabad, M. (2014). Exact Learning of RNA Energy Parameters from Structure. In: Sharan, R. (ed.) Research in Computational Molecular Biology. RECOMB 2014. Lecture Notes in Computer Science, vol 8394. Springer, Cham. https://doi.org/10.1007/978-3-319-05269-4_5
DOI: https://doi.org/10.1007/978-3-319-05269-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05268-7
Online ISBN: 978-3-319-05269-4
eBook Packages: Computer Science (R0)