Abstract
Molecular Dynamics (MD) simulations can now predict millisecond-timescale folding processes of small proteins; however, this presently requires hundreds of thousands of CPU hours and is primarily applicable to short peptides with few long-range interactions. Larger and slower-folding proteins, such as many with extended β-sheet structure, would require orders of magnitude more time and computing resources. Furthermore, when the objective is to determine only which folding events are necessary and limiting, atomistic-detail MD simulations can prove unnecessary. Here, we introduce the program tFolder as an efficient method for modelling the folding process of large β-sheet proteins using sequence data alone. To do so, we extend existing ensemble β-sheet prediction techniques, which permitted only a fixed anti-parallel β-barrel shape, with a method that predicts arbitrary β-strand/β-strand orientations and strand-order permutations. By accounting for all partial and final structural states, we can then model the transition from random coil to native state as a Markov process, using a master equation to simulate population dynamics of folding over time. Thus, all putative folding pathways can be energetically scored, and the transitions that present the greatest barriers can be identified. Since correct folding pathway prediction likely depends on accurate contact prediction, we show that the contact-prediction accuracy of tFolder is comparable with that of state-of-the-art methods designed specifically for the contact prediction problem alone. We validate our method for dynamics prediction by applying it to the folding pathway of the well-studied Protein G. With comparatively little computation time, tFolder is able to reveal critical features of the folding pathways that were previously observed only through time-consuming MD simulations and experimental studies. Such a result greatly expands the number of proteins whose folding pathways can be studied, while the algorithmic integration of ensemble prediction with Markovian dynamics can be applied to many other problems.
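To make the master-equation idea concrete, the following is a minimal sketch of population dynamics on a toy four-state landscape. It is not the tFolder implementation: the state names, free energies, and allowed transitions are invented for illustration, the Metropolis rate rule is an assumption (the abstract does not say how rates are derived), and forward Euler integration is used only for simplicity.

```python
import math

# Hypothetical toy landscape: four coarse states with made-up free energies
# (in units of kT). These are illustrative values, not tFolder output.
STATES = ["coil", "hairpin_A", "hairpin_B", "native"]
ENERGIES = [0.0, -1.0, -0.5, -4.0]
# Allowed single-step transitions between states (undirected edges).
EDGES = [(0, 1), (0, 2), (1, 3), (2, 3)]


def rate_matrix(energies, edges, kT=1.0):
    """Return k[i][j], the rate of moving from state i to state j.

    Assumes a Metropolis rule: downhill moves have rate 1, uphill moves
    are slowed by the Boltzmann factor of the energy difference.
    """
    n = len(energies)
    k = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        for a, b in ((i, j), (j, i)):
            k[a][b] = min(1.0, math.exp(-(energies[b] - energies[a]) / kT))
    return k


def evolve(p0, k, t_end, dt=0.01):
    """Integrate dp_i/dt = sum_j (k[j][i] * p_j - k[i][j] * p_i) by forward Euler."""
    p = list(p0)
    n = len(p)
    for _ in range(int(round(t_end / dt))):
        dp = [0.0] * n
        for i in range(n):
            for j in range(n):
                flux = k[i][j] * p[i] * dt  # probability moved from i to j
                dp[i] -= flux
                dp[j] += flux
        p = [pi + di for pi, di in zip(p, dp)]
    return p


def boltzmann(energies, kT=1.0):
    """Equilibrium populations that the dynamics should approach."""
    weights = [math.exp(-e / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


if __name__ == "__main__":
    k = rate_matrix(ENERGIES, EDGES)
    start = [1.0, 0.0, 0.0, 0.0]  # all population begins in the coil state
    for t in (0.5, 2.0, 10.0, 100.0):
        populations = evolve(start, k, t)
        print(t, dict(zip(STATES, (round(x, 3) for x in populations))))
```

Because the assumed Metropolis rates satisfy detailed balance, the populations relax toward the Boltzmann distribution of the toy energies, and comparing the two intermediate states over time indicates which route carries more flux. For a large state space, a matrix exponential or an eigendecomposition of the rate matrix would likely be preferable to explicit time stepping.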
© 2011 Springer-Verlag Berlin Heidelberg
Shenker, S., O’Donnell, C.W., Devadas, S., Berger, B., Waldispühl, J. (2011). Efficient Traversal of Beta-Sheet Protein Folding Pathways Using Ensemble Models. In: Bafna, V., Sahinalp, S.C. (eds) Research in Computational Molecular Biology. RECOMB 2011. Lecture Notes in Computer Science, vol 6577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20036-6_38
DOI: https://doi.org/10.1007/978-3-642-20036-6_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20035-9
Online ISBN: 978-3-642-20036-6
eBook Packages: Computer Science, Computer Science (R0)